Re: Long GC pauses while reading Solr docs using Cursor approach

2017-04-15 Thread Toke Eskildsen
Chetas Joshi wrote: > Thanks for the insights into the memory requirements. Looks like cursor > approach is going to require a lot of memory for millions of documents. Sorry, that is a premature conclusion from your observations. > If I run a query that returns only 500K

Re: Long GC pauses while reading Solr docs using Cursor approach

2017-04-14 Thread Shawn Heisey
On 4/13/2017 11:51 AM, Chetas Joshi wrote: > Thanks for the insights into the memory requirements. Looks like cursor > approach is going to require a lot of memory for millions of documents. > If I run a query that returns only 500K documents still keeping 100K docs > per page, I don't see long GC

Re: Long GC pauses while reading Solr docs using Cursor approach

2017-04-13 Thread Chetas Joshi
Hi Shawn, Thanks for the insights into the memory requirements. Looks like cursor approach is going to require a lot of memory for millions of documents. If I run a query that returns only 500K documents still keeping 100K docs per page, I don't see long GC pauses. So it is not really the number

Re: Long GC pauses while reading Solr docs using Cursor approach

2017-04-12 Thread Erick Erickson
You're missing the point of my comment. Since they already are docValues, you can use the /export functionality to get the results back as a _stream_ and avoid all of the overhead of the aggregator node doing a merge sort and all of that. You'll have to do this from SolrJ, but see
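For readers unfamiliar with /export, here is a minimal sketch of the request shape only, not the SolrJ streaming client Erick refers to. The host, collection name, and field names (`field_s`, `field_l`) are placeholders and do not come from the thread. The handler requires `sort` and `fl`, and every field named in them must have docValues enabled.

```python
from urllib.parse import urlencode


def build_export_url(base_url, collection, query, sort, fields):
    """Build a URL for Solr's /export handler (q, sort and fl are all supplied)."""
    params = {"q": query, "sort": sort, "fl": ",".join(fields)}
    return f"{base_url}/{collection}/export?{urlencode(params)}"


# Hypothetical host, collection and field names, for illustration only.
url = build_export_url(
    "http://localhost:8983/solr",
    "mycollection",
    "*:*",
    "id asc",
    ["id", "field_s", "field_l"],
)
print(url)
```

Unlike a cursor read, the response is streamed in sort order in a single request, so there is no per-page merge of large result windows on an aggregating node.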

Re: Long GC pauses while reading Solr docs using Cursor approach

2017-04-12 Thread Shawn Heisey
On 4/12/2017 5:19 PM, Chetas Joshi wrote: > I am getting back 100K results per page. > The fields have docValues enabled and I am getting sorted results based on > "id" and 2 more fields (String: 32 Bytes and Long: 8 Bytes). > > I have a SolrCloud of 80 nodes. There will be one shard that will

Re: Long GC pauses while reading Solr docs using Cursor approach

2017-04-12 Thread Chetas Joshi
I am getting back 100K results per page. The fields have docValues enabled and I am getting sorted results based on "id" and 2 more fields (String: 32 Bytes and Long: 8 Bytes). I have a SolrCloud of 80 nodes. There will be one shard that will get top 100K docs from each shard and apply merge
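The figures in this message give a rough sense of the merge workload per page. This is a back-of-the-envelope sketch, assuming every shard returns a full page of sort entries to the aggregating node. The size of the "id" field is not stated in the thread, so it is left out of the byte estimate.

```python
SHARDS = 80                   # from the message: 80-node SolrCloud, 80 shards
ROWS_PER_PAGE = 100_000       # from the message: 100K results per page
SORT_BYTES_PER_DOC = 32 + 8   # String (32 bytes) + Long (8 bytes); id size unknown

# Entries the aggregating node has to merge-sort for one page.
entries_per_page = SHARDS * ROWS_PER_PAGE
# Raw sort-value payload only; Java object overhead is not included.
raw_sort_bytes = entries_per_page * SORT_BYTES_PER_DOC

print(entries_per_page)       # 8000000
print(raw_sort_bytes / 1e6)   # 320.0 (MB)
```

The 320 MB is a lower bound on the raw values. The actual heap footprint on the aggregating node will be higher once per-object overhead and the id values are counted.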

Re: Long GC pauses while reading Solr docs using Cursor approach

2017-04-12 Thread Erick Erickson
Oh my. Returning 100K rows per request is usually poor practice. One hopes these are very tiny docs. But this may well be an "XY" problem. What kinds of information are you returning in your docs and could they all be docValues types? In which case you would be waaay far ahead by using the
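For reference, a minimal sketch of a cursor read with a smaller page size. `fetch_page` is a hypothetical callable standing in for the HTTP request to /select. The query, the `id asc` sort, and the page size of 1000 are assumptions for illustration. The cursor protocol itself is standard: start with `cursorMark=*`, include the uniqueKey field in the sort, and stop when `nextCursorMark` comes back unchanged.

```python
def read_all(fetch_page, rows=1000):
    """Yield every doc by following Solr cursorMark paging.

    fetch_page(params) is a hypothetical callable that sends params to
    /select and returns the parsed JSON response as a dict.
    """
    cursor = "*"  # "*" starts a new cursor
    while True:
        params = {"q": "*:*", "sort": "id asc", "rows": rows, "cursorMark": cursor}
        response = fetch_page(params)
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged cursor means no more results
            return
        cursor = next_cursor


def fake_fetch(params):
    """Stand-in for a real Solr call: serves 2500 ids, using the offset as the cursor."""
    start = 0 if params["cursorMark"] == "*" else int(params["cursorMark"])
    end = min(start + params["rows"], 2500)
    docs = [{"id": i} for i in range(start, end)]
    next_mark = str(end) if docs else params["cursorMark"]
    return {"response": {"docs": docs}, "nextCursorMark": next_mark}


docs = list(read_all(fake_fetch, rows=1000))
print(len(docs))  # 2500
```

Because the generator holds only one page at a time, a smaller `rows` value costs more round trips but keeps the per-request result window small on both the client and the Solr nodes.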

Re: Long GC pauses while reading Solr docs using Cursor approach

2017-04-12 Thread Chetas Joshi
I am running a query that returns 10 MM docs in total and the number of rows per page is 100K. On Wed, Apr 12, 2017 at 12:53 PM, Mikhail Khludnev wrote: > And what is the rows parameter? > > On Apr 12, 2017 at 21:32, "Chetas Joshi" > wrote:

Re: Long GC pauses while reading Solr docs using Cursor approach

2017-04-12 Thread Mikhail Khludnev
And what is the rows parameter? On Apr 12, 2017 at 21:32, "Chetas Joshi" wrote: > Thanks for your response Shawn and Wunder. > > Hi Shawn, > > Here is the system config: > > Total system memory = 512 GB > each server handles two 500 MB cores > Number of solr

Re: Long GC pauses while reading Solr docs using Cursor approach

2017-04-12 Thread Chetas Joshi
Thanks for your response Shawn and Wunder. Hi Shawn, Here is the system config: Total system memory = 512 GB each server handles two 500 MB cores Number of solr docs per 500 MB core = 200 MM The average heap usage is around 4-6 GB. When the read starts using the Cursor approach, the heap usage

Re: Long GC pauses while reading Solr docs using Cursor approach

2017-04-11 Thread Walter Underwood
JVM version? We’re running v8 update 121 with the G1 collector and it is working really well. We also have an 8GB heap. Graph your heap usage. You’ll see a sawtooth shape, where it grows, then there is a major GC. The maximum of the base of the sawtooth is the working set of heap that your
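The sawtooth reading described here can be sketched in a few lines. The heap samples below are made up for illustration; in practice they would come from GC logs or a monitoring tool. The function takes the troughs of the sawtooth (the readings just after a collection) and returns the highest one as a rough working-set estimate.

```python
def working_set_estimate(heap_samples):
    """Return the highest post-GC trough in a series of heap readings, or None."""
    troughs = [
        cur
        for prev, cur, nxt in zip(heap_samples, heap_samples[1:], heap_samples[2:])
        if cur < prev and cur <= nxt  # a drop followed by growth marks a collection
    ]
    return max(troughs) if troughs else None


# Hypothetical heap usage readings in GB, sampled at regular intervals.
samples_gb = [2.0, 4.5, 7.0, 2.5, 5.0, 7.5, 3.0, 5.5, 7.8, 2.8, 5.0]
print(working_set_estimate(samples_gb))  # 3.0
```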

Re: Long GC pauses while reading Solr docs using Cursor approach

2017-04-11 Thread Shawn Heisey
On 4/11/2017 2:56 PM, Chetas Joshi wrote: > I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. Solr collection > with number of shards = 80 and replication Factor=2 > > Solr JVM heap size = 20 GB > solr.hdfs.blockcache.enabled = true > solr.hdfs.blockcache.direct.memory.allocation = true >

Long GC pauses while reading Solr docs using Cursor approach

2017-04-11 Thread Chetas Joshi
Hello, I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. Solr collection with number of shards = 80 and replication Factor=2 Solr JVM heap size = 20 GB solr.hdfs.blockcache.enabled = true solr.hdfs.blockcache.direct.memory.allocation = true MaxDirectMemorySize = 25 GB I am querying a solr