Re: Long GC pauses while reading Solr docs using Cursor approach
Chetas Joshi wrote: > Thanks for the insights into the memory requirements. Looks like cursor > approach is going to require a lot of memory for millions of documents. Sorry, that is a premature conclusion from your observations. > If I run a query that returns only 500K documents still keeping 100K docs > per page, I don't see long GC pauses. 500K docs is far less than your worst-case 80*100K. You are not keeping the effective page size constant across your tests. You need to do that in order to conclude that it is the result set size that is the problem. > So it is not really the number of rows per page but the overall number of > docs. It is the effective maximum number of document results handled at any point (the merger really) during the transaction. If your page size is 100K and you match 8M documents, then the maximum is 8M (as you indirectly calculated earlier). If you match 800M documents, the maximum is _still_ 8M. (note: Okay, it is not just the maximum number of results, as the internal structures for determining the result sets at the individual nodes are allocated from the page size. However, that does not affect the merging process.) The high number 8M might be the reason for your high GC activity. Effectively 2 or 3 times that many tiny objects need to be allocated, be alive at the same time, and then be de-allocated. A very short time after de-allocation, a new bunch needs to be allocated, so a guess is that the garbage collector has a hard time keeping up with this pattern. One strategy for coping is to allocate more memory and hope for the barrage to end, which would explain your jump in heap. But I'm in guess-land here. Hopefully it is simple for you to turn the page size way down - to 10K or even 1K. Why don't you try that, then see how it affects speed and memory requirements? - Toke
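[Editor's note] Toke's suggestion, sketched in SolrJ against a Solr 5.5-era API. The ZooKeeper address, collection name, and page size are placeholders, not values from the thread; the point is that the aggregator then merges at most rows × numShards entries per page.

```java
// Hypothetical sketch: the same cursorMark loop, but with rows=1000 instead
// of 100K. "zk1:2181" and "mycollection" are assumptions for illustration.
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.CursorMarkParams;

public class CursorPager {
    public static void main(String[] args) throws Exception {
        CloudSolrClient client = new CloudSolrClient("zk1:2181"); // 5.x constructor
        client.setDefaultCollection("mycollection");
        SolrQuery q = new SolrQuery("*:*");
        q.setRows(1000);                               // way down from 100K
        q.setSort(SolrQuery.SortClause.asc("id"));     // cursor requires a uniqueKey sort
        String cursor = CursorMarkParams.CURSOR_MARK_START;
        while (true) {
            q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursor);
            QueryResponse rsp = client.query(q);
            String next = rsp.getNextCursorMark();
            // consume rsp.getResults() here
            if (cursor.equals(next)) break;            // unchanged mark means done
            cursor = next;
        }
        client.close();
    }
}
```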
Re: Long GC pauses while reading Solr docs using Cursor approach
On 4/13/2017 11:51 AM, Chetas Joshi wrote: > Thanks for the insights into the memory requirements. Looks like cursor > approach is going to require a lot of memory for millions of documents. > If I run a query that returns only 500K documents still keeping 100K docs > per page, I don't see long GC pauses. So it is not really the number of > rows per page but the overall number of docs. Maybe I can reduce the > document cache and the field cache. What do you think? Lucene handles the field cache automatically and as far as I am aware, it is not configurable in any way. Having docValues on fields that you are using will reduce the amount of memory required for the field cache. The filterCache is typically going to be much larger than any of the other configurable caches. Each filterCache entry is a bitset with one bit per document, so an entry will be 25 million bytes on a 200 million document index. The filterCache should not be configured with a large size -- typical example defaults have a size of 512 ... 512 entries that are each 25 million bytes will use 12 gigabytes. The other caches typically have much smaller entries and therefore can usually be configured with fairly large sizes. Thanks, Shawn
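[Editor's note] Shawn's sizing math maps directly onto the cache definition in solrconfig.xml. A hedged example with a deliberately small size; the numbers are illustrative, not tuned recommendations:

```xml
<!-- solrconfig.xml sketch: on a 200M-doc core each cached filter bitset is
     ~25 MB, so 64 entries is ~1.6 GB worst case; a default of 512 would be
     ~12 GB. Values here are examples only. -->
<filterCache class="solr.FastLRUCache"
             size="64"
             initialSize="64"
             autowarmCount="16"/>
```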
Re: Long GC pauses while reading Solr docs using Cursor approach
Hi Shawn, Thanks for the insights into the memory requirements. Looks like cursor approach is going to require a lot of memory for millions of documents. If I run a query that returns only 500K documents still keeping 100K docs per page, I don't see long GC pauses. So it is not really the number of rows per page but the overall number of docs. Maybe I can reduce the document cache and the field cache. What do you think? Erick, I was using the streaming approach to get back results from Solr but I was running into some runtime exceptions. That bug has been fixed in Solr 6.0. But for various reasons, I won't be able to move to Java 8 and hence I will have to stick to Solr 5.5.0. That is the reason I had to switch to the cursor approach. Thanks! On Wed, Apr 12, 2017 at 8:37 PM, Erick Erickson wrote: > You're missing the point of my comment. Since they already are > docValues, you can use the /export functionality to get the results > back as a _stream_ and avoid all of the overhead of the aggregator > node doing a merge sort and all of that. > > You'll have to do this from SolrJ, but see CloudSolrStream. You can > see examples of its usage in StreamingTest.java. > > this should > 1> complete much, much faster. The design goal is 400K rows/second but YMMV > 2> use vastly less memory on your Solr instances. > 3> only require _one_ query > > Best, > Erick > > On Wed, Apr 12, 2017 at 7:36 PM, Shawn Heisey wrote: > > On 4/12/2017 5:19 PM, Chetas Joshi wrote: > >> I am getting back 100K results per page. > >> The fields have docValues enabled and I am getting sorted results based > on "id" and 2 more fields (String: 32 Bytes and Long: 8 Bytes). > >> > >> I have a solr Cloud of 80 nodes. There will be one shard that will get > top 100K docs from each shard and apply merge sort. So, the max memory > usage of any shard could be 40 bytes * 100K * 80 = 320 MB. Why would heap > memory usage shoot up from 8 GB to 17 GB? 
> > > > From what I understand, Java overhead for a String object is 56 bytes > > above the actual byte size of the string itself. And each character in > > the string will be two bytes -- Java uses UTF-16 for character > > representation internally. If I'm right about these numbers, it means > > that each of those id values will take 120 bytes -- and that doesn't > > include the size the actual response (xml, json, etc). > > > > I don't know what the overhead for a long is, but you can be sure that > > it's going to take more than eight bytes total memory usage for each one. > > > > Then there is overhead for all the Lucene memory structures required to > > execute the query and gather results, plus Solr memory structures to > > keep track of everything. I have absolutely no idea how much memory > > Lucene and Solr use to accomplish a query, but it's not going to be > > small when you have 200 million documents per shard. > > > > Speaking of Solr memory requirements, under normal query circumstances > > the aggregating node is going to receive at least 100K results from > > *every* shard in the collection, which it will condense down to the > > final result with 100K entries. The behavior during a cursor-based > > request may be more memory-efficient than what I have described, but I > > am unsure whether that is the case. > > > > If the cursor behavior is not more efficient, then each entry in those > > results will contain the uniqueKey value and the score. That's going to > > be many megabytes for every shard. If there are 80 shards, it would > > probably be over a gigabyte for one request. > > > > Thanks, > > Shawn > > >
Re: Long GC pauses while reading Solr docs using Cursor approach
You're missing the point of my comment. Since they already are docValues, you can use the /export functionality to get the results back as a _stream_ and avoid all of the overhead of the aggregator node doing a merge sort and all of that. You'll have to do this from SolrJ, but see CloudSolrStream. You can see examples of its usage in StreamingTest.java. This should 1> complete much, much faster. The design goal is 400K rows/second but YMMV 2> use vastly less memory on your Solr instances. 3> only require _one_ query Best, Erick On Wed, Apr 12, 2017 at 7:36 PM, Shawn Heisey wrote: > On 4/12/2017 5:19 PM, Chetas Joshi wrote: >> I am getting back 100K results per page. >> The fields have docValues enabled and I am getting sorted results based on >> "id" and 2 more fields (String: 32 Bytes and Long: 8 Bytes). >> >> I have a SolrCloud of 80 nodes. There will be one shard that will get top >> 100K docs from each shard and apply merge sort. So, the max memory usage of >> any shard could be 40 bytes * 100K * 80 = 320 MB. Why would heap memory >> usage shoot up from 8 GB to 17 GB? > > From what I understand, Java overhead for a String object is 56 bytes > above the actual byte size of the string itself. And each character in > the string will be two bytes -- Java uses UTF-16 for character > representation internally. If I'm right about these numbers, it means > that each of those id values will take 120 bytes -- and that doesn't > include the size of the actual response (xml, json, etc). > > I don't know what the overhead for a long is, but you can be sure that > it's going to take more than eight bytes total memory usage for each one. > > Then there is overhead for all the Lucene memory structures required to > execute the query and gather results, plus Solr memory structures to > keep track of everything. I have absolutely no idea how much memory > Lucene and Solr use to accomplish a query, but it's not going to be > small when you have 200 million documents per shard. 
> > Speaking of Solr memory requirements, under normal query circumstances > the aggregating node is going to receive at least 100K results from > *every* shard in the collection, which it will condense down to the > final result with 100K entries. The behavior during a cursor-based > request may be more memory-efficient than what I have described, but I > am unsure whether that is the case. > > If the cursor behavior is not more efficient, then each entry in those > results will contain the uniqueKey value and the score. That's going to > be many megabytes for every shard. If there are 80 shards, it would > probably be over a gigabyte for one request. > > Thanks, > Shawn >
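[Editor's note] A sketch of what Erick describes: pulling sorted tuples from the /export handler via CloudSolrStream instead of cursor paging, so nothing is buffered per page. The ZooKeeper address, collection name, and field names are assumptions; all fields in fl and sort must have docValues for /export to work.

```java
// Hypothetical CloudSolrStream usage (SolrJ 5.x-era API); names are examples.
import java.util.HashMap;
import java.util.Map;
import org.apache.solr.client.solrj.io.Tuple;
import org.apache.solr.client.solrj.io.stream.CloudSolrStream;

public class ExportExample {
    public static void main(String[] args) throws Exception {
        Map<String, String> props = new HashMap<>();
        props.put("q", "*:*");
        props.put("qt", "/export");           // streaming export handler
        props.put("fl", "id,fieldA,fieldB");  // every field needs docValues
        props.put("sort", "id asc,fieldA asc,fieldB asc");

        CloudSolrStream stream = new CloudSolrStream("zk1:2181", "mycollection", props);
        try {
            stream.open();
            for (Tuple t = stream.read(); !t.EOF; t = stream.read()) {
                // one tuple at a time; no 100K-row page is ever materialized
                String id = t.getString("id");
            }
        } finally {
            stream.close();
        }
    }
}
```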
Re: Long GC pauses while reading Solr docs using Cursor approach
On 4/12/2017 5:19 PM, Chetas Joshi wrote: > I am getting back 100K results per page. > The fields have docValues enabled and I am getting sorted results based on > "id" and 2 more fields (String: 32 Bytes and Long: 8 Bytes). > > I have a SolrCloud of 80 nodes. There will be one shard that will get top > 100K docs from each shard and apply merge sort. So, the max memory usage of > any shard could be 40 bytes * 100K * 80 = 320 MB. Why would heap memory usage > shoot up from 8 GB to 17 GB? From what I understand, Java overhead for a String object is 56 bytes above the actual byte size of the string itself. And each character in the string will be two bytes -- Java uses UTF-16 for character representation internally. If I'm right about these numbers, it means that each of those id values will take 120 bytes -- and that doesn't include the size of the actual response (xml, json, etc). I don't know what the overhead for a long is, but you can be sure that it's going to take more than eight bytes total memory usage for each one. Then there is overhead for all the Lucene memory structures required to execute the query and gather results, plus Solr memory structures to keep track of everything. I have absolutely no idea how much memory Lucene and Solr use to accomplish a query, but it's not going to be small when you have 200 million documents per shard. Speaking of Solr memory requirements, under normal query circumstances the aggregating node is going to receive at least 100K results from *every* shard in the collection, which it will condense down to the final result with 100K entries. The behavior during a cursor-based request may be more memory-efficient than what I have described, but I am unsure whether that is the case. If the cursor behavior is not more efficient, then each entry in those results will contain the uniqueKey value and the score. That's going to be many megabytes for every shard. 
If there are 80 shards, it would probably be over a gigabyte for one request. Thanks, Shawn
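[Editor's note] Shawn's per-String estimate and the worst-case merge size can be checked with quick arithmetic. The 56-byte String overhead is his assumption, carried through here:

```java
// Back-of-envelope check of the numbers in this thread: a 32-char id String
// at 56 bytes object overhead + 2 bytes per UTF-16 char, and the worst-case
// number of entries the aggregator merges (one full page from every shard).
public class MemEstimate {
    // assumed heap bytes for a String of n characters (56-byte overhead)
    static long stringBytes(int chars) {
        return 56L + 2L * chars;
    }

    // worst-case entries the aggregating node handles for one cursor page
    static long maxMergedDocs(long pageSize, int shards) {
        return pageSize * shards;
    }

    public static void main(String[] args) {
        System.out.println(stringBytes(32));             // 120 bytes per id
        System.out.println(maxMergedDocs(100_000, 80));  // 8,000,000 entries
        // ids alone: 8M * 120 B = 960 MB, before sort values, scores,
        // or response buffers are counted
        System.out.println(maxMergedDocs(100_000, 80) * stringBytes(32));
    }
}
```

So even under these rough assumptions, a single 100K-row cursor page across 80 shards is closer to a gigabyte of transient objects than to the 320 MB estimated above.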
Re: Long GC pauses while reading Solr docs using Cursor approach
I am getting back 100K results per page. The fields have docValues enabled and I am getting sorted results based on "id" and 2 more fields (String: 32 Bytes and Long: 8 Bytes). I have a SolrCloud of 80 nodes. There will be one shard that will get top 100K docs from each shard and apply merge sort. So, the max memory usage of any shard could be 40 bytes * 100K * 80 = 320 MB. Why would heap memory usage shoot up from 8 GB to 17 GB? Thanks! On Wed, Apr 12, 2017 at 1:32 PM, Erick Erickson wrote: > Oh my. Returning 100K rows per request is usually poor practice. > One hopes these are very tiny docs. > > But this may well be an "XY" problem. What kinds of information > are you returning in your docs and could they all be docValues > types? In which case you would be waaay far ahead by using > the various Streaming options. > > Best, > Erick > > On Wed, Apr 12, 2017 at 12:59 PM, Chetas Joshi > wrote: > > I am running a query that returns 10 MM docs in total and the number of > > rows per page is 100K. > > > > On Wed, Apr 12, 2017 at 12:53 PM, Mikhail Khludnev > wrote: > > > >> And what is the rows parameter? > >> > >> On 12 Apr 2017 at 21:32, "Chetas Joshi" < > chetas.jo...@gmail.com> > >> wrote: > >> > >> > Thanks for your response Shawn and Wunder. > >> > > >> > Hi Shawn, > >> > > >> > Here is the system config: > >> > > >> > Total system memory = 512 GB > >> > each server handles two 500 MB cores > >> > Number of solr docs per 500 MB core = 200 MM > >> > > >> > The average heap usage is around 4-6 GB. When the read starts using > the > >> > Cursor approach, the heap usage starts increasing with the base of the > >> > sawtooth at 8 GB and then shooting up to 17 GB. Even after the full > GC, > >> the > >> > heap usage remains around 15 GB and then it comes down to 8 GB. > >> > > >> > With 100K docs, the requirement will be in MBs so it is strange it is > >> > jumping from 8 GB to 17 GB while preparing the sorted response. > >> > > >> > Thanks! 
> >> > > >> > > >> > > >> > On Tue, Apr 11, 2017 at 8:48 PM, Walter Underwood < > wun...@wunderwood.org > >> > > >> > wrote: > >> > > >> > > JVM version? We’re running v8 update 121 with the G1 collector and > it > >> is > >> > > working really well. We also have an 8GB heap. > >> > > > >> > > Graph your heap usage. You’ll see a sawtooth shape, where it grows, > >> then > >> > > there is a major GC. The maximum of the base of the sawtooth is the > >> > working > >> > > set of heap that your Solr installation needs. Set the heap to that > >> > value, > >> > > plus a gigabyte or so. We run with a 2GB eden (new space) because so > >> much > >> > > of Solr’s allocations have a lifetime of one request. So, the base > of > >> the > >> > > sawtooth, plus a gigabyte breathing room, plus two more for eden. > That > >> > > should work. > >> > > > >> > > I don’t set all the ratios and stuff. When were running CMS, I set a > >> size > >> > > for the heap and a size for the new space. Done. With G1, I don’t > even > >> > get > >> > > that fussy. > >> > > > >> > > wunder > >> > > Walter Underwood > >> > > wun...@wunderwood.org > >> > > http://observer.wunderwood.org/ (my blog) > >> > > > >> > > > >> > > > On Apr 11, 2017, at 8:22 PM, Shawn Heisey > >> wrote: > >> > > > > >> > > > On 4/11/2017 2:56 PM, Chetas Joshi wrote: > >> > > >> I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. Sold > >> > collection > >> > > >> with number of shards = 80 and replication Factor=2 > >> > > >> > >> > > >> Sold JVM heap size = 20 GB > >> > > >> solr.hdfs.blockcache.enabled = true > >> > > >> solr.hdfs.blockcache.direct.memory.allocation = true > >> > > >> MaxDirectMemorySize = 25 GB > >> > > >> > >> > > >> I am querying a solr collection with index size = 500 MB per > core. > >> > > > > >> > > > I see that you and I have traded messages before on the list. > >> > > > > >> > > > How much total system memory is there per server? How many of > these > >> > > > 500MB cores are on each server? 
How many docs are in a 500MB > core? > >> > The > >> > > > answers to these questions may affect the other advice that I give > >> you. > >> > > > > >> > > >> The off-heap (25 GB) is huge so that it can load the entire > index. > >> > > > > >> > > > I still know very little about how HDFS handles caching and > memory. > >> > You > >> > > > want to be sure that as much data as possible from your indexes is > >> > > > sitting in local memory on the server. > >> > > > > >> > > >> Using cursor approach (number of rows = 100K), I read 2 fields > >> (Total > >> > 40 > >> > > >> bytes per solr doc) from the Solr docs that satisfy the query. > The > >> > docs > >> > > are sorted by "id" and then by those 2 fields. > >> > > >> > >> > > >> I am not able to understand why the heap memory is getting full > and > >> > Full > >> > > >> GCs are consecutively running with long GC pauses (> 30 > seconds). I > >> am >
Re: Long GC pauses while reading Solr docs using Cursor approach
Oh my. Returning 100K rows per request is usually poor practice. One hopes these are very tiny docs. But this may well be an "XY" problem. What kinds of information are you returning in your docs and could they all be docValues types? In which case you would be waaay far ahead by using the various Streaming options. Best, Erick On Wed, Apr 12, 2017 at 12:59 PM, Chetas Joshi wrote: > I am running a query that returns 10 MM docs in total and the number of > rows per page is 100K. > > On Wed, Apr 12, 2017 at 12:53 PM, Mikhail Khludnev wrote: > >> And what is the rows parameter? >> >> On 12 Apr 2017 at 21:32, "Chetas Joshi" >> wrote: >> >> > Thanks for your response Shawn and Wunder. >> > >> > Hi Shawn, >> > >> > Here is the system config: >> > >> > Total system memory = 512 GB >> > each server handles two 500 MB cores >> > Number of solr docs per 500 MB core = 200 MM >> > >> > The average heap usage is around 4-6 GB. When the read starts using the >> > Cursor approach, the heap usage starts increasing with the base of the >> > sawtooth at 8 GB and then shooting up to 17 GB. Even after the full GC, >> the >> > heap usage remains around 15 GB and then it comes down to 8 GB. >> > >> > With 100K docs, the requirement will be in MBs so it is strange it is >> > jumping from 8 GB to 17 GB while preparing the sorted response. >> > >> > Thanks! >> > >> > >> > >> > On Tue, Apr 11, 2017 at 8:48 PM, Walter Underwood > > >> > wrote: >> > >> > > JVM version? We’re running v8 update 121 with the G1 collector and it >> is >> > > working really well. We also have an 8GB heap. >> > > >> > > Graph your heap usage. You’ll see a sawtooth shape, where it grows, >> then >> > > there is a major GC. The maximum of the base of the sawtooth is the >> > working >> > > set of heap that your Solr installation needs. Set the heap to that >> > value, >> > > plus a gigabyte or so. 
We run with a 2GB eden (new space) because so >> much >> > > of Solr’s allocations have a lifetime of one request. So, the base of >> the >> > > sawtooth, plus a gigabyte breathing room, plus two more for eden. That >> > > should work. >> > > >> > > I don’t set all the ratios and stuff. When were running CMS, I set a >> size >> > > for the heap and a size for the new space. Done. With G1, I don’t even >> > get >> > > that fussy. >> > > >> > > wunder >> > > Walter Underwood >> > > wun...@wunderwood.org >> > > http://observer.wunderwood.org/ (my blog) >> > > >> > > >> > > > On Apr 11, 2017, at 8:22 PM, Shawn Heisey >> wrote: >> > > > >> > > > On 4/11/2017 2:56 PM, Chetas Joshi wrote: >> > > >> I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. Sold >> > collection >> > > >> with number of shards = 80 and replication Factor=2 >> > > >> >> > > >> Sold JVM heap size = 20 GB >> > > >> solr.hdfs.blockcache.enabled = true >> > > >> solr.hdfs.blockcache.direct.memory.allocation = true >> > > >> MaxDirectMemorySize = 25 GB >> > > >> >> > > >> I am querying a solr collection with index size = 500 MB per core. >> > > > >> > > > I see that you and I have traded messages before on the list. >> > > > >> > > > How much total system memory is there per server? How many of these >> > > > 500MB cores are on each server? How many docs are in a 500MB core? >> > The >> > > > answers to these questions may affect the other advice that I give >> you. >> > > > >> > > >> The off-heap (25 GB) is huge so that it can load the entire index. >> > > > >> > > > I still know very little about how HDFS handles caching and memory. >> > You >> > > > want to be sure that as much data as possible from your indexes is >> > > > sitting in local memory on the server. >> > > > >> > > >> Using cursor approach (number of rows = 100K), I read 2 fields >> (Total >> > 40 >> > > >> bytes per solr doc) from the Solr docs that satisfy the query. 
The >> > docs >> > > are sorted by "id" and then by those 2 fields. >> > > >> >> > > >> I am not able to understand why the heap memory is getting full and >> > Full >> > > >> GCs are consecutively running with long GC pauses (> 30 seconds). I >> am >> > > >> using CMS GC. >> > > > >> > > > A 20GB heap is quite large. Do you actually need it to be that >> large? >> > > > If you graph JVM heap usage over a long period of time, what are the >> > low >> > > > points in the graph? >> > > > >> > > > A result containing 100K docs is going to be pretty large, even with >> a >> > > > limited number of fields. It is likely to be several megabytes. It >> > > > will need to be entirely built in the heap memory before it is sent >> to >> > > > the client -- both as Lucene data structures (which will probably be >> > > > much larger than the actual response due to Java overhead) and as the >> > > > actual response format. Then it will be garbage as soon as the >> > response >> > > > is done. Repeat this enough times, and you're going to go
Re: Long GC pauses while reading Solr docs using Cursor approach
I am running a query that returns 10 MM docs in total and the number of rows per page is 100K. On Wed, Apr 12, 2017 at 12:53 PM, Mikhail Khludnev wrote: > And what is the rows parameter? > > On 12 Apr 2017 at 21:32, "Chetas Joshi" > wrote: > > > Thanks for your response Shawn and Wunder. > > > > Hi Shawn, > > > > Here is the system config: > > > > Total system memory = 512 GB > > each server handles two 500 MB cores > > Number of solr docs per 500 MB core = 200 MM > > > > The average heap usage is around 4-6 GB. When the read starts using the > > Cursor approach, the heap usage starts increasing with the base of the > > sawtooth at 8 GB and then shooting up to 17 GB. Even after the full GC, > the > > heap usage remains around 15 GB and then it comes down to 8 GB. > > > > With 100K docs, the requirement will be in MBs so it is strange it is > > jumping from 8 GB to 17 GB while preparing the sorted response. > > > > Thanks! > > > > > > > > On Tue, Apr 11, 2017 at 8:48 PM, Walter Underwood > > > wrote: > > > > > JVM version? We’re running v8 update 121 with the G1 collector and it > is > > > working really well. We also have an 8GB heap. > > > > > > Graph your heap usage. You’ll see a sawtooth shape, where it grows, > then > > > there is a major GC. The maximum of the base of the sawtooth is the > > working > > > set of heap that your Solr installation needs. Set the heap to that > > value, > > > plus a gigabyte or so. We run with a 2GB eden (new space) because so > much > > > of Solr’s allocations have a lifetime of one request. So, the base of > the > > > sawtooth, plus a gigabyte breathing room, plus two more for eden. That > > > should work. > > > > > > I don’t set all the ratios and stuff. When we were running CMS, I set a > size > > > for the heap and a size for the new space. Done. With G1, I don’t even > > get > > > that fussy. 
> > > > > > wunder > > > Walter Underwood > > > wun...@wunderwood.org > > > http://observer.wunderwood.org/ (my blog) > > > > > > > > > > On Apr 11, 2017, at 8:22 PM, Shawn Heisey > wrote: > > > > > > > > On 4/11/2017 2:56 PM, Chetas Joshi wrote: > > > >> I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. Sold > > collection > > > >> with number of shards = 80 and replication Factor=2 > > > >> > > > >> Sold JVM heap size = 20 GB > > > >> solr.hdfs.blockcache.enabled = true > > > >> solr.hdfs.blockcache.direct.memory.allocation = true > > > >> MaxDirectMemorySize = 25 GB > > > >> > > > >> I am querying a solr collection with index size = 500 MB per core. > > > > > > > > I see that you and I have traded messages before on the list. > > > > > > > > How much total system memory is there per server? How many of these > > > > 500MB cores are on each server? How many docs are in a 500MB core? > > The > > > > answers to these questions may affect the other advice that I give > you. > > > > > > > >> The off-heap (25 GB) is huge so that it can load the entire index. > > > > > > > > I still know very little about how HDFS handles caching and memory. > > You > > > > want to be sure that as much data as possible from your indexes is > > > > sitting in local memory on the server. > > > > > > > >> Using cursor approach (number of rows = 100K), I read 2 fields > (Total > > 40 > > > >> bytes per solr doc) from the Solr docs that satisfy the query. The > > docs > > > are sorted by "id" and then by those 2 fields. > > > >> > > > >> I am not able to understand why the heap memory is getting full and > > Full > > > >> GCs are consecutively running with long GC pauses (> 30 seconds). I > am > > > >> using CMS GC. > > > > > > > > A 20GB heap is quite large. Do you actually need it to be that > large? > > > > If you graph JVM heap usage over a long period of time, what are the > > low > > > > points in the graph? 
> > > > > > > > A result containing 100K docs is going to be pretty large, even with > a > > > > limited number of fields. It is likely to be several megabytes. It > > > > will need to be entirely built in the heap memory before it is sent > to > > > > the client -- both as Lucene data structures (which will probably be > > > > much larger than the actual response due to Java overhead) and as the > > > > actual response format. Then it will be garbage as soon as the > > response > > > > is done. Repeat this enough times, and you're going to go through > even > > > > a 20GB heap pretty fast, and need a full GC. Full GCs on a 20GB heap > > > > are slow. > > > > > > > > You could try switching to G1, as long as you realize that you're > going > > > > against advice from Lucene experts but honestly, I do not expect > > > > this to really help, because you would probably still need full GCs > due > > > > to the rate that garbage is being created. If you do try it, I would > > > > strongly recommend the latest Java 8, either Oracle or OpenJDK. > Here's > > > > my wiki page where
Re: Long GC pauses while reading Solr docs using Cursor approach
And what is the rows parameter? On 12 Apr 2017 at 21:32, "Chetas Joshi" wrote: > Thanks for your response Shawn and Wunder. > > Hi Shawn, > > Here is the system config: > > Total system memory = 512 GB > > each server handles two 500 MB cores > Number of solr docs per 500 MB core = 200 MM > > The average heap usage is around 4-6 GB. When the read starts using the > Cursor approach, the heap usage starts increasing with the base of the > sawtooth at 8 GB and then shooting up to 17 GB. Even after the full GC, the > heap usage remains around 15 GB and then it comes down to 8 GB. > > With 100K docs, the requirement will be in MBs so it is strange it is > jumping from 8 GB to 17 GB while preparing the sorted response. > > Thanks! > > > > On Tue, Apr 11, 2017 at 8:48 PM, Walter Underwood > wrote: > > > JVM version? We’re running v8 update 121 with the G1 collector and it is > > working really well. We also have an 8GB heap. > > > > Graph your heap usage. You’ll see a sawtooth shape, where it grows, then > > there is a major GC. The maximum of the base of the sawtooth is the > working > > set of heap that your Solr installation needs. Set the heap to that > value, > > plus a gigabyte or so. We run with a 2GB eden (new space) because so much > > of Solr’s allocations have a lifetime of one request. So, the base of the > > sawtooth, plus a gigabyte breathing room, plus two more for eden. That > > should work. > > > > I don’t set all the ratios and stuff. When we were running CMS, I set a size > > for the heap and a size for the new space. Done. With G1, I don’t even > get > > that fussy. > > > > wunder > > Walter Underwood > > wun...@wunderwood.org > > http://observer.wunderwood.org/ (my blog) > > > > > > > On Apr 11, 2017, at 8:22 PM, Shawn Heisey wrote: > > > > > > On 4/11/2017 2:56 PM, Chetas Joshi wrote: > > >> I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. 
Sold > collection > > >> with number of shards = 80 and replication Factor=2 > > >> > > >> Sold JVM heap size = 20 GB > > >> solr.hdfs.blockcache.enabled = true > > >> solr.hdfs.blockcache.direct.memory.allocation = true > > >> MaxDirectMemorySize = 25 GB > > >> > > >> I am querying a solr collection with index size = 500 MB per core. > > > > > > I see that you and I have traded messages before on the list. > > > > > > How much total system memory is there per server? How many of these > > > 500MB cores are on each server? How many docs are in a 500MB core? > The > > > answers to these questions may affect the other advice that I give you. > > > > > >> The off-heap (25 GB) is huge so that it can load the entire index. > > > > > > I still know very little about how HDFS handles caching and memory. > You > > > want to be sure that as much data as possible from your indexes is > > > sitting in local memory on the server. > > > > > >> Using cursor approach (number of rows = 100K), I read 2 fields (Total > 40 > > >> bytes per solr doc) from the Solr docs that satisfy the query. The > docs > > are sorted by "id" and then by those 2 fields. > > >> > > >> I am not able to understand why the heap memory is getting full and > Full > > >> GCs are consecutively running with long GC pauses (> 30 seconds). I am > > >> using CMS GC. > > > > > > A 20GB heap is quite large. Do you actually need it to be that large? > > > If you graph JVM heap usage over a long period of time, what are the > low > > > points in the graph? > > > > > > A result containing 100K docs is going to be pretty large, even with a > > > limited number of fields. It is likely to be several megabytes. It > > > will need to be entirely built in the heap memory before it is sent to > > > the client -- both as Lucene data structures (which will probably be > > > much larger than the actual response due to Java overhead) and as the > > > actual response format. 
Then it will be garbage as soon as the > response > > > is done. Repeat this enough times, and you're going to go through even > > > a 20GB heap pretty fast, and need a full GC. Full GCs on a 20GB heap > > > are slow. > > > > > > You could try switching to G1, as long as you realize that you're going > > > against advice from Lucene experts but honestly, I do not expect > > > this to really help, because you would probably still need full GCs due > > > to the rate that garbage is being created. If you do try it, I would > > > strongly recommend the latest Java 8, either Oracle or OpenJDK. Here's > > > my wiki page where I discuss this: > > > > > > https://wiki.apache.org/solr/ShawnHeisey#G1_.28Garbage_ > > First.29_Collector > > > > > > Reducing the heap size (which may not be possible -- need to know the > > > answer to the question about memory graphing) and reducing the number > of > > > rows per query are the only quick solutions I can think of. > > > > > > Thanks, > > > Shawn > > > > > > > >
Re: Long GC pauses while reading Solr docs using Cursor approach
Thanks for your response Shawn and Wunder. Hi Shawn, Here is the system config: Total system memory = 512 GB each server handles two 500 MB cores Number of solr docs per 500 MB core = 200 MM The average heap usage is around 4-6 GB. When the read starts using the Cursor approach, the heap usage starts increasing with the base of the sawtooth at 8 GB and then shooting up to 17 GB. Even after the full GC, the heap usage remains around 15 GB and then it comes down to 8 GB. With 100K docs, the requirement will be in MBs so it is strange it is jumping from 8 GB to 17 GB while preparing the sorted response. Thanks! On Tue, Apr 11, 2017 at 8:48 PM, Walter Underwood wrote: > JVM version? We’re running v8 update 121 with the G1 collector and it is > working really well. We also have an 8GB heap. > > Graph your heap usage. You’ll see a sawtooth shape, where it grows, then > there is a major GC. The maximum of the base of the sawtooth is the working > set of heap that your Solr installation needs. Set the heap to that value, > plus a gigabyte or so. We run with a 2GB eden (new space) because so much > of Solr’s allocations have a lifetime of one request. So, the base of the > sawtooth, plus a gigabyte breathing room, plus two more for eden. That > should work. > > I don’t set all the ratios and stuff. When we were running CMS, I set a size > for the heap and a size for the new space. Done. With G1, I don’t even get > that fussy. > > wunder > Walter Underwood > wun...@wunderwood.org > http://observer.wunderwood.org/ (my blog) > > > > On Apr 11, 2017, at 8:22 PM, Shawn Heisey wrote: > > > > On 4/11/2017 2:56 PM, Chetas Joshi wrote: > >> I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. 
Solr collection > >> with number of shards = 80 and replication Factor=2 > >> > >> Solr JVM heap size = 20 GB > >> solr.hdfs.blockcache.enabled = true > >> solr.hdfs.blockcache.direct.memory.allocation = true > >> MaxDirectMemorySize = 25 GB > >> > >> I am querying a solr collection with index size = 500 MB per core. > > > > I see that you and I have traded messages before on the list. > > > > How much total system memory is there per server? How many of these > > 500MB cores are on each server? How many docs are in a 500MB core? The > > answers to these questions may affect the other advice that I give you. > > > >> The off-heap (25 GB) is huge so that it can load the entire index. > > > > I still know very little about how HDFS handles caching and memory. You > > want to be sure that as much data as possible from your indexes is > > sitting in local memory on the server. > > > >> Using cursor approach (number of rows = 100K), I read 2 fields (Total 40 > >> bytes per solr doc) from the Solr docs that satisfy the query. The docs > are sorted by "id" and then by those 2 fields. > >> > >> I am not able to understand why the heap memory is getting full and Full > >> GCs are consecutively running with long GC pauses (> 30 seconds). I am > >> using CMS GC. > > > > A 20GB heap is quite large. Do you actually need it to be that large? > > If you graph JVM heap usage over a long period of time, what are the low > > points in the graph? > > > > A result containing 100K docs is going to be pretty large, even with a > > limited number of fields. It is likely to be several megabytes. It > > will need to be entirely built in the heap memory before it is sent to > > the client -- both as Lucene data structures (which will probably be > > much larger than the actual response due to Java overhead) and as the > > actual response format. Then it will be garbage as soon as the response > > is done. 
Repeat this enough times, and you're going to go through even > > a 20GB heap pretty fast, and need a full GC. Full GCs on a 20GB heap > > are slow. > > > > You could try switching to G1, as long as you realize that you're going > > against advice from Lucene experts but honestly, I do not expect > > this to really help, because you would probably still need full GCs due > > to the rate that garbage is being created. If you do try it, I would > > strongly recommend the latest Java 8, either Oracle or OpenJDK. Here's > > my wiki page where I discuss this: > > > > https://wiki.apache.org/solr/ShawnHeisey#G1_.28Garbage_ > First.29_Collector > > > > Reducing the heap size (which may not be possible -- need to know the > > answer to the question about memory graphing) and reducing the number of > > rows per query are the only quick solutions I can think of. > > > > Thanks, > > Shawn > > > >
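Chetas's jump from 8 GB to 17 GB is easier to picture with a back-of-the-envelope calculation. Below is a rough sketch of the transient allocation for a single 100K-row cursor page merged across 80 shards; the per-entry object overhead is an assumption for illustration, not a measurement of this cluster:

```python
# Rough, illustrative estimate of transient heap churn for one cursor page.
# The overhead figure is an assumption for a 64-bit JVM, not a measurement.

SHARDS = 80
ROWS_PER_PAGE = 100_000          # rows requested per cursor page
PAYLOAD_BYTES_PER_DOC = 40       # the two returned fields, per the post

# Each shard returns up to ROWS_PER_PAGE candidate entries to the
# coordinator for merging, so up to SHARDS * ROWS_PER_PAGE sort entries
# can be in flight at once.
merge_candidates = SHARDS * ROWS_PER_PAGE            # 8,000,000

# Assume ~100 bytes of object overhead per entry (sort-value wrappers,
# doc ids, collection nodes) on top of the raw payload -- a guess.
ASSUMED_OVERHEAD_PER_ENTRY = 100
transient_bytes = merge_candidates * (PAYLOAD_BYTES_PER_DOC + ASSUMED_OVERHEAD_PER_ENTRY)

print(f"merge candidates: {merge_candidates:,}")
print(f"transient heap per page: ~{transient_bytes / 1024**3:.1f} GiB")
```

With two or three generations of this garbage alive at once while the collector lags behind, swings of several gigabytes per page are plausible, which is consistent with the observed sawtooth.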
Re: Long GC pauses while reading Solr docs using Cursor approach
JVM version? We’re running v8 update 121 with the G1 collector and it is working really well. We also have an 8GB heap. Graph your heap usage. You’ll see a sawtooth shape, where it grows, then there is a major GC. The maximum of the base of the sawtooth is the working set of heap that your Solr installation needs. Set the heap to that value, plus a gigabyte or so. We run with a 2GB eden (new space) because so much of Solr’s allocations have a lifetime of one request. So, the base of the sawtooth, plus a gigabyte breathing room, plus two more for eden. That should work. I don’t set all the ratios and stuff. When we were running CMS, I set a size for the heap and a size for the new space. Done. With G1, I don’t even get that fussy. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Apr 11, 2017, at 8:22 PM, Shawn Heisey wrote: > > On 4/11/2017 2:56 PM, Chetas Joshi wrote: >> I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. Solr collection >> with number of shards = 80 and replication Factor=2 >> >> Solr JVM heap size = 20 GB >> solr.hdfs.blockcache.enabled = true >> solr.hdfs.blockcache.direct.memory.allocation = true >> MaxDirectMemorySize = 25 GB >> >> I am querying a solr collection with index size = 500 MB per core. > > I see that you and I have traded messages before on the list. > > How much total system memory is there per server? How many of these > 500MB cores are on each server? How many docs are in a 500MB core? The > answers to these questions may affect the other advice that I give you. > >> The off-heap (25 GB) is huge so that it can load the entire index. > > I still know very little about how HDFS handles caching and memory. You > want to be sure that as much data as possible from your indexes is > sitting in local memory on the server. > >> Using cursor approach (number of rows = 100K), I read 2 fields (Total 40 >> bytes per solr doc) from the Solr docs that satisfy the query. 
The docs are >> sorted by "id" and then by those 2 fields. >> >> I am not able to understand why the heap memory is getting full and Full >> GCs are consecutively running with long GC pauses (> 30 seconds). I am >> using CMS GC. > > A 20GB heap is quite large. Do you actually need it to be that large? > If you graph JVM heap usage over a long period of time, what are the low > points in the graph? > > A result containing 100K docs is going to be pretty large, even with a > limited number of fields. It is likely to be several megabytes. It > will need to be entirely built in the heap memory before it is sent to > the client -- both as Lucene data structures (which will probably be > much larger than the actual response due to Java overhead) and as the > actual response format. Then it will be garbage as soon as the response > is done. Repeat this enough times, and you're going to go through even > a 20GB heap pretty fast, and need a full GC. Full GCs on a 20GB heap > are slow. > > You could try switching to G1, as long as you realize that you're going > against advice from Lucene experts but honestly, I do not expect > this to really help, because you would probably still need full GCs due > to the rate that garbage is being created. If you do try it, I would > strongly recommend the latest Java 8, either Oracle or OpenJDK. Here's > my wiki page where I discuss this: > > https://wiki.apache.org/solr/ShawnHeisey#G1_.28Garbage_First.29_Collector > > Reducing the heap size (which may not be possible -- need to know the > answer to the question about memory graphing) and reducing the number of > rows per query are the only quick solutions I can think of. > > Thanks, > Shawn >
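Walter's sizing rule above reduces to simple arithmetic. A sketch using Chetas's observed 8 GB sawtooth base; the headroom and eden figures are the rule-of-thumb values from the post, not tuned numbers:

```python
# Walter's sizing rule: heap = base of the GC sawtooth (the live working
# set) + ~1 GB breathing room + eden sized for per-request garbage.
# The 8 GB base is the low point Chetas reported after a full GC.

GIB = 1024**3
sawtooth_base = 8 * GIB      # observed low point after full GC
headroom      = 1 * GIB      # breathing room
eden          = 2 * GIB      # new space for short-lived request garbage

recommended_heap = sawtooth_base + headroom + eden
print(f"recommended -Xmx: {recommended_heap // GIB} GB")   # prints "recommended -Xmx: 11 GB"
```

By this rule an 11 GB heap would fit the observed working set, roughly half the 20 GB currently configured.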
Re: Long GC pauses while reading Solr docs using Cursor approach
On 4/11/2017 2:56 PM, Chetas Joshi wrote: > I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. Solr collection > with number of shards = 80 and replication Factor=2 > > Solr JVM heap size = 20 GB > solr.hdfs.blockcache.enabled = true > solr.hdfs.blockcache.direct.memory.allocation = true > MaxDirectMemorySize = 25 GB > > I am querying a solr collection with index size = 500 MB per core. I see that you and I have traded messages before on the list. How much total system memory is there per server? How many of these 500MB cores are on each server? How many docs are in a 500MB core? The answers to these questions may affect the other advice that I give you. > The off-heap (25 GB) is huge so that it can load the entire index. I still know very little about how HDFS handles caching and memory. You want to be sure that as much data as possible from your indexes is sitting in local memory on the server. > Using cursor approach (number of rows = 100K), I read 2 fields (Total 40 > bytes per solr doc) from the Solr docs that satisfy the query. The docs are > sorted by "id" and then by those 2 fields. > > I am not able to understand why the heap memory is getting full and Full > GCs are consecutively running with long GC pauses (> 30 seconds). I am > using CMS GC. A 20GB heap is quite large. Do you actually need it to be that large? If you graph JVM heap usage over a long period of time, what are the low points in the graph? A result containing 100K docs is going to be pretty large, even with a limited number of fields. It is likely to be several megabytes. It will need to be entirely built in the heap memory before it is sent to the client -- both as Lucene data structures (which will probably be much larger than the actual response due to Java overhead) and as the actual response format. Then it will be garbage as soon as the response is done. Repeat this enough times, and you're going to go through even a 20GB heap pretty fast, and need a full GC. 
Full GCs on a 20GB heap are slow. You could try switching to G1, as long as you realize that you're going against advice from Lucene experts but honestly, I do not expect this to really help, because you would probably still need full GCs due to the rate that garbage is being created. If you do try it, I would strongly recommend the latest Java 8, either Oracle or OpenJDK. Here's my wiki page where I discuss this: https://wiki.apache.org/solr/ShawnHeisey#G1_.28Garbage_First.29_Collector Reducing the heap size (which may not be possible -- need to know the answer to the question about memory graphing) and reducing the number of rows per query are the only quick solutions I can think of. Thanks, Shawn
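If you do experiment with G1, the usual place for the flags is the GC_TUNE variable in solr.in.sh. The fragment below is a hypothetical starting point only; the region size and pause target are commonly shared starting values, not tested recommendations for this cluster:

```shell
# Hypothetical GC_TUNE override in solr.in.sh -- a starting point to
# experiment with, not a tuned recommendation.
GC_TUNE=" \
  -XX:+UseG1GC \
  -XX:+ParallelRefProcEnabled \
  -XX:G1HeapRegionSize=8m \
  -XX:MaxGCPauseMillis=250 \
  -XX:InitiatingHeapOccupancyPercent=75"
```

Whatever values you try, graph GC behavior before and after the change rather than trusting any starting point.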
Long GC pauses while reading Solr docs using Cursor approach
Hello, I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. Solr collection with number of shards = 80 and replication Factor=2

Solr JVM heap size = 20 GB
solr.hdfs.blockcache.enabled = true
solr.hdfs.blockcache.direct.memory.allocation = true
MaxDirectMemorySize = 25 GB

I am querying a solr collection with index size = 500 MB per core. The off-heap (25 GB) is huge so that it can load the entire index. Using cursor approach (number of rows = 100K), I read 2 fields (Total 40 bytes per solr doc) from the Solr docs that satisfy the query. The docs are sorted by "id" and then by those 2 fields. I am not able to understand why the heap memory is getting full and Full GCs are consecutively running with long GC pauses (> 30 seconds). I am using CMS GC.

-XX:NewRatio=3 \
-XX:SurvivorRatio=4 \
-XX:TargetSurvivorRatio=90 \
-XX:MaxTenuringThreshold=8 \
-XX:+UseConcMarkSweepGC \
-XX:+UseParNewGC \
-XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 \
-XX:+CMSScavengeBeforeRemark \
-XX:PretenureSizeThreshold=64m \
-XX:+UseCMSInitiatingOccupancyOnly \
-XX:CMSInitiatingOccupancyFraction=50 \
-XX:CMSMaxAbortablePrecleanTime=6000 \
-XX:+CMSParallelRemarkEnabled \
-XX:+ParallelRefProcEnabled

Please guide me in debugging the heap usage issue. Thanks!
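For reference, the cursor read loop under discussion has this shape. The sketch below is in Python with the actual Solr HTTP call stubbed out as a fetch_page callable (a placeholder, not a real Solr client API); a real request would carry cursorMark, a sort clause ending on the uniqueKey field, and rows. The rows value here is exactly the page size that the advice upthread suggests turning down.

```python
# Sketch of Solr's cursorMark pagination protocol, with the network call
# stubbed out so the control flow is visible. fetch_page is a placeholder
# for a real /select request with cursorMark, sort, and rows parameters.

def read_all(fetch_page, page_size=1000):
    """Iterate an entire result set using the cursorMark protocol.

    fetch_page(cursor_mark, rows) must return (docs, next_cursor_mark).
    The loop ends when the same mark comes back that was sent, which is
    how the cursor API signals that the result set is exhausted.
    """
    cursor_mark = "*"                  # initial cursor value
    while True:
        docs, next_mark = fetch_page(cursor_mark, page_size)
        yield from docs
        if next_mark == cursor_mark:   # no more results
            break
        cursor_mark = next_mark

# A fake fetcher over an in-memory "index", just to exercise the loop.
def make_fake_fetcher(all_docs):
    def fetch_page(cursor_mark, rows):
        start = 0 if cursor_mark == "*" else int(cursor_mark)
        page = all_docs[start:start + rows]
        next_mark = cursor_mark if not page else str(start + len(page))
        return page, next_mark
    return fetch_page
```

The key property for this thread: each iteration only ever materializes one page, so shrinking page_size directly shrinks the per-request allocation spike without changing how many documents are ultimately read.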