Re: Excessive Heap Usage from docValues?
On Wed, 2014-03-19 at 22:01 +0100, tradergene wrote:

> I have a Solr index with about 32 million docs. Each doc is relatively small but has multiple dynamic fields that are storing INTs. The initial problem that I had to resolve is that we were running into OOMs (on a 48GB heap, 130GB on-disk index). I narrowed that issue down to Lucene FieldCache filling up the heap due to all the dynamic fields.

48GB heap for a 130GB, 32M docs index sounds excessive.

Could you tell us how many unique fields your searcher uses in total for faceting and maybe the overall layout of your index? Is this perhaps a case of many distinct groups of data put in the same index, where the searches are always within a single group and each group has its own fields for faceting? Are the fields single- or multi-valued?

- Toke Eskildsen, State and University Library, Denmark
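The question about the number of unique fields matters because FieldCache cost grows with the number of documents times the number of fields touched. As a rough back-of-the-envelope sketch (not a measurement from this index), the following assumes one uninverted 4-byte slot per document per INT field; the real per-value cost depends on the Lucene version and on how the values are packed, so both the function names and the `bytes_per_value` default are illustrative assumptions.

```python
# Rough upper-bound estimate of FieldCache heap use for numeric fields.
# ASSUMPTION: each cached INT field costs 4 bytes per document in the index,
# whether or not the document has a value. Actual Lucene versions may pack
# values more tightly, so treat the results as an order-of-magnitude guide.

def fieldcache_bytes(num_docs: int, num_int_fields: int, bytes_per_value: int = 4) -> int:
    """Estimated heap bytes if num_int_fields fields are all loaded into the cache."""
    return num_docs * num_int_fields * bytes_per_value


def fields_to_fill_heap(heap_bytes: int, num_docs: int, bytes_per_value: int = 4) -> int:
    """Number of fully cached fields that would fit in heap_bytes (ignores all other heap use)."""
    return heap_bytes // (num_docs * bytes_per_value)


NUM_DOCS = 32_000_000        # document count from the thread
HEAP_BYTES = 48 * 1024**3    # 48GB heap from the thread

print(fieldcache_bytes(NUM_DOCS, 1))              # 128000000 bytes, about 122 MiB per field
print(fields_to_fill_heap(HEAP_BYTES, NUM_DOCS))  # 402 fields
```

Under these assumptions, a few hundred distinct dynamic fields being sorted or faceted on would be enough to exhaust the heap on their own, which is why the total field count is worth reporting.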
Excessive Heap Usage from docValues?
Hello All,

I'm hoping to get your assistance in debugging what seems like a memory issue.

I have a Solr index with about 32 million docs. Each doc is relatively small but has multiple dynamic fields that are storing INTs. The initial problem that I had to resolve is that we were running into OOMs (on a 48GB heap, 130GB on-disk index). I narrowed that issue down to Lucene FieldCache filling up the heap due to all the dynamic fields. To mitigate this, I enabled docValues in the schema for many of the dynamicField culprits. This dropped the FieldCache down to almost nothing.

Now, when re-indexing for docValues functionality, I ran into OOMs as soon as I reached 12 million of the 32 million documents. Before enabling docValues, I was able to load up Solr on a 48GB heap but ran into problems after enough unique searches occurred (the normal FieldCache issue). Now, with docValues, a 48GB heap is giving me OOM after 12 million docs indexed.

I split the collection into 10 shards and, with 2 nodes (48GB heap each), was able to get up to 21 million docs indexed. Now I've had to move the shards to more nodes, and I'm up to 10 shards across 4 nodes and hoping to be able to get all 32 million docs indexed. This will be 48GB x 4 heap, which seems really excessive for an index that was only 132GB pre-docValues.

I would love some thoughts as to whether I'm expecting too much efficiency with docValues enabled. I was under the impression that docValues would increase storage requirements on disk (which it has), but I thought that RAM usage would go down during searching (which I haven't tested) as well as during indexing.

Thanks for any assistance anyone can provide.

Gene

-- View this message in context: http://lucene.472066.n3.nabble.com/Excessive-Heap-Usage-from-docValues-tp4125577.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Excessive Heap Usage from docValues?
Hi,

Which type of doc values? See Wiki or reference guide for a list of types.

Otis
Solr ElasticSearch Support
http://sematext.com/