Hello All,

I'm hoping to get your assistance in debugging what seems like a memory
issue.

I have a Solr index with about 32 million docs.  Each doc is relatively
small but has multiple dynamic fields storing ints.  The initial problem I
had to resolve was that we were running into OOMs (on a 48GB heap, with a
130GB on-disk index).  I narrowed that down to the Lucene FieldCache
filling up the heap because of all the dynamic fields.  To mitigate this, I
enabled docValues in the schema for many of the dynamicField culprits,
which dropped the FieldCache down to almost nothing.
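
For reference, the schema change was along these lines (the field name and
type here are placeholders, not my actual schema):

    <!-- Trie numeric type; supports docValues as of Solr 4.2 -->
    <fieldType name="int" class="solr.TrieIntField" precisionStep="0"
               positionIncrementGap="0"/>

    <!-- dynamic int fields with docValues enabled so sorting/faceting
         reads column-oriented storage instead of the FieldCache -->
    <dynamicField name="*_i" type="int" indexed="true" stored="true"
                  docValues="true"/>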

Now, while re-indexing to take advantage of docValues, I ran into OOMs as
soon as I reached 12 million of the 32 million documents.  Before enabling
docValues, I could load Solr with a 48GB heap and only ran into problems
after enough unique searches had occurred (the normal FieldCache issue).
Now, with docValues, a 48GB heap gives me an OOM after 12 million docs
indexed.  I split the collection into 10 shards, and with 2 nodes (48GB
heap each) I was able to get up to 21 million docs indexed.  I've since had
to spread the 10 shards across 4 nodes, and I'm hoping that will be enough
to get all 32 million docs indexed.  That's 4 x 48GB of heap, which seems
really excessive for an index that was only 132GB pre-docValues.
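
In case it matters, the sharded collection was created with the standard
Collections API, roughly like this (the collection and config names are
placeholders; maxShardsPerNode is set so 10 shards fit on 2 nodes):

    curl "http://localhost:8983/solr/admin/collections?action=CREATE\
    &name=mycollection&numShards=10&replicationFactor=1\
    &maxShardsPerNode=5&collection.configName=myconf"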

I would love some thoughts on whether I'm expecting too much efficiency
from docValues.  I was under the impression that docValues would increase
storage requirements on disk (which they have), but I thought that RAM
usage would go down during both searching (which I haven't tested yet) and
indexing.

Thanks for any assistance anyone can provide.

Gene


