On 10/16/2017 3:19 PM, Randy Fradin wrote:
> We are seeing a lot of full GC events and eventual OOM errors in Solr
> during indexing. This is Solr 6.5.1 running in cloud mode with a 24G heap.
> At these times indexing is the only activity taking place. The collection
> has 4 shards and 2 replicas across 3 nodes. Each document is ~10KB (a few
> hundred fields each), and indexing is using the normal update handler, 1
> document per request, up to 240 requests at a time.
>
> The heap dump taken automatically on OOM shows 18.3GB of heap taken by 3
> instances of DocumentsWriter. Within those instances, all of the heap is
> retained by the blockedFlushes LinkedList inside the flushControl object.
> Each node in the LinkedList appears to be retaining around 55MB.
>
> Clearly something to do with flushing is at play here but I'm at a loss
> what tuning parameters I should be looking at. I would expect things to
> start blocking if I fall too far behind on flushing but apparently that's
> not happening. The ramBufferSizeMB is set to the default 100. My heap size
> is already absurdly more than I thought we would need for this volume.

One of the first things we need to find out is your index size.

In each of your shards, how many documents are there?  How much disk
space does one shard replica take up?  How many shard replica cores does
each node have on it in total?

I would also like to get a look at your full solrconfig.xml file.  The
schema may be helpful at a later date, along with an example of a
document that you're indexing.  With ramBufferSizeMB at the default,
having a ton of memory used up by a class used for indexing seems very odd.
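
To put rough numbers on why that looks odd, here is a back-of-the-envelope
sketch in Python that uses only the figures from your message. The function
name is made up for illustration, and I am assuming binary units (1GB =
1024MB) and that each DocumentsWriter has its own 100MB ramBufferSizeMB
budget; neither assumption is confirmed by the heap dump.

```python
def estimate_queued_flushes(retained_mb, per_entry_mb, writers):
    """Estimate blockedFlushes entries in total and per DocumentsWriter.

    Hypothetical helper for illustration; it is not part of Solr or Lucene.
    """
    total = retained_mb / per_entry_mb
    return total, total / writers


# Figures quoted in the original message.
RETAINED_MB = 18.3 * 1024   # 18.3GB retained (assumes 1GB = 1024MB)
PER_ENTRY_MB = 55           # approximate size retained per LinkedList node
WRITERS = 3                 # DocumentsWriter instances seen in the dump
RAM_BUFFER_MB = 100         # ramBufferSizeMB default

total, per_writer = estimate_queued_flushes(RETAINED_MB, PER_ENTRY_MB, WRITERS)

# Assumption: one RAM buffer budget per writer, so 3 x 100MB nominal.
nominal_mb = WRITERS * RAM_BUFFER_MB
ratio = RETAINED_MB / nominal_mb

print(f"about {total:.0f} blocked flushes in total, {per_writer:.0f} per writer")
print(f"retained heap is about {ratio:.0f}x the nominal {nominal_mb}MB buffer budget")
```

If those assumptions hold, that is on the order of 340 pending flushes, which
is far more than the configured buffer size alone would explain.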

Do you have the text of the OOM exception? Does it say that you are out
of heap space, or does it report some other problem?

Thanks,
Shawn
