Thanks Emir and Zisis.

I added maxRamMB to the filterCache and reduced its size. I could see the
benefit immediately: the hit ratio went up to 0.97. Here's the configuration:

<filterCache class="solr.FastLRUCache" size="512" initialSize="512"
             autowarmCount="128" maxRamMB="500" />
<queryResultCache class="solr.LRUCache" size="512" initialSize="512"
                  autowarmCount="128" />
<documentCache class="solr.LRUCache" size="512" initialSize="512"
               autowarmCount="0" />

It seemed stable for a few days; the cache hit ratios and JVM pool
utilization stayed well within the expected range. But the OOM issue occurred
again on one of the nodes when the heap reached 30 GB. The hit ratios for the
query result cache and document cache at that point were 0.18 and 0.65. I'm
not sure the caches caused the memory spike; with the filter cache restricted
to 500 MB, its contribution should be negligible. One thing I noticed is that
the eviction rate (since adding maxRamMB) has stayed at 0. An index hard
commit happens every 10 minutes, and that's when the caches get flushed.
Based on the monitoring log, the spike happened on the indexing side, where
almost 8k docs went into a pending state.
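
For context, the 10-minute hard commit is just the standard autoCommit block
in solrconfig.xml. Ours is along these lines (the exact values are from
memory, and openSearcher=true is my assumption, since it would explain the
caches being flushed on every hard commit):

<autoCommit>
  <!-- hard commit every 10 minutes (600000 ms) -->
  <maxTime>600000</maxTime>
  <!-- reopening the searcher discards the old caches -->
  <openSearcher>true</openSearcher>
</autoCommit>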

From a query performance standpoint, there have been occasional slow queries
(1 sec+), but nothing alarming so far. Same goes for deep paging: I haven't
seen any evidence pointing to it.

Based on the hit ratios, I can further scale down the query result and
document caches, switch them to FastLRUCache, and add maxRamMB there as well
(see the sketch below). For the filter cache, I think the current setting
should be workable on a 30 GB heap, unless I'm wrong about how maxRamMB
works. I'll also have to get a heap dump somehow; unfortunately, the whole
process of the node going down happens so quickly that I hardly have any
time to attach a profiler.
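
Roughly what I have in mind for the two remaining caches; the size and
maxRamMB values are placeholders I'd still tune against the observed hit
ratios:

<queryResultCache class="solr.FastLRUCache" size="256" initialSize="256"
                  autowarmCount="64" maxRamMB="128" />
<documentCache class="solr.FastLRUCache" size="256" initialSize="256"
               autowarmCount="0" maxRamMB="256" />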
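
On the heap dump: one way around the node dying too fast for a profiler
would be to let the JVM write the dump itself on OOM. Assuming we start Solr
through the standard solr.in.sh, something like:

# write a heap dump automatically when the JVM hits OutOfMemoryError
SOLR_OPTS="$SOLR_OPTS -XX:+HeapDumpOnOutOfMemoryError \
-XX:HeapDumpPath=/var/log/solr/heapdumps"

(the dump path is just an example; it would need enough free disk for a
~30 GB file).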


