12G is not that huge, so it’s surprising that you’re seeing this problem. However, there are a couple of things to look at:
1> If you’re saying that you have 16G total physical memory and are allocating 12G to Solr, that’s an anti-pattern. See: https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html If at all possible, you should allocate between 25% and 50% of your physical memory to Solr...

2> What garbage collector are you using? G1GC might be a better choice.

> On Oct 6, 2020, at 10:44 AM, matthew sporleder <msporle...@gmail.com> wrote:
>
> Your index is so small that it should easily get cached into OS memory
> as it is accessed. Having a too-big heap is a known problem situation.
>
> https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems#SolrPerformanceProblems-HowmuchheapspacedoIneed?
>
> On Tue, Oct 6, 2020 at 9:44 AM Karol Grzyb <grz...@gmail.com> wrote:
>>
>> Hi Matthew,
>>
>> Thank you for the answer. I cannot reproduce the setup locally, so I'll
>> try to convince them to reduce Xmx; I guess they won't agree to 1GB,
>> but certainly to something less than 12G. We also need a proper dev
>> setup, because for now we can only test on prod or stage, which are
>> difficult to adjust.
>>
>> Is being stuck in GC common behaviour when the index is small compared
>> to the available heap under heavier load? I was more worried about the
>> ratio of heap to total host memory.
>>
>> Regards,
>> Karol
>>
>> On Tue, 6 Oct 2020 at 14:39, matthew sporleder <msporle...@gmail.com> wrote:
>>>
>>> You have a 12G heap for a 200MB index? Can you just try changing Xmx
>>> to, like, 1g?
>>>
>>> On Tue, Oct 6, 2020 at 7:43 AM Karol Grzyb <grz...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm involved in the investigation of an issue with huge GC overhead
>>>> that happens during performance tests on the Solr nodes. The Solr
>>>> version is 6.1. The last tests were done on the staging env, and we
>>>> ran into problems at <100 requests/second.
>>>>
>>>> The size of the index itself is ~200MB (~50K docs).
>>>> The index has small updates every 15 min.
>>>>
>>>> Queries involve sorting and faceting.
>>>>
>>>> I've gathered some heap dumps, and I can see from them that most of
>>>> the heap memory is retained by objects of the following classes:
>>>>
>>>> - org.apache.lucene.search.grouping.term.TermSecondPassGroupingCollector
>>>>   (>4G, 91% of heap)
>>>> - org.apache.lucene.search.grouping.AbstractSecondPassGroupingCollector$SearchGroupDocs
>>>> - org.apache.lucene.search.FieldValueHitQueue$MultiComparatorsFieldValueHitQueue
>>>> - org.apache.lucene.search.TopFieldCollector$SimpleFieldCollector
>>>>   (>3.7G, 76% of heap)
>>>>
>>>> Based on the information above, is there anything generic that can be
>>>> looked at as a source of potential improvement without diving deeply
>>>> into the schema and queries (which may be very difficult to change at
>>>> this moment)? I don't see docValues being enabled; could this help?
>>>> If I read the docs correctly, it's specifically helpful when there is
>>>> a lot of sorting/grouping/faceting.
>>>>
>>>> Additionally, I see that many threads are blocked on LRUCache.get;
>>>> should I recommend switching to FastLRUCache?
>>>>
>>>> Also, I wonder if -Xmx12288m for the Java heap isn't too much for 16G
>>>> of memory? I see some (~5/s) page faults in Dynatrace during the
>>>> biggest traffic.
>>>>
>>>> Thank you very much for any help,
>>>> Kind regards,
>>>> Karol
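On Karol's docValues question: yes, that is the documented remedy for heap pressure from sorting/faceting/grouping, because docValues keeps those per-field structures in memory-mapped files instead of building them on the Java heap. A schema.xml sketch follows; the field names and types are placeholders, not taken from the thread, and enabling docValues on an existing field requires a full reindex:

```xml
<!-- schema.xml fragment (sketch): field names here are hypothetical.
     docValues="true" moves the sort/facet/group data structures off-heap,
     which targets exactly the collector classes seen in the heap dump. -->
<field name="category" type="string" indexed="true" stored="false" docValues="true"/>
<field name="price"    type="tfloat" indexed="true" stored="true"  docValues="true"/>
```

This does require touching the schema, which the thread says is hard right now, but it is the change most directly aimed at the TermSecondPassGroupingCollector/FieldValueHitQueue retention shown in the dumps.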
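On the LRUCache.get contention: LRUCache synchronizes all accesses on a single lock, while FastLRUCache allows concurrent reads, so switching can help when many threads block on get(). A solrconfig.xml sketch, with illustrative sizes only (copy your existing size/autowarm settings rather than these numbers):

```xml
<!-- solrconfig.xml fragment (sketch): swap the cache implementation only.
     FastLRUCache has cheaper concurrent gets at the cost of more
     expensive evictions; sizes below are placeholders. -->
<filterCache      class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
<queryResultCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="64"/>
```

Note that if the node is stuck in GC, cache lock contention is likely a symptom rather than the cause; the heap sizing and docValues changes above are worth trying first.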
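To make points 1> and 2> concrete: both the heap size and the collector can be set in solr.in.sh. The values below are only a sketch for the 16G host described in the thread (4G is the low end of the 25–50% rule of thumb, leaving the rest for the OS page cache that MMapDirectory relies on); the right size for your workload has to be measured, and the G1 pause target is an illustrative starting point, not a recommendation:

```shell
# solr.in.sh fragment (sketch) -- sizes are assumptions for a 16G host,
# not tuned values. SOLR_HEAP sets both -Xms and -Xmx.
SOLR_HEAP="4g"

# Switch from the default collector to G1GC, per point 2> above.
GC_TUNE="-XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250"
```

Leaving 12G of the 16G to the OS means the ~200MB index fits in the page cache many times over, which is exactly what the MMapDirectory article linked above argues for.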