12G is not that huge, so it’s surprising that you’re seeing this problem. However, there are a couple of things to look at:
1> If you’re saying that you have 16G total physical memory and are allocating 12G to Solr, that’s an anti-pattern. See: https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html If at all possible, you should allocate between 25% and 50% of your physical memory to Solr...

2> What garbage collector are you using? G1GC might be a better choice.

> On Oct 6, 2020, at 10:44 AM, matthew sporleder <msporle...@gmail.com> wrote:
>
> Your index is so small that it should easily get cached into OS memory
> as it is accessed. Having a too-big heap is a known problem situation.
>
> https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems#SolrPerformanceProblems-HowmuchheapspacedoIneed?
>
> On Tue, Oct 6, 2020 at 9:44 AM Karol Grzyb <grz...@gmail.com> wrote:
>>
>> Hi Matthew,
>>
>> Thank you for the answer. I cannot reproduce the setup locally, so I'll
>> try to convince them to reduce Xmx; I guess they won't agree to 1GB,
>> but certainly to something less than 12G. We also need a proper dev
>> setup, because for now we can only test on prod or stage, which are
>> difficult to adjust.
>>
>> Is being stuck in GC common behaviour when the index is small compared
>> to the available heap under heavier load? I was more worried about the
>> ratio of heap to total host memory.
>>
>> Regards,
>> Karol
>>
>> On Tue, 6 Oct 2020 at 14:39, matthew sporleder <msporle...@gmail.com> wrote:
>>>
>>> You have a 12G heap for a 200MB index? Can you just try changing Xmx
>>> to, like, 1g?
>>>
>>> On Tue, Oct 6, 2020 at 7:43 AM Karol Grzyb <grz...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm involved in the investigation of an issue with huge GC overhead
>>>> that happens during performance tests on the Solr nodes. The Solr
>>>> version is 6.1. The last tests were done on the staging env, and we
>>>> ran into problems at <100 requests/second.
>>>>
>>>> The size of the index itself is ~200MB (~50K docs).
>>>> The index has small updates every 15 min.
>>>>
>>>> Queries involve sorting and faceting.
>>>>
>>>> I've gathered some heap dumps, and I can see from them that most of
>>>> the heap memory is retained by objects of the following classes:
>>>>
>>>> - org.apache.lucene.search.grouping.term.TermSecondPassGroupingCollector
>>>>   (>4G, 91% of heap)
>>>> - org.apache.lucene.search.grouping.AbstractSecondPassGroupingCollector$SearchGroupDocs
>>>> - org.apache.lucene.search.FieldValueHitQueue$MultiComparatorsFieldValueHitQueue
>>>> - org.apache.lucene.search.TopFieldCollector$SimpleFieldCollector
>>>>   (>3.7G, 76% of heap)
>>>>
>>>> Based on the information above, is there anything generic that can be
>>>> looked at as a source of potential improvement without diving deeply
>>>> into the schema and queries (which may be very difficult to change at
>>>> this moment)? I don't see docValues being enabled; could this help?
>>>> If I read the docs correctly, it's specifically helpful when there is
>>>> a lot of sorting/grouping/faceting.
>>>>
>>>> Additionally, I see that many threads are blocked on LRUCache.get;
>>>> should I recommend switching to FastLRUCache?
>>>>
>>>> Also, I wonder if -Xmx12288m for the Java heap isn't too much for 16G
>>>> of memory? I see some (~5/s) page faults in Dynatrace during the
>>>> biggest traffic.
>>>>
>>>> Thank you very much for any help,
>>>> Kind regards,
>>>> Karol
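On Karol's docValues question: yes, that is the documented remedy for heap pressure from sorting/faceting/grouping, because docValues keeps those per-field structures in memory-mapped files instead of building them on the Java heap. A schema.xml sketch follows; the field names and types are placeholders, not taken from the thread, and enabling docValues on an existing field requires a full reindex:

```xml
<!-- schema.xml fragment (sketch): field names here are hypothetical.
     docValues="true" moves the sort/facet/group data structures off-heap,
     which targets exactly the collector classes seen in the heap dump. -->
<field name="category" type="string" indexed="true" stored="false" docValues="true"/>
<field name="price"    type="tfloat" indexed="true" stored="true"  docValues="true"/>
```

This does require touching the schema, which the thread says is hard right now, but it is the change most directly aimed at the TermSecondPassGroupingCollector/FieldValueHitQueue retention shown in the dumps.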
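On the LRUCache.get contention: LRUCache synchronizes all accesses on a single lock, while FastLRUCache allows concurrent reads, so switching can help when many threads block on get(). A solrconfig.xml sketch, with illustrative sizes only (copy your existing size/autowarm settings rather than these numbers):

```xml
<!-- solrconfig.xml fragment (sketch): swap the cache implementation only.
     FastLRUCache has cheaper concurrent gets at the cost of more
     expensive evictions; sizes below are placeholders. -->
<filterCache      class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
<queryResultCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="64"/>
```

Note that if the node is stuck in GC, cache lock contention is likely a symptom rather than the cause; the heap sizing and docValues changes above are worth trying first.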
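To make points 1> and 2> concrete: both the heap size and the collector can be set in solr.in.sh. The values below are only a sketch for the 16G host described in the thread (4G is the low end of the 25–50% rule of thumb, leaving the rest for the OS page cache that MMapDirectory relies on); the right size for your workload has to be measured, and the G1 pause target is an illustrative starting point, not a recommendation:

```shell
# solr.in.sh fragment (sketch) -- sizes are assumptions for a 16G host,
# not tuned values. SOLR_HEAP sets both -Xms and -Xmx.
SOLR_HEAP="4g"

# Switch from the default collector to G1GC, per point 2> above.
GC_TUNE="-XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=250"
```

Leaving 12G of the 16G to the OS means the ~200MB index fits in the page cache many times over, which is exactly what the MMapDirectory article linked above argues for.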