Re: CMS GC - Old Generation collection never finishes (due to GC Allocation Failure?)

Walter Underwood Wed, 03 Oct 2018 20:22:27 -0700

We run a big cluster with 8 GB heap on the JVMs. When we used CMS, I gave 2 GB
to the new generation. Solr queries make a ton of short-lived allocations. You
want all of that to come from the new gen. I don’t fool around with ratios. I
just set the numbers.


We used these:

-d64
-server
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:+ExplicitGCInvokesConcurrent
-Xms8g
-Xmx8g
-XX:NewSize=2g
-XX:MaxPermSize=256m
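
For comparison with the ratio-based sizing mentioned further down the thread, here is a minimal sketch of the arithmetic, assuming the usual HotSpot definition of -XX:NewRatio as old:young = ratio:1. The class and method names are made up for illustration, and the numbers ignore alignment and any resizing the JVM does on its own.

```java
public class NewGenSizing {
    /** Young generation size implied by -XX:NewRatio (old:young = ratio:1). */
    public static long youngBytesFromRatio(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    /** The old:young ratio that an explicit young-generation size corresponds to. */
    public static double ratioFromYoung(long heapBytes, long youngBytes) {
        return (double) (heapBytes - youngBytes) / youngBytes;
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        // 8 GB heap with a 2 GB new generation leaves 6 GB old: ratio 3.0
        System.out.println(ratioFromYoung(8 * gb, 2 * gb));
        // 32 GB heap with NewRatio=2: roughly a third of the heap is young
        System.out.println(youngBytesFromRatio(32 * gb, 2) / (double) gb);
        // 32 GB heap with NewRatio=3: a quarter of the heap (8.0 GB) is young
        System.out.println(youngBytesFromRatio(32 * gb, 3) / (double) gb);
    }
}
```

So the 8 GB / 2 GB setup above works out to the same split as NewRatio=3, just stated as an absolute size instead of a ratio.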

Now we run G1.
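
After a collector or sizing change like this, the sizes the JVM actually applied can be checked from inside the process. The following is a rough sketch using the standard java.lang.management API; the class name HeapPools is hypothetical, and the pool names it prints depend on the collector (for example "Par Eden Space" and "CMS Old Gen" under CMS, "G1 Eden Space" and "G1 Old Gen" under G1).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public class HeapPools {
    /** Returns one line per heap memory pool with its maximum size. */
    public static List<String> describeHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache, and other non-heap pools
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long max = usage.getMax(); // -1 means the maximum is undefined
            String maxText = max < 0 ? "undefined" : (max / (1024 * 1024)) + " MB";
            lines.add(pool.getName() + ": max " + maxText);
        }
        return lines;
    }

    public static void main(String[] args) {
        describeHeapPools().forEach(System.out::println);
    }
}
```

Running it with the same flags as the Solr JVM shows whether the new generation came out at the intended size.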

This is a cluster with 25 million documents, 8 shards, and 48 nodes with 36
CPUs each. Queries average 25 terms, which uses a lot of CPU.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Oct 3, 2018, at 6:56 PM, Jeff Courtade <courtadej...@gmail.com> wrote:
> 
> We use 4.3.0. I found that we went into GC hell, as you describe, with a
> small newgen. We use CMS GC as well.
> 
> Using -XX:NewRatio=2 got us out of that; 3 wasn't enough. That is with a
> heap of only 32 gig.
> I have not gone over 32 gig, as testing showed diminishing returns beyond
> that. I was only brave enough to go to 40, though.
> 
> On Wed, Oct 3, 2018, 5:34 PM Shawn Heisey <apa...@elyograg.org> wrote:
> 
>> On 10/3/2018 8:01 AM, yasoobhaider wrote:
>>> Master and slave config:
>>> ram: 120GB
>>> cores: 16
>>> 
>>> At any point there are between 10-20 slaves in the cluster, each
>>> serving ~2k requests per minute. Each slave houses two collections of
>>> approx 10G (~2.5mil docs) and 2G (10mil docs) when optimized.
>>> 
>>> I am working with Solr 6.2.1
>>> 
>>> Solr configuration:
>> <snip>
>>> -Xmn10G
>>> -Xms80G
>>> -Xmx80G
>> 
>> I cannot imagine that an 80GB heap is needed when there are only 12.5
>> million documents and 12GB of index data.  I've handled MUCH larger
>> indexes with only 8GB of heap.  Even with your very high query rate, if
>> you really do need 80GB of heap, there's something unusual going on.
>> 
>>> I would really be grateful for any advice on the following:
>>> 
>>> 1. What could be the reason behind CMS not being able to free up the
>>> memory? What are some experiments I can run to solve this problem?
>> 
>> Maybe there's no garbage in the heap to free up?  If the GC never
>> finishes, that sounds like a possible problem with either Java or the
>> operating system, maybe even some kind of hardware issue.
>> 
>>> 2. Can stopping/starting indexing be a reason for such drastic changes
>>> to the GC pattern?
>> 
>> Indexing generally requires more heap than just handling queries.
>> 
>>> 3. I have read in multiple places on this mailing list that the heap size
>>> should be much lower (2x-3x the size of the collection), but the last
>>> time I tried, CMS was not able to run smoothly and GC STW pauses would
>>> occur, which was only solved by a restart. My reasoning for this is that
>>> the type of queries and the throughput are also a factor in deciding the
>>> heap size, so it may be that our queries are creating too many objects.
>>> Is my reasoning correct, or should I try with a lower heap size (if it
>>> helps achieve a stable GC pattern)?
>> 
>> Do you have a GC log covering a good long runtime, where the problems
>> happened during the time the log covers?  Can you share it?  Attachments
>> rarely make it to the list, you'll need to find a file sharing site.
>> The small excerpt from the GC log that you included in your message
>> isn't enough to make any kind of determination.  Full disclosure:  I'm
>> going to send your log to http://gceasy.io for analysis.  You can do
>> this yourself, their analysis is really good.
>> 
>> There is no generic advice possible regarding how large a heap you
>> need.  It will depend on many factors.
>> 
>>> (4. Silly question, but what is the right way to ask a question on the
>>> mailing list? Via mail or via the nabble website? I sent this question
>>> earlier as a mail, but it was not showing up on the nabble website so I
>>> am posting it from the website now)
>> 
>> Nabble mirrors the mailing list in forum format.  It's generally better
>> to use the mailing list directly.  The project has absolutely no
>> influence over the Nabble website, and things do not always work
>> correctly when Nabble is involved.  The IRC channel is another good way
>> to get support.  If there is somebody paying attention when you ask your
>> question, a far more interactive chat can be obtained.
>> 
>> Thanks,
>> Shawn
>> 
>>
