Hi Friso,

Great to know! Todd was the last one to try to crash G1 and the recent 
iteration seemed much more stable. 

Lars

On Nov 29, 2010, at 10:49, Friso van Vollenhoven <fvanvollenho...@xebia.com> 
wrote:

> On a slightly related note, we've been running G1 with default settings 
> on a 16GB heap for some weeks now. It's never given us trouble, so I didn't 
> do any real analysis on the GC times, just some eyeballing.
> 
> I looked at the longer GCs (everything longer than 1 second: grep -C 5 -i 
> real=[1-9] gc-hbase.log), which gives a list of full GCs all around 10s. The 
> minor pauses all appear to be around 0.2s. I can pastebin a GC log if anyone 
> is interested in the G1 behavior.
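
The grep above finds the long collections with context; a slightly fuller sketch for summarizing pause times is below (assumes the HotSpot "[Times: ... real=N.NN secs]" log format; the sample lines written first stand in for a real gc-hbase.log):

```shell
# Sketch: list the ten longest stop-the-world pauses in a HotSpot GC log.
# The printf just creates sample data so the pipeline is runnable;
# point the grep at your actual gc-hbase.log instead.
printf 'real=0.20 secs\nreal=9.81 secs\nreal=1.10 secs\n' > gc-hbase.log

grep -o 'real=[0-9.]*' gc-hbase.log \
  | cut -d= -f2 \
  | sort -rn \
  | head -10        # longest pauses first, in seconds
```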
> 
> 
> 
> Friso
> 
> 
> 
> On 29 nov 2010, at 09:47, Ryan Rawson wrote:
> 
>> I'd love to hear the kinds of minor pauses you get... left to its own
>> devices, 1.6.0_14 or so wants to grow the new gen to 1GB if your
>> -Xmx is large enough, and at that size you are looking at 800ms minor
>> pauses!
>> 
>> It's a tough subject.
>> 
>> -ryan
>> 
>> On Wed, Nov 24, 2010 at 12:52 PM, Sean Sechrist <ssechr...@gmail.com> wrote:
>>> Interesting. The settings we tried earlier today slowed jobs significantly,
>>> but no failures (yet). We're going to try the 512MB NewSize and 60%
>>> CMSInitiatingOccupancyFraction. One-second pauses here and there would be OK
>>> for us... we just want to avoid the long pauses right now. We'll also do
>>> what we can to avoid swapping. The Ganglia metrics are on there.
>>> 
>>> Thanks,
>>> Sean
>>> 
>>> On Wed, Nov 24, 2010 at 3:34 PM, Todd Lipcon <t...@cloudera.com> wrote:
>>> 
>>>> On Wed, Nov 24, 2010 at 7:01 AM, Sean Sechrist <ssechr...@gmail.com> wrote:
>>>> 
>>>>> Hey guys,
>>>>> 
>>>>> I just want to get an idea about how everyone avoids these long GC pauses
>>>>> that cause regionservers to die.
>>>>> 
>>>>> What kind of java heap and garbage collection settings do you use?
>>>>> 
>>>>> What do you do to make sure that the HBase vm never uses swap? I have
>>>>> heard
>>>>> turning off swap altogether can be dangerous, so right now we have the
>>>>> setting vm.swappiness=0. How do you tell if it's using swap? On Ganglia,
>>>>> we
>>>>> see the "CPU wio" metric at around 4.5% before one of our crashes. Is that
>>>>> high?
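
On the "how do you tell if it's using swap" question, a quick Linux-side check is sketched below (an assumption layered on the thread, not advice from it; `vmstat`'s si/so columns are the more telling live signal during a job):

```shell
# System-wide swap in use (Linux): SwapTotal minus SwapFree from
# /proc/meminfo, in kB. Nonzero si/so columns in `vmstat 5` output
# while a job runs would indicate active paging.
used_kb=$(awk '/^SwapTotal:/ {t=$2} /^SwapFree:/ {f=$2} END {print t-f}' /proc/meminfo)
echo "swap in use: ${used_kb} kB"
```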
>>>>> 
>>>>> To try to avoid using too much memory, is reducing the memstore
>>>>> upper/lower
>>>>> limit, or the block cache size a good idea? Should we just tune down
>>>>> HBase's
>>>>> total heap to try to avoid swap?
>>>>> 
>>>>> In terms of our specific problem:
>>>>> 
>>>>> We seem to keep running into garbage collection pauses that cause the
>>>>> regionservers to die. We have a mix of random read jobs, as well as a
>>>>> few
>>>>> full-scan jobs (~1.5 billion rows, 800-900GB of data, 1500 regions), and
>>>>> we
>>>>> are always inserting data. We would rather sacrifice a little speed for
>>>>> stability, if that means anything. We have 7 nodes (RS + DN + TT) with
>>>>> 12GB
>>>>> max heap given to HBase, and 24GB memory total.
>>>>> 
>>>>> We were using the following garbage collection options:
>>>>> -XX:+UseConcMarkSweepGC -XX:NewSize=64m -XX:MaxNewSize=64m
>>>>> -XX:CMSInitiatingOccupancyFraction=75
>>>>> 
>>>>> After looking at http://wiki.apache.org/hadoop/PerformanceTuning, we are
>>>>> trying to lower NewSize/MaxNewSize to 6m as well as reducing
>>>>> CMSInitiatingOccupancyFraction to 50.
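
For reference, wired into hbase-env.sh these options would look something like the sketch below (HBASE_OPTS is the standard HBase hook; the GC-logging flags and log path are illustrative additions, not settings stated in this thread):

```shell
# hbase-env.sh -- the GC settings described above (12 GB heap, CMS,
# small fixed new generation). The -verbose:gc / -Xloggc additions are
# assumed GC-logging flags, not part of the quoted configuration.
export HBASE_OPTS="-Xmx12g \
  -XX:+UseConcMarkSweepGC \
  -XX:NewSize=64m -XX:MaxNewSize=64m \
  -XX:CMSInitiatingOccupancyFraction=75 \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
  -Xloggc:/var/log/hbase/gc-hbase.log"
```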
>>>>> 
>>>> 
>>>> Rather than reducing the new size, you should consider increasing new size
>>>> if you're OK with higher latency but fewer long GC pauses.
>>>> 
>>>> GC is a complicated subject, but here are a few rules of thumb:
>>>> 
>>>> - A larger young generation means that the young GC pauses, which are
>>>> stop-the-world, will take longer. In my experience it's somewhere around 1
>>>> second per GB of new size. So, if you're OK with periodic 1-second pauses,
>>>> a large (1GB) new size should be fine.
>>>> - A larger young generation also means that less data will get tenured to
>>>> the old generation. This means that the old generation will have to collect
>>>> less often and also that it will become less fragmented.
>>>> - In HBase, the long (45+ second) pauses generally happen when promotion
>>>> fails due to heap fragmentation in the old generation. The collector then
>>>> falls back to a stop-the-world compacting collection, which takes a long time.
>>>> 
>>>> So, in general, a large young gen will reduce the frequency of super-long
>>>> pauses, but will increase the frequency of shorter pauses.
>>>> 
>>>> It sounds like you may be OK with longer young gen pauses, so maybe
>>>> consider new size at 512M with your 12G total heap?
>>>> 
>>>> I also wouldn't tune CMSInitiatingOccupancyFraction below 60% - that will
>>>> cause CMS to run nearly constantly, which isn't efficient.
>>>> 
>>>> -Todd
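
Spelled out as flags, the suggestion above would be roughly the sketch below (512m new gen and 60% occupancy are the values from this message; -XX:+UseCMSInitiatingOccupancyOnly is an added assumption that makes the JVM treat the fraction as a hard trigger rather than a hint):

```shell
# Revised settings per the advice above: larger fixed young gen,
# CMS kicking in at 60% old-gen occupancy. Values come from the
# thread; this exact combination is untested here.
export HBASE_OPTS="-Xmx12g \
  -XX:+UseConcMarkSweepGC \
  -XX:NewSize=512m -XX:MaxNewSize=512m \
  -XX:CMSInitiatingOccupancyFraction=60 \
  -XX:+UseCMSInitiatingOccupancyOnly"
```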
>>>> 
>>>> 
>>>>> 
>>>>> We see messages like this in our GC logs:
>>>>> 
>>>>> 2010-11-23T14:56:01.383-0500: 61297.449: [GC 61297.449: [ParNew (promotion
>>>>> failed): 57425K->57880K(59008K), 0.1880950 secs]61297.637:
>>>>> [CMS2010-11-23T14:56:06.336-0500: 61302.402: [CMS-concurrent-mark:
>>>>> 8.844/17.169 secs] [Times: user=75.16 sys=1.34, real=17.17 secs]
>>>>> (concurrent mode failure): 10126729K->5760080K(13246464K), 91.2530340
>>>>> secs]
>>>>> 10181961K->5760080K(13305472K), [CMS Perm : 20252K->20241K(33868K)],
>>>>> 91.4413320 secs] [Times: user=24.47 sys=1.07, real=91.44 secs]
>>>>> 
>>>>> There's a lot of questions there, but I definitely appreciate any advice
>>>>> or
>>>>> input anybody else has. Thanks so much!
>>>>> 
>>>>> -Sean
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Todd Lipcon
>>>> Software Engineer, Cloudera
>>>> 
>>> 
> 
