Do you really need to go to 2048 or 4096 xcievers? Lars George just wrote a blog post on exactly this; it's on the Cloudera site. His formula is used to calculate the number of xcievers HBase needs. Since this number is usually worked out while building a system, you're going to have to estimate it and maybe tune it later. Since there can only be a max of 1500 regions per regionserver, and the number of store files depends on the number of column families per table times the number of regions per table, you can approximate this number fairly well; a rough sketch of that arithmetic follows. Note that this formula also doesn't account for the xcievers Hadoop itself requires. Just some food for thought...
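To make that concrete, here is a small, purely illustrative calculation. It is not the exact formula from Lars's post, and every number is a placeholder you would replace with your own (I borrowed the 166 regions from the abort message further down in this thread):

// Rough, illustrative estimate only; see Lars George's post for the real formula.
public class XceiverEstimate {
  public static void main(String[] args) {
    int regionsPerServer = 166;        // hypothetical; e.g. the RS below reports regions=166
    int familiesPerTable = 2;          // store files exist per column family per region
    int avgStoreFilesPerFamily = 3;    // depends on flush sizes and compaction behaviour

    // Every open store file can pin an xceiver thread on the DataNode, and writers
    // (WAL, flushes, compactions) need some additional headroom on top of that.
    int openStoreFiles = regionsPerServer * familiesPerTable * avgStoreFilesPerFamily;
    int writerHeadroom = 100;          // hypothetical safety margin

    System.out.println("dfs.datanode.max.xcievers should be at least ~"
        + (openStoreFiles + writerHeadroom));
  }
}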
On Mar 20, 2012, at 9:05 AM, Marcos Ortiz wrote:

> What is the size of the regions on your regionservers?
> Consider reading the posts on Cloudera's blog from Todd Lipcon about this topic:
>
> Avoiding Full GCs in HBase with MemStore-Local Allocation Buffers: Part 1
> http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
>
> Avoiding Full GCs in HBase with MemStore-Local Allocation Buffers: Part 2
> http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-2/
>
> Another thing to consider is the advice of Ryan Rawson, Sr. Developer at StumbleUpon and HBase committer:
> * Monitor, monitor, monitor; Ganglia can help you with this.
> * HDFS: set the xciever limit to 2048 and Xmx to 2000m. This avoids a lot of trouble with HDFS, even under heavy load.
> * Give HBase enough RAM.
> * For row imports, a randomized key insert order gives a substantial speedup (tested at StumbleUpon with 9 billion rows).
>
> Look here too:
> http://www.cloudera.com/blog/2011/04/hbase-dos-and-donts/
> http://www.meetup.com/LA-HUG/pages/Video_from_April_13th_HBASE_DO%27S_and_DON%27TS/
>
> Best wishes
>
> On 03/20/2012 06:19 AM, Ferdy Galema wrote:
>> This morning there was a crash that led to the abort of 10 regionservers out of 15. After debugging the logs it seems that an "OutOfMemoryError: Java heap space" occurred on the regionservers. This was because a running job had too large a scanner-caching value combined with retrieving full rows. It makes sense that regionservers cannot cope with several clients (mapred tasks) each requesting 100~200 MB buffers to be filled. The obvious solution is a lower scanner-caching value for jobs that retrieve too much data per row on average.
>>
>> A nice server-side solution would be to dynamically adjust the scanner-caching value when the responses are too large. For example, if a response exceeds 100 MB (configurable), reduce the scanner-caching to half its size.
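The client-side fix Ferdy mentions boils down to this: the caching value is just the number of rows returned per next() RPC, so it can be derived from a per-response byte budget instead of a hard-coded 1000. A minimal sketch, where the 100 MB budget and the ~200 KB average row size are assumptions based on the response sizes in the log below, and setCaching/setCacheBlocks are the usual Scan knobs:

import org.apache.hadoop.hbase.client.Scan;

// Sketch: choose scanner caching from a response-size budget rather than a
// fixed row count. The numbers are illustrative; measure your own row sizes.
public class BoundedScan {
  public static Scan create(long avgRowSizeBytes) {
    long responseBudgetBytes = 100L * 1024 * 1024;  // keep each next() response under ~100 MB
    int caching = (int) Math.max(1, responseBudgetBytes / Math.max(1L, avgRowSizeBytes));

    Scan scan = new Scan();
    scan.setCaching(caching);       // rows fetched per next() call; ~500 for ~200 KB rows
    scan.setCacheBlocks(false);     // commonly recommended for full-table MapReduce scans
    return scan;
  }
}

Ferdy's server-side idea, halving the caching whenever a response blows past the limit, is the same arithmetic applied inside the regionserver's scanner, but that would need a change in HBase itself rather than client configuration.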
>> See log below.
>>
>> 2012-03-20 07:57:40,092 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60020, responseTooLarge for: next(4438820558358059204, 1000) from 172.23.122.15:50218: Size: 105.0m
>> 2012-03-20 07:57:53,226 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020, responseTooLarge for: next(-7429189123174849941, 1000) from 172.23.122.15:50218: Size: 214.4m
>> 2012-03-20 07:57:57,839 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60020, responseTooLarge for: next(-7429189123174849941, 1000) from 172.23.122.15:50218: Size: 103.2m
>> 2012-03-20 07:57:59,442 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60020, responseTooLarge for: next(-7429189123174849941, 1000) from 172.23.122.15:50218: Size: 101.8m
>> 2012-03-20 07:58:20,025 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60020, responseTooLarge for: next(9033159548564260857, 1000) from 172.23.122.15:50218: Size: 107.2m
>> 2012-03-20 07:58:27,273 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020, responseTooLarge for: next(9033159548564260857, 1000) from 172.23.122.15:50218: Size: 100.1m
>> 2012-03-20 07:58:52,783 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60020, responseTooLarge for: next(-8611621895979000997, 1000) from 172.23.122.15:50218: Size: 101.7m
>> 2012-03-20 07:59:02,541 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60020, responseTooLarge for: next(-511305750191148153, 1000) from 172.23.122.15:50218: Size: 120.9m
>> 2012-03-20 07:59:25,346 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60020, responseTooLarge for: next(1570572538285935733, 1000) from 172.23.122.15:50218: Size: 107.8m
>> 2012-03-20 07:59:46,805 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020, responseTooLarge for: next(-727080724379055435, 1000) from 172.23.122.15:50218: Size: 102.7m
>> 2012-03-20 08:00:00,138 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020, responseTooLarge for: next(-3701270248575643714, 1000) from 172.23.122.15:50218: Size: 122.1m
>> 2012-03-20 08:00:21,232 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60020, responseTooLarge for: next(5831907615409186602, 1000) from 172.23.122.15:50218: Size: 157.5m
>> 2012-03-20 08:00:23,199 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60020, responseTooLarge for: next(5831907615409186602, 1000) from 172.23.122.15:50218: Size: 160.7m
>> 2012-03-20 08:00:28,174 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60020, responseTooLarge for: next(5831907615409186602, 1000) from 172.23.122.15:50218: Size: 160.8m
>> 2012-03-20 08:00:32,643 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020, responseTooLarge for: next(5831907615409186602, 1000) from 172.23.122.15:50218: Size: 182.4m
>> 2012-03-20 08:00:36,826 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60020, responseTooLarge for: next(5831907615409186602, 1000) from 172.23.122.15:50218: Size: 237.2m
>> 2012-03-20 08:00:40,850 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020, responseTooLarge for: next(5831907615409186602, 1000) from 172.23.122.15:50218: Size: 212.7m
>> 2012-03-20 08:00:44,736 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60020, responseTooLarge for: next(5831907615409186602, 1000) from 172.23.122.15:50218: Size: 232.9m
>> 2012-03-20 08:00:49,471 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020, responseTooLarge for: next(5831907615409186602, 1000) from 172.23.122.15:50218: Size: 227.2m
>> 2012-03-20 08:00:57,566 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=c15.kalooga.nl,60020,1331900161295, load=(requests=706, regions=166, usedHeap=1505, maxHeap=1995): OutOfMemoryError, aborting
>> java.lang.OutOfMemoryError: Java heap space
>>   at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>>   at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>>   at org.apache.hadoop.hbase.ipc.ByteBufferOutputStream.<init>(ByteBufferOutputStream.java:44)
>>   at org.apache.hadoop.hbase.ipc.ByteBufferOutputStream.<init>(ByteBufferOutputStream.java:37)
>>
>> Ferdy.
>>
>>
>> 2012/3/16 Ferdy Galema <[email protected]>
>>
>>> CPU resources were never a problem; munin shows there is enough idle time. Although the graphs might hide sharp peaks, I'm pretty certain CPU is not a problem.
>>>
>>> After some experiments it seems that disabling swap is indeed the solution to the aborting-regionserver problem. I reverted the GC settings to be more 'default'; currently that is "-XX:+UseConcMarkSweepGC -XX:+UseParNewGC". Overcommit is set to 50%. The consequence of this is that the memory policy must be strictly enforced, to prevent important processes from being killed by the OOM-killer.
>>>
>>> Ferdy.
>>>
>>>
>>> On Fri, Mar 9, 2012 at 5:30 AM, Laxman <[email protected]> wrote:
>>>
>>>> Hi Ferdy,
>>>>
>>>>>> I'm running regionservers with 2GB heap and the following tuning options:
>>>>>> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:NewRatio=16
>>>>>> -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly
>>>>>> -XX:MaxGCPauseMillis=100
>>>> GC with huge heaps may cause longer GC pauses.
>>>>
>>>> But in your case the heap is just 2GB, so this doesn't look like a general problem to me.
>>>>
>>>> [Times: user=1.82 sys=0.31, real=51.38 secs]
>>>>
>>>> User - 1.82 seconds - excluding context switches
>>>> Real - 51.38 seconds - including context switches
>>>>
>>>> Based on these numbers, it looks to me like a problem with CPU underprovisioning.
>>>>
>>>> So, it's worth investigating the following:
>>>>
>>>> - What is the CPU usage at that point in time?
>>>> - Are there any processes consuming all your CPU resources?
>>>>
>>>> Alternatively, you can also try increasing the zookeeper session timeout.
>>>> --
>>>> Regards,
>>>> Laxman
>>>>
>>>>> -----Original Message-----
>>>>> From: Gaojinchao [mailto:[email protected]]
>>>>> Sent: Friday, March 09, 2012 8:29 AM
>>>>> To: [email protected]
>>>>> Subject: re: gc pause killing regionserver
>>>>>
>>>>> We encountered the same thing. We set swappiness to 0, but swap was still being used, so we disabled swap entirely.
>>>>>
>>>>> -----Original Message-----
>>>>> From: [email protected] [mailto:[email protected]] on behalf of Jean-Daniel Cryans
>>>>> Sent: March 9, 2012, 6:29
>>>>> To: [email protected]
>>>>> Subject: Re: gc pause killing regionserver
>>>>>
>>>>> When real CPU time is bigger than user CPU time it very often points to swapping. Even if you think you turned that off, or that there's no possible way you could be swapping, check it again.
>>>>>
>>>>> It could also be that your CPUs were busy doing something else; I've seen crazy context-switching CPUs freezing up my nodes, but in my experience that is not very likely.
>>>>>
>>>>> Setting swappiness to 0 just means the kernel is not going to page anything out until it really needs to, which means it is still possible to swap. The only way to guarantee no swapping whatsoever is to give your system 0 swap space.
>>>>>
>>>>> Regarding that promotion failure, you could try reducing the eden size. Try -Xmn128m.
>>>>>
>>>>> J-D
>>>>>
>>>>> On Sat, Mar 3, 2012 at 5:05 AM, Ferdy Galema <[email protected]> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm running regionservers with 2GB heap and the following tuning options:
>>>>>> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:NewRatio=16
>>>>>> -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly
>>>>>> -XX:MaxGCPauseMillis=100
>>>>>>
>>>>>> A regionserver aborted (YouAreDeadException) and this was printed in the gc logs (everything is shown up until the abort):
>>>>>>
>>>>>> 211663.516: [GC 211663.516: [ParNew: 118715K->13184K(118912K), 0.0445390 secs] 1373940K->1289814K(2233472K), 0.0446420 secs] [Times: user=0.14 sys=0.01, real=0.05 secs]
>>>>>> 211663.686: [GC 211663.686: [ParNew: 118912K->13184K(118912K), 0.0594280 secs] 1395542K->1310185K(2233472K), 0.0595420 secs] [Times: user=0.15 sys=0.00, real=0.06 secs]
>>>>>> 211663.869: [GC 211663.869: [ParNew: 118790K->13184K(118912K), 0.0434820 secs] 1415792K->1331317K(2233472K), 0.0435930 secs] [Times: user=0.13 sys=0.01, real=0.04 secs]
>>>>>> 211667.598: [GC 211667.598: [ParNew (promotion failed): 118912K->118912K(118912K), 0.0225390 secs]211667.621: [CMS: 1330845K->1127914K(2114560K), 51.3610670 secs] 1437045K->1127914K(2233472K), [CMS Perm : 20680K->20622K(34504K)], 51.3838170 secs] [Times: user=1.82 sys=0.31, real=51.38 secs]
>>>>>> 211719.713: [GC 211719.714: [ParNew: 105723K->13184K(118912K), 0.0176130 secs] 1233638K->1149393K(2233472K), 0.0177230 secs] [Times: user=0.07 sys=0.00, real=0.02 secs]
>>>>>> 211719.851: [GC 211719.852: [ParNew: 118912K->13184K(118912K), 0.0281860 secs] 1255121K->1170269K(2233472K), 0.0282970 secs] [Times: user=0.10 sys=0.01, real=0.03 secs]
>>>>>> 211719.993: [GC 211719.993: [ParNew: 118795K->13184K(118912K), 0.0276320 secs] 1275880K->1191268K(2233472K), 0.0277350 secs] [Times: user=0.09 sys=0.00, real=0.03 secs]
>>>>>> 211720.490: [GC 211720.490: [ParNew: 118912K->13184K(118912K), 0.0624650 secs] 1296996K->1210640K(2233472K), 0.0625560 secs] [Times: user=0.15 sys=0.00, real=0.06 secs]
>>>>>> 211720.687: [GC 211720.687: [ParNew: 118702K->13184K(118912K), 0.1651750 secs] 1316159K->1231993K(2233472K), 0.1652660 secs] [Times: user=0.25 sys=0.01, real=0.17 secs]
>>>>>> 211721.038: [GC 211721.038: [ParNew: 118912K->13184K(118912K), 0.0952750 secs] 1337721K->1252598K(2233472K), 0.0953660 secs] [Times: user=0.15 sys=0.00, real=0.09 secs]
>>>>>> Heap
>>>>>>  par new generation total 118912K, used 86199K [0x00002aaaae1f0000, 0x00002aaab62f0000, 0x00002aaab62f0000)
>>>>>>   eden space 105728K, 69% used [0x00002aaaae1f0000, 0x00002aaab293dfa8, 0x00002aaab4930000)
>>>>>>   from space 13184K, 100% used [0x00002aaab4930000, 0x00002aaab5610000, 0x00002aaab5610000)
>>>>>>   to space 13184K, 0% used [0x00002aaab5610000, 0x00002aaab5610000, 0x00002aaab62f0000)
>>>>>>  concurrent mark-sweep generation total 2114560K, used 1239414K [0x00002aaab62f0000, 0x00002aab373f0000, 0x00002aab373f0000)
>>>>>>  concurrent-mark-sweep perm gen total 34504K, used 20728K [0x00002aab373f0000, 0x00002aab395a2000, 0x00002aab3c7f0000)
>>>>>>
>>>>>> Why did a GC take 51 seconds? The machine still had enough memory available, so it could not have been swapping (swappiness is set to 0). Of the 15 regionservers in total, I often see this specific regionserver fail. What do you recommend in this situation?
>>>>>>
>>>>>> Ferdy.
>>>>
>>
>
> --
> Marcos Luis Ortíz Valmaseda (@marcosluis2186)
> Data Engineer at UCI
> http://marcosluis2186.posterous.com
>
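One more knob relevant to the GC-pause part of this thread: Laxman's suggestion of raising the ZooKeeper session timeout. The sketch below only shows the property name; the 120000 ms value is an arbitrary example, the setting normally lives in hbase-site.xml on every node, and ZooKeeper's own tickTime/maxSessionTimeout cap what can actually be negotiated, so those may need raising as well.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Sketch only: a longer session timeout gives a regionserver more headroom to
// survive a long GC pause before ZooKeeper expires its session and the master
// declares it dead (the YouAreDeadException path discussed above).
public class SessionTimeoutExample {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    conf.setInt("zookeeper.session.timeout", 120000); // example value, in milliseconds
    System.out.println("zookeeper.session.timeout = " + conf.get("zookeeper.session.timeout"));
  }
}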
