Thanks for sharing. Have you looked at
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired , as
suggested below?
I think the zookeeper timeout should be watched more closely.

On Wed, Apr 3, 2013 at 11:21 AM, Pablo Musa <[email protected]> wrote:
> Hello guys,
> I stopped my research on the HBase ZK timeout for a while due to
> other issues I had to handle, but I am back.
>
> A very weird behavior I would like your comments on is that my
> RegionServers perform better (fewer crashes) under heavy load than
> under light load.
> That is, if I leave HBase alone with 50 requests per second over the
> whole day, the crashes are more frequent than if I run a mapred job
> every hour.
>
> Another weird thing is the following:
>
> RS startTime = Mon Apr 01 13:22:35 BRT 2013
>
> [...]$ grep slept /var/log/hbase/hbase-hbase-regionserver-PSLBHDN00*.log
> 2013-04-01 20:09:21,135 WARN org.apache.hadoop.hbase.util.Sleeper: We
> slept 45491ms instead of 3000ms, this is likely due to a long garbage
> collecting pause and it's usually bad, see
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> 2013-04-01 22:45:59,407 WARN org.apache.hadoop.hbase.util.Sleeper: We
> slept 101271ms instead of 3000ms, this is likely due to a long garbage
> collecting pause and it's usually bad, see
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
>
> [...]$ egrep 'real=[1-9][0-9][0-9][0-9]*\.[0-9][0-9]' \
>        /var/log/hbase/hbase-hbase-regionserver-PSLBHDN00*.out
> * the report below is the above command adjusted for each time range:
> 0.0 - 0.1  secs  GCs = 5084
> 0.1 - 0.5  secs  GCs = 9
> 0.5 - 1.0  secs  GCs = 3
> 1.0 - 10   secs  GCs = 0
> 10  - 100  secs  GCs = 0
> 100 - 1000 secs  GCs = 0
>
> So, my script that extracts every GC time ("real=... secs") says that
> no GC took longer than 1 second.
> However, the RS log says twice that the RS slept more than 40 seconds
> instead of 3.
>
> "this is likely due to a long garbage collecting pause" -- yes, that
> is likely, but I don't think it is the case here.
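>
> (A variant of that scan that would also catch pauses dominated by
> system time -- relevant to the "sys=" figures that show up later in
> this thread -- might look like the sketch below. It assumes the same
> "[Times: user=... sys=..., real=... secs]" log format and flags any
> entry whose sys or real time reaches one second:)
>
> [...]$ egrep 'sys=[1-9][0-9]*\.[0-9][0-9]|real=[1-9][0-9]*\.[0-9][0-9]' \
>        /var/log/hbase/hbase-hbase-regionserver-PSLBHDN00*.out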
>
> The machine is a huge machine with 70GB of RAM and 32 processors,
> under light load, with no swap or iowait.
>
> Any ideas?
>
> Thanks,
> Pablo
>
> On 03/12/2013 12:43 PM, Pablo Musa wrote:
>> Guys,
>> thank you very much for the help.
>>
>> Yesterday I spent 14 hours trying to tune the whole cluster.
>> The cluster is not ready yet and needs a lot of tuning, but at least
>> it is working.
>>
>> My first big problem was namenode + datanode GC. They were not using
>> CMS and thus were taking "incremental" time to run: it started at
>> 0.01 ms and within 20 minutes was taking 150 secs.
>> After enabling the CMS GC this time is much smaller, with a maximum
>> of 70 secs, which is VERY HIGH, but for now does not stop HBase.
>>
>> With this issue solved, it was clear that the RS was hitting a long
>> GC pause, taking up to 220 secs. Zookeeper expired the RS and it
>> shut down.
>> I tried a lot of different flag configurations (MORE than 20) and
>> could not get small GCs. Eventually one would take more than 150
>> secs (the zookeeper timeout) and the RS would shut down.
>>
>> Finally I tried a config that so far, 12 hours, is working with a
>> maximum GC time of 90 secs. Which of course is still a terrible
>> problem, since HBase is a database, but at least the cluster is
>> stable while I tune it a little more.
>>
>> In my opinion, my biggest problem is having a few "monster" machines
>> in the cluster instead of a bunch of commodity machines. I don't
>> know if there are a lot of companies using this kind of machine
>> inside a hadoop cluster, but a quick search on google did not turn
>> up much on tuning GCs for big heaps.
>>
>> I guess my next step will be to research big-heap GC tuning.
>>
>> Back to some questions ;)
>>
>> > You have ganglia or tsdb running?
>>
>> I use zabbix for now, and no, there is nothing going on when the big
>> pause happens.
>>
>> > When you see the big pause above, can you see anything going on on
>> > the machine? (swap, iowait, concurrent fat mapreduce job?)
>> > what are you doing during long GC happened? read or write? if
>> > reading, what the block cache size?
>>
>> The CPU for the RS process goes to 100% and the logs "pause" until
>> it gets out.
>> Ex: [NewPar
>>
>> IO and SWAP are normal. There is no MR running, just normal database
>> load, which is very low. I am probably doing reads AND writes to the
>> database, with the default block cache size.
>> One problem at this moment might be the big number of regions (1252),
>> since I am using only one RS to be able to track the problem.
>>
>> The links and ideas were very helpful. Thank you very much guys.
>>
>> I will post my future research as I find a solution ;)
>>
>> If you have more ideas or info (links, flag suggestions, etc.),
>> please post it :)
>>
>> Abs,
>> Pablo
>>
>> On 03/10/2013 11:24 PM, Andrew Purtell wrote:
>>> Be careful with GC tuning; throwing changes at an application
>>> without analysis of what is going on with the heap is shooting in
>>> the dark. One particularly good treatment of the subject is here:
>>> http://java.dzone.com/articles/how-tame-java-gc-pauses
>>>
>>> If you have made custom changes to blockcache or memstore
>>> configurations, back them out until you're sure everything else
>>> is ok.
>>>
>>> Watch carefully for swapping. Set the vm.swappiness sysctl to 0.
>>> Monitor for spikes in page scanning or any swap activity. Nothing
>>> brings on "Juliette" pauses better than a JVM partially swapped
>>> out. The Java GC starts collection by examining the oldest pages,
>>> and those are the first pages the OS swaps out...
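>>>
>>> (A sketch of that change, assuming root on a stock el6 box like
>>> the ones described in this thread:)
>>>
>>> # take effect immediately
>>> sysctl -w vm.swappiness=0
>>> # persist across reboots
>>> echo 'vm.swappiness = 0' >> /etc/sysctl.conf
>>> # then watch the si/so columns for any swap activity
>>> vmstat 5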
>>>
>>> On Mon, Mar 11, 2013 at 10:13 AM, Azuryy Yu <[email protected]> wrote:
>>>> Hi Pablo,
>>>> Such a long minor GC is terrible. I don't see any swapping in your
>>>> vmstat log, but I would suggest:
>>>> 1) add the following JVM options:
>>>> -XX:+DisableExplicitGC -XX:+UseCompressedOops -XX:GCTimeRatio=19
>>>> -XX:SoftRefLRUPolicyMSPerMB=0 -XX:SurvivorRatio=2
>>>> -XX:MaxTenuringThreshold=3 -XX:+UseFastAccessorMethods
>>>> 2) -Xmn is too small; your total Mem is 74GB, so make it -Xmn2g
>>>> 3) what are you doing when the long GC happens? read or write? if
>>>> reading, what is the block cache size?
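>>>>
>>>> (Wired into conf/hbase-env.sh, those options might look like the
>>>> sketch below; HBASE_REGIONSERVER_OPTS is the standard hook, and
>>>> -Xmn2g is the value suggested above, not a tested recommendation:)
>>>>
>>>> export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
>>>>   -Xmn2g \
>>>>   -XX:+DisableExplicitGC -XX:+UseCompressedOops -XX:GCTimeRatio=19 \
>>>>   -XX:SoftRefLRUPolicyMSPerMB=0 -XX:SurvivorRatio=2 \
>>>>   -XX:MaxTenuringThreshold=3 -XX:+UseFastAccessorMethods"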
>>>>
>>>> On Mon, Mar 11, 2013 at 6:41 AM, Stack <[email protected]> wrote:
>>>>> You could increase your zookeeper session timeout to 5 minutes
>>>>> while you are figuring out why these long pauses happen:
>>>>> http://hbase.apache.org/book.html#zookeeper.session.timeout
>>>>>
>>>>> Above, there is an outage for almost 5 minutes:
>>>>>
>>>>>> We slept 225100ms instead of 3000ms, this is likely due to a long
>>>>>
>>>>> You have ganglia or tsdb running? When you see the big pause
>>>>> above, can you see anything going on on the machine? (swap,
>>>>> iowait, concurrent fat mapreduce job?)
>>>>>
>>>>> St.Ack
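>>>>>
>>>>> (In hbase-site.xml that would be an entry like the sketch below;
>>>>> note that the zookeeper server caps sessions at its
>>>>> maxSessionTimeout, so that may need raising too:)
>>>>>
>>>>> <property>
>>>>>   <name>zookeeper.session.timeout</name>
>>>>>   <value>300000</value>
>>>>> </property>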
>>>>>
>>>>> On Sun, Mar 10, 2013 at 3:29 PM, Pablo Musa <[email protected]> wrote:
>>>>>> Hi Sreepathi,
>>>>>> as they say in the book (or on the site), we could try it to see
>>>>>> whether it is really a timeout error or something more. But it
>>>>>> is not recommended for production environments.
>>>>>>
>>>>>> I could give it a try if five minutes would assure us that the
>>>>>> problem is the GC or elsewhere!! Anyway, I think it is hard to
>>>>>> believe a GC is taking 2:30 minutes.
>>>>>>
>>>>>> Abs,
>>>>>> Pablo
>>>>>>
>>>>>> On 03/10/2013 04:06 PM, Sreepathi wrote:
>>>>>>> Hi Stack/Ted/Pablo,
>>>>>>>
>>>>>>> Should we increase the hbase.rpc.timeout property to 5 minutes
>>>>>>> (300000 ms)?
>>>>>>>
>>>>>>> Regards,
>>>>>>> - Sreepathi
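>>>>>>>
>>>>>>> (For reference, that would be an hbase-site.xml entry like the
>>>>>>> sketch below -- assuming the RPCs, rather than the zk session,
>>>>>>> are what is timing out:)
>>>>>>>
>>>>>>> <property>
>>>>>>>   <name>hbase.rpc.timeout</name>
>>>>>>>   <value>300000</value>
>>>>>>> </property>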
>>>>>>>
>>>>>>> On Sun, Mar 10, 2013 at 11:59 AM, Pablo Musa <[email protected]> wrote:
>>>>>>>>> That combo should be fine.
>>>>>>>>
>>>>>>>> Great!!
>>>>>>>>
>>>>>>>>> If JVM is full GC'ing, the application is stopped.
>>>>>>>>> The below does not look like a full GC but that is a long
>>>>>>>>> pause in system time, enough to kill your zk session.
>>>>>>>>
>>>>>>>> Exactly. This pause is really making zk expire the RS, which
>>>>>>>> then shuts down (logs at the end of the email).
>>>>>>>> But the question is: what is causing this pause??!!
>>>>>>>>
>>>>>>>>> You swapping?
>>>>>>>>
>>>>>>>> I don't think so (stats below).
>>>>>>>>
>>>>>>>>> Hardware is good?
>>>>>>>>
>>>>>>>> Yes, it is a 16-processor machine with 74GB of RAM and plenty
>>>>>>>> of disk space. Below are some metrics I have heard about. Hope
>>>>>>>> it helps.
>>>>>>>>
>>>>>>>> ** I am having some problems with the datanodes[1], which are
>>>>>>>> having trouble writing. I really think the issues are related,
>>>>>>>> but I cannot solve either of them :(
>>>>>>>>
>>>>>>>> Thanks again,
>>>>>>>> Pablo
>>>>>>>>
>>>>>>>> [1] http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-user/201303.mbox/%3CCAJzooYfS-F1KS+jGOPUt15PwFjcCSzigE0APeM9FXaCrqfv...@mail.gmail.com%3E
>>>>>>>>
>>>>>>>> top - 15:38:04 up 297 days, 21:03,  2 users,  load average: 4.34, 2.55, 1.28
>>>>>>>> Tasks: 528 total,   1 running, 527 sleeping,   0 stopped,   0 zombie
>>>>>>>> Cpu(s):  0.1%us,  0.2%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>>>>>>>> Mem:  74187256k total, 29493992k used, 44693264k free,  5836576k buffers
>>>>>>>> Swap: 51609592k total,   128312k used, 51481280k free,  1353400k cached
>>>>>>>>
>>>>>>>> ]$ vmstat -w
>>>>>>>> procs ---------------------memory-------------------- ---swap-- -----io---- --system-- -----cpu------
>>>>>>>>  r  b    swpd      free     buff    cache   si   so   bi   bo   in   cs  us sy id wa st
>>>>>>>>  2  0  128312  32416928  5838288  5043560    0    0  202   53    0    0   2  1 96  1  0
>>>>>>>>
>>>>>>>> ]$ sar
>>>>>>>> 02:20:01 PM  all  26.18  0.00  2.90  0.63  0.00  70.29
>>>>>>>> 02:30:01 PM  all   1.66  0.00  1.25  1.05  0.00  96.04
>>>>>>>> 02:40:01 PM  all  10.01  0.00  2.14  0.75  0.00  87.11
>>>>>>>> 02:50:01 PM  all   0.76  0.00  0.80  1.03  0.00  97.40
>>>>>>>> 03:00:01 PM  all   0.23  0.00  0.30  0.71  0.00  98.76
>>>>>>>> 03:10:01 PM  all   0.22  0.00  0.30  0.66  0.00  98.82
>>>>>>>> 03:20:01 PM  all   0.22  0.00  0.31  0.76  0.00  98.71
>>>>>>>> 03:30:01 PM  all   0.24  0.00  0.31  0.64  0.00  98.81
>>>>>>>> 03:40:01 PM  all   1.13  0.00  2.97  1.18  0.00  94.73
>>>>>>>> Average:     all   3.86  0.00  1.38  0.88  0.00  93.87
>>>>>>>>
>>>>>>>> ]$ iostat
>>>>>>>> Linux 2.6.32-220.7.1.el6.x86_64 (PSLBHDN002)  03/10/2013  _x86_64_  (16 CPU)
>>>>>>>>
>>>>>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>>>>>            1.86    0.00    0.96    0.78    0.00   96.41
>>>>>>>>
>>>>>>>> Device:  tps  Blk_read/s  Blk_wrtn/s     Blk_read     Blk_wrtn
>>>>>>>> sda     1.23       20.26       23.53    521533196    605566924
>>>>>>>> sdb     6.51      921.55      241.90  23717850730   6225863488
>>>>>>>> sdc     6.22      921.83      236.41  23725181162   6084471192
>>>>>>>> sdd     6.25      925.13      237.26  23810004970   6106357880
>>>>>>>> sde     6.19      913.90      235.60  23521108818   6063722504
>>>>>>>> sdh     6.26      933.08      237.77  24014594546   6119511376
>>>>>>>> sdg     6.18      914.36      235.31  23532747378   6056257016
>>>>>>>> sdf     6.24      923.66      235.33  23772251810   6056604008
>>>>>>>>
>>>>>>>> Some more logging which reinforces that the RS crash is
>>>>>>>> happening because of a timeout. However, this time the GC log
>>>>>>>> does not show a big pause.
>>>>>>>>
>>>>>>>> #####RS LOG#####
>>>>>>>> 2013-03-10 15:37:46,712 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 257739ms for sessionid 0x13d3c4bcba6014a, closing socket connection and attempting reconnect
>>>>>>>> 2013-03-10 15:37:46,712 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 226785ms for sessionid 0x13d3c4bcba60149, closing socket connection and attempting reconnect
>>>>>>>> 2013-03-10 15:37:46,712 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=61.91 MB, free=1.94 GB, max=2 GB, blocks=1254, accesses=60087, hits=58811, hitRatio=97.87%, , cachingAccesses=60069, cachingHits=58811, cachingHitsRatio=97.90%, , evictions=0, evicted=0, evictedPerRun=NaN
>>>>>>>> 2013-03-10 15:37:46,712 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 225100ms instead of 3000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
>>>>>>>> 2013-03-10 15:37:46,714 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block BP-43236042-172.17.2.10-1362490844340:blk_-6834190810033122569_25150229
>>>>>>>> java.io.EOFException: Premature EOF: no length prefix available
>>>>>>>>         at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:171)
>>>>>>>>         at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:114)
>>>>>>>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:670)
>>>>>>>> 2013-03-10 15:37:46,716 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting call get([B@7caf69ed, {"timeRange":[0,9223372036854775807],"totalColumns":1,"cacheBlocks":true,"families":{"details":["ALL"]},"maxVersions":1,"row":"\\x00\\x00\\x00\\x00\\x00\\x12\\x93@"}), rpc version=1, client version=29, methodsFingerPrint=1891768260 from 172.17.1.71:51294 after 0 ms, since caller disconnected
>>>>>>>>
>>>>>>>> #####GC LOG#####
>>>>>>>> 2716.635: [GC 2716.635: [ParNew: 57570K->746K(59008K), 0.0046530 secs] 354857K->300891K(1152704K), 0.0047370 secs] [Times: user=0.03 sys=0.00, real=0.00 secs]
>>>>>>>> 2789.478: [GC 2789.478: [ParNew: 53226K->1192K(59008K), 0.0041370 secs] 353371K->301337K(1152704K), 0.0042220 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
>>>>>>>> 2868.435: [GC 2868.435: [ParNew: 53672K->740K(59008K), 0.0041570 secs] 353817K->300886K(1152704K), 0.0042440 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
>>>>>>>> 2920.309: [GC 2920.309: [ParNew: 53220K->6202K(59008K), 0.0058440 secs] 353366K->310443K(1152704K), 0.0059410 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
>>>>>>>> 2984.239: [GC 2984.239: [ParNew: 58680K->5282K(59008K), 0.0048070 secs] 362921K->310977K(1152704K), 0.0049180 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
>>>>>>>> 3050.620: [GC 3050.620: [ParNew: 57524K->5292K(59008K), 0.0055510 secs] 363219K->313384K(1152704K), 0.0056520 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
>>>>>>>> 3111.331: [GC 3111.331: [ParNew: 57542K->5637K(59008K), 0.0056900 secs] 365634K->318079K(1152704K), 0.0057950 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
>>>>>>>> 3163.510: [GC 3163.510: [ParNew: 58117K->1138K(59008K), 0.0046540 secs] 370559K->316086K(1152704K), 0.0047470 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
>>>>>>>> 3217.642: [GC 3217.643: [ParNew: 53618K->1001K(59008K), 0.0038530 secs] 368566K->315949K(1152704K), 0.0039450 secs] [Times: user=0.03 sys=0.00, real=0.00 secs]
>>>>>>>> 3464.067: [GC 3464.067: [ParNew: 53481K->5348K(59008K), 0.0068370 secs] 368429K->322343K(1152704K), 0.0069170 secs] [Times: user=0.07 sys=0.01, real=0.00 secs]
>>>>>>>> 3465.648: [GC 3465.648: [ParNew: 57828K->1609K(59008K), 0.0111520 secs] 374823K->321496K(1152704K), 0.0112170 secs] [Times: user=0.07 sys=0.00, real=0.01 secs]
>>>>>>>> 3466.291: [GC 3466.291: [ParNew: 54089K->2186K(59008K), 0.0093190 secs] 373976K->322073K(1152704K), 0.0093800 secs] [Times: user=0.06 sys=0.00, real=0.01 secs]
>>>>>>>> Heap
>>>>>>>>  par new generation   total 59008K, used 15497K [0x00000005fae00000, 0x00000005fee00000, 0x00000005fee00000)
>>>>>>>>   eden space 52480K,  25% used [0x00000005fae00000, 0x00000005fbaff8c8, 0x00000005fe140000)
>>>>>>>>   from space 6528K,   33% used [0x00000005fe140000, 0x00000005fe362b70, 0x00000005fe7a0000)
>>>>>>>>   to   space 6528K,    0% used [0x00000005fe7a0000, 0x00000005fe7a0000, 0x00000005fee00000)
>>>>>>>>  concurrent mark-sweep generation total 1093696K, used 319887K [0x00000005fee00000, 0x0000000641a10000, 0x00000007fae00000)
>>>>>>>>  concurrent-mark-sweep perm gen total 30464K, used 30427K [0x00000007fae00000, 0x00000007fcbc0000, 0x0000000800000000)
>>>>>>>>
>>>>>>>> On 03/08/2013 07:02 PM, Stack wrote:
>>>>>>>>> On Fri, Mar 8, 2013 at 10:58 AM, Pablo Musa <[email protected]> wrote:
>>>>>>>>>>> 0.94 currently doesn't support hadoop 2.0
>>>>>>>>>>> Can you deploy hadoop 1.1.1 instead ?
>>>>>>>>>>
>>>>>>>>>> I am using cdh4.2.0, which uses this version as the default
>>>>>>>>>> installation.
>>>>>>>>>> I think it will be a problem for me to deploy 1.1.1 because
>>>>>>>>>> I would need to "upgrade" the whole cluster with 70TB of
>>>>>>>>>> data (backup everything, go offline, etc.).
>>>>>>>>>>
>>>>>>>>>> Is there a problem with using cdh4.2.0?
>>>>>>>>>> Should I send my email to the cdh list?
>>>>>>>>>
>>>>>>>>> That combo should be fine.
>>>>>>>>>
>>>>>>>>>>> You Full GC'ing around this time?
>>>>>>>>>>
>>>>>>>>>> The GC shows it took a long time. However it does not make
>>>>>>>>>> any sense for it to be that, since the same amount of data
>>>>>>>>>> was cleaned before and AFTER in just 0.01 secs!
>>>>>>>>>
>>>>>>>>> If JVM is full GC'ing, the application is stopped.
>>>>>>>>>
>>>>>>>>>>> [Times: user=0.08 sys=137.62, real=137.62 secs]
>>>>>>>>>>
>>>>>>>>>> Besides, the whole time was used by system. That is what is
>>>>>>>>>> bugging me.
>>>>>>>>>
>>>>>>>>> The below does not look like a full GC but that is a long
>>>>>>>>> pause in system time, enough to kill your zk session.
>>>>>>>>>
>>>>>>>>> You swapping?
>>>>>>>>>
>>>>>>>>> Hardware is good?
>>>>>>>>>
>>>>>>>>> St.Ack
>>>>>>>>>
>>>>>>>>>> ...
>>>>>>>>>>
>>>>>>>>>> 1044.081: [GC 1044.081: [ParNew: 58970K->402K(59008K), 0.0040990 secs] 275097K->216577K(1152704K), 0.0041820 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
>>>>>>>>>> 1087.319: [GC 1087.319: [ParNew: 52873K->6528K(59008K), 0.0055000 secs] 269048K->223592K(1152704K), 0.0055930 secs] [Times: user=0.04 sys=0.01, real=0.00 secs]
>>>>>>>>>> 1087.834: [GC 1087.834: [ParNew: 59008K->6527K(59008K), 137.6353620 secs] 276072K->235097K(1152704K), 137.6354700 secs] [Times: user=0.08 sys=137.62, real=137.62 secs]
>>>>>>>>>> 1226.638: [GC 1226.638: [ParNew: 59007K->1897K(59008K), 0.0079960 secs] 287577K->230937K(1152704K), 0.0080770 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
>>>>>>>>>> 1227.251: [GC 1227.251: [ParNew: 54377K->2379K(59008K), 0.0095650 secs] 283417K->231420K(1152704K), 0.0096340 secs] [Times: user=0.06 sys=0.00, real=0.01 secs]
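>>>>>>>>>>
>>>>>>>>>> (To hunt for more entries like the 137-second one above,
>>>>>>>>>> where sys dwarfs user, a gawk one-liner sketch over the
>>>>>>>>>> same logs could be:)
>>>>>>>>>>
>>>>>>>>>> gawk 'match($0, /user=([0-9.]+) sys=([0-9.]+)/, t) && t[2] > t[1] + 1' \
>>>>>>>>>>     /var/log/hbase/hbase-hbase-regionserver-*.out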
>>>>>>>>>>
>>>>>>>>>> I really appreciate you guys helping me to find out what is
>>>>>>>>>> wrong.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Pablo
>>>>>>>>>>
>>>>>>>>>> On 03/08/2013 02:11 PM, Stack wrote:
>>>>>>>>>>> What RAM says.
>>>>>>>>>>>
>>>>>>>>>>>> 2013-03-07 17:24:57,887 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 159348ms for sessionid 0x13d3c4bcba600a7, closing socket connection and attempting reconnect
>>>>>>>>>>>
>>>>>>>>>>> You Full GC'ing around this time?
>>>>>>>>>>>
>>>>>>>>>>> Put up your configs in a place where we can take a look?
>>>>>>>>>>>
>>>>>>>>>>> St.Ack
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Mar 8, 2013 at 8:32 AM, ramkrishna vasudevan <ramkrishna.s.vasudevan@gmail.com> wrote:
>>>>>>>>>>>> I think it is with your GC config. What is your heap
>>>>>>>>>>>> size? What is the data that you pump in and how much is
>>>>>>>>>>>> the block cache size?
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>> Ram
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Mar 8, 2013 at 9:31 PM, Ted Yu <[email protected]> wrote:
>>>>>>>>>>>>> 0.94 currently doesn't support hadoop 2.0
>>>>>>>>>>>>> Can you deploy hadoop 1.1.1 instead ?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Are you using 0.94.5 ?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Mar 8, 2013 at 7:44 AM, Pablo Musa <[email protected]> wrote:
>>>>>>>>>>>>>> Hey guys,
>>>>>>>>>>>>>> as I sent in an email a long time ago, the RSs in my
>>>>>>>>>>>>>> cluster did not get along and crashed 3 times a day. I
>>>>>>>>>>>>>> tried a lot of the options we discussed in the emails,
>>>>>>>>>>>>>> but none of them solved the problem. As I was using an
>>>>>>>>>>>>>> old version of hadoop, I thought that was the problem.
>>>>>>>>>>>>>> So, I upgraded from hadoop 0.20 - hbase 0.90 -
>>>>>>>>>>>>>> zookeeper 3.3.5 to hadoop 2.0.0 - hbase 0.94 -
>>>>>>>>>>>>>> zookeeper 3.4.5.
>>>>>>>>>>>>>> Unfortunately the RSs did not stop crashing, and worse!
>>>>>>>>>>>>>> Now they crash every hour, and sometimes when the RS
>>>>>>>>>>>>>> that holds the .ROOT. crashes, the whole cluster gets
>>>>>>>>>>>>>> stuck in transition and everything stops working.
>>>>>>>>>>>>>> In this case I need to clean the zookeeper znodes and
>>>>>>>>>>>>>> restart the master and the RSs.
>>>>>>>>>>>>>> To avoid this case I am running in production with only
>>>>>>>>>>>>>> ONE RS and a monitoring script that checks every minute
>>>>>>>>>>>>>> whether the RS is ok and, if not, restarts it.
>>>>>>>>>>>>>> * This case does not get the cluster stuck.
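>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (Pablo's actual script is not shown; a minimal sketch of
>>>>>>>>>>>>>> such a once-a-minute watchdog -- the info-port probe and
>>>>>>>>>>>>>> the CDH paths are assumptions -- could be:)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> #!/bin/bash
>>>>>>>>>>>>>> # cron: * * * * * /usr/local/bin/rs-watchdog.sh
>>>>>>>>>>>>>> # probe the RS web UI (default port 60030); restart on failure
>>>>>>>>>>>>>> if ! curl -sf http://localhost:60030/rs-status >/dev/null; then
>>>>>>>>>>>>>>     /usr/lib/hbase/bin/hbase-daemon.sh restart regionserver
>>>>>>>>>>>>>> fi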
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is driving me crazy, but I really can't find a
>>>>>>>>>>>>>> solution for the cluster. I tracked all logs from the
>>>>>>>>>>>>>> start time 16:49 on all the interesting nodes (zoo,
>>>>>>>>>>>>>> namenode, master, rs, dn2, dn9, dn10) and copied here
>>>>>>>>>>>>>> what I think is useful.
>>>>>>>>>>>>>> There are some strange errors in DATANODE2, such as an
>>>>>>>>>>>>>> error copying a block to itself.
>>>>>>>>>>>>>> The gc log points to a GC timeout. However it is very
>>>>>>>>>>>>>> weird that the RS spends so much time in GC while in
>>>>>>>>>>>>>> the other cases it takes 0.001 sec.
>>>>>>>>>>>>>> Besides, the time spent is in sys, which makes me think
>>>>>>>>>>>>>> the problem might be in another place.
>>>>>>>>>>>>>> I know that it is a bunch of logs, and that it is very
>>>>>>>>>>>>>> difficult to find the problem without much context. But
>>>>>>>>>>>>>> I REALLY need some help. If not the solution, at least
>>>>>>>>>>>>>> what I should read, where I should look, or which cases
>>>>>>>>>>>>>> I should monitor.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you very much,
>>>>>>>>>>>>>> Pablo Musa
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>> - Sreepathi
>>>
>>> --
>>> Best regards,
>>>
>>>    - Andy
>>>
>>> Problems worthy of attack prove their worth by hitting back. - Piet
>>> Hein (via Tom White)
