聪聪: HBase 0.96 is quite old. Please consider upgrading to the latest 0.98.14 or 1.1.2 release.
Cheers

On Tue, Sep 1, 2015 at 5:40 AM, Samir Ahmic <[email protected]> wrote:

> Hi,
> Based on your logs:
> 2015-09-01 15:35:58,047 INFO [JvmPauseMonitor] util.JvmPauseMonitor: Detected pause in JVM or host machine *(eg GC): pause of approximately 4954ms*
>
> you had a long-running GC, which caused a timeout in communication between the regionserver and ZooKeeper:
> 2015-09-01 15:36:04,970 INFO [main-EventThread] zookeeper.RegionServerTracker: *RegionServer ephemeral node deleted, processing expiration [l-hbase4.dba.cn1.qunar.com,60020,1440573682913]*
> and that caused the regionserver to shut down.
>
> Check your GC configuration in hbase-env.sh, and also check the load that you are generating on your cluster.
>
> Regards
> Samir
>
> On Tue, Sep 1, 2015 at 1:59 PM, 聪聪 <[email protected]> wrote:
>
> > Hi all,
> >
> > I am using HBase version hbase-0.96.0. This afternoon (2015-09-01 15:35) I ran into a problem: one of the regionservers shut down, and I don't know why. Can we get some help here?
> >
> > The regionserver log is as follows:
> > 2015-09-01 15:35:45,476 DEBUG [regionserver60020-smallCompactions-1440573854394] backup.HFileArchiver: Finished archiving from class org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile, file:hdfs://mycluster:8020/hbase/airfare/data/atp/atp_fare/c0eb67f08c2e3818f2e52812ec69b71c/i/ff73fb07c62044c082f6b4f92e0ed7ca, to hdfs://mycluster:8020/hbase/airfare/archive/data/atp/atp_fare/c0eb67f08c2e3818f2e52812ec69b71c/i/ff73fb07c62044c082f6b4f92e0ed7ca
> > 2015-09-01 15:35:45,476 INFO [regionserver60020-smallCompactions-1440573854394] regionserver.HStore: Completed compaction of 3 file(s) in i of atp:atp_fare,I,1440507763404.c0eb67f08c2e3818f2e52812ec69b71c. into fb746308d219490d81e5f8f1dd8b60f1(size=56.9 M), total size for store is 1.8 G. This selection was in queue for 0sec, and took 4sec to execute.
> > 2015-09-01 15:35:45,476 INFO [regionserver60020-smallCompactions-1440573854394] regionserver.CompactSplitThread: Completed compaction: Request = regionName=atp:atp_fare,I,1440507763404.c0eb67f08c2e3818f2e52812ec69b71c., storeName=i, fileCount=3, fileSize=56.9 M, priority=24, time=29115372534800314; duration=4sec
> > 2015-09-01 15:35:45,476 DEBUG [regionserver60020-smallCompactions-1440573854394] regionserver.CompactSplitThread: CompactSplitThread Status: compaction_queue=(0:0), split_queue=0, merge_queue=0
> > 2015-09-01 15:35:58,047 INFO [JvmPauseMonitor] util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 4954ms
> > GC pool 'G1 Young Generation' had collection(s): count=1 time=224ms
> > GC pool 'G1 Old Generation' had collection(s): count=1 time=5077ms
> > 2015-09-01 15:36:04,883 INFO [main] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-cdh5.2.0--1, built on 10/11/2014 20:49 GMT
> > 2015-09-01 15:36:04,883 INFO [main] zookeeper.ZooKeeper: Client environment:host.name=l-hbase4.dba.cn1.qunar.com
> > 2015-09-01 15:36:04,884 INFO [main] zookeeper.ZooKeeper: Client environment:java.version=1.7.0_45
> > 2015-09-01 15:36:04,884 INFO [main] zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
> > 2015-09-01 15:36:04,884 INFO [main] zookeeper.ZooKeeper: Client environment:java.home=/home/q/java/jdk1.7.0_45/jre
> >
> > The master log is as follows:
> > 015-09-01 15:32:40,918 DEBUG [master:l-namenode1:60000.oldLogCleaner] master.ReplicationLogCleaner: Didn't find this log in ZK, deleting: l-hbase2.dba.cn1.qunar.com%2C60020%2C1440559908245.1441089788639
> > 2015-09-01 15:32:40,920 DEBUG [master:l-namenode1:60000.oldLogCleaner] master.ReplicationLogCleaner: Didn't find this log in ZK, deleting: l-hbase2.dba.cn1.qunar.com%2C60020%2C1440559908245.1441089839980
> > 2015-09-01 15:32:40,963 DEBUG [master:l-namenode1:60000.oldLogCleaner] master.ReplicationLogCleaner: Didn't find this log in ZK, deleting: l-hbase2.dba.cn1.qunar.com%2C60020%2C1440559908245.1441089890476
> > 2015-09-01 15:32:40,966 DEBUG [master:l-namenode1:60000.oldLogCleaner] master.ReplicationLogCleaner: Didn't find this log in ZK, deleting: l-hbase2.dba.cn1.qunar.com%2C60020%2C1440559908245.1441089939050
> > 2015-09-01 15:35:40,697 DEBUG [master:l-namenode1:60000.oldLogCleaner] master.ReplicationLogCleaner: Didn't find this log in ZK, deleting: l-hbase5.dba.cn1.qunar.com%2C60020%2C1440574191618.1441088646120
> > 2015-09-01 15:36:04,970 INFO [main-EventThread] zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, processing expiration [l-hbase4.dba.cn1.qunar.com,60020,1440573682913]
> > 2015-09-01 15:36:04,973 DEBUG [main-EventThread] master.AssignmentManager: based on AM, current region=hbase:meta,,1.1588230740 is on server=l-hbase3.dba.cn1.qunar.com,60020,1440559933207 server being checked: l-hbase4.dba.cn1.qunar.com,60020,1440573682913
> > 2015-09-01 15:36:04,973 DEBUG [main-EventThread] master.ServerManager: Added=l-hbase4.dba.cn1.qunar.com,60020,1440573682913 to dead servers, submitted shutdown handler to be executed meta=false
> > 2015-09-01 15:36:04,976 DEBUG [main-EventThread] zookeeper.RegionServerTracker: RS node: /hbase/airfare/rs/l-hbase5.dba.cn1.qunar.com,60020,1440574191618 data: PBUF^H��^C
> > 2015-09-01 15:36:04,976 DEBUG [main-EventThread] zookeeper.RegionServerTracker: RS node: /hbase/airfare/rs/l-hbase3.dba.cn1.qunar.com,60020,1440559933207 data: PBUF^H��^C
> > 2015-09-01 15:36:04,976 DEBUG [main-EventThread] zookeeper.RegionServerTracker: RS node: /hbase/airfare/rs/l-hbase1.dba.cn1.qunar.com,60020,1440573706827 data: PBUF^H��^C
> > 2015-09-01 15:36:04,977 DEBUG [main-EventThread] zookeeper.RegionServerTracker: RS node: /hbase/airfare/rs/l-hbase2.dba.cn1.qunar.com,60020,1440559908245 data: PBUF^H��^C
> > 2015-09-01 15:36:05,045 INFO [MASTER_SERVER_OPERATIONS-l-namenode1:60000-2] handler.ServerShutdownHandler: Splitting logs for l-hbase4.dba.cn1.qunar.com,60020,1440573682913 before assignment.
> > 2015-09-01 15:36:05,047 DEBUG [MASTER_SERVER_OPERATIONS-l-namenode1:60000-2] master.MasterFileSystem: Renamed region directory: hdfs://mycluster:8020/hbase/airfare/WALs/l-hbase4.dba.cn1.qunar.com,60020,1440573682913-splitting
> > 2015-09-01 15:36:05,047 INFO [MASTER_SERVER_OPERATIONS-l-namenode1:60000-2] master.SplitLogManager: dead splitlog workers [l-hbase4.dba.cn1.qunar.com,60020,1440573682913]
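
To see whether pauses like the one Samir points out happen regularly, it may help to scan the regionserver log for every JvmPauseMonitor message. The following is only a rough sketch, not an HBase tool: the function name `find_pauses` and the `threshold_ms` parameter are made up for illustration, and it assumes each log record sits on a single line in the format shown in this thread (the mail client wrapped the lines quoted above).

```python
import re
from datetime import datetime

# Matches the JvmPauseMonitor message format seen in the regionserver log
# quoted in this thread. Assumes one log record per line.
PAUSE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(?P<ms>\d{3}) .*"
    r"JvmPauseMonitor: Detected pause in JVM or host machine \(eg GC\): "
    r"pause of approximately (?P<pause>\d+)ms"
)


def find_pauses(lines, threshold_ms=0):
    """Return (timestamp, pause_ms) for each reported pause >= threshold_ms.

    threshold_ms is a hypothetical cut-off chosen by the caller, for example
    some fraction of the ZooKeeper session timeout configured for the cluster.
    """
    found = []
    for line in lines:
        m = PAUSE_RE.search(line)
        if not m:
            continue
        pause_ms = int(m.group("pause"))
        if pause_ms >= threshold_ms:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
            # The log prints milliseconds after the comma.
            ts = ts.replace(microsecond=int(m.group("ms")) * 1000)
            found.append((ts, pause_ms))
    return found


# Two records taken from the regionserver log in this thread.
sample = [
    "2015-09-01 15:35:45,476 DEBUG [regionserver60020-smallCompactions-1440573854394] "
    "regionserver.CompactSplitThread: CompactSplitThread Status: "
    "compaction_queue=(0:0), split_queue=0, merge_queue=0",
    "2015-09-01 15:35:58,047 INFO [JvmPauseMonitor] util.JvmPauseMonitor: "
    "Detected pause in JVM or host machine (eg GC): pause of approximately 4954ms",
]

for ts, pause_ms in find_pauses(sample, threshold_ms=1000):
    print(ts.isoformat(), pause_ms)
```

In practice you would pass the lines of the real log file instead of `sample`. Frequent multi-second pauses would support the GC explanation, while a single isolated pause suggests also looking at cluster load and the configured session timeout.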
