ch huang,

Which version of HBase are you using? Please check the following things:
- the ZooKeeper session timeout
- the ZooKeeper tickTime
- hbase.zookeeper.property.maxClientCnxns (default 35)
- ulimit: increase the number of open files allowed (32k or more)
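For reference, a minimal sketch of where those settings live, assuming you set them in hbase-site.xml; the values below are only illustrative, not recommendations for your cluster:

  <!-- hbase-site.xml -->
  <property>
    <name>zookeeper.session.timeout</name>
    <value>120000</value>  <!-- ms; ZooKeeper caps sessions between 2x and 20x its tickTime -->
  </property>
  <property>
    <name>hbase.zookeeper.property.tickTime</name>
    <value>6000</value>    <!-- ms; takes effect only when HBase manages the ZK quorum -->
  </property>
  <property>
    <name>hbase.zookeeper.property.maxClientCnxns</name>
    <value>300</value>     <!-- max concurrent ZK connections from a single client IP -->
  </property>

For the open-files limit, raise nofile for the hbase and hdfs users on every node, e.g. in /etc/security/limits.conf (log in again afterwards and verify with ulimit -n):

  hbase  -  nofile  32768
  hdfs   -  nofile  32768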
2014-03-04 2:22 GMT+01:00 ch huang <[email protected]>:

> Hi, mailing list: this morning I checked my HBase cluster logs and found all the region servers down. I don't know why, so I hope an expert can give me a clue. Here is the log from the node where the first death happened:
>
> 2014-03-03 17:16:11,413 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=16.78 MB, free=1.98 GB, max=2.00 GB, blocks=0, accesses=82645, hits=4, hitRatio=0.00%, , cachingAccesses=5, cachingHits=0, cachingHitsRatio=0, evictions=0, evicted=5, evictedPerRun=Infinity
> 2014-03-03 17:20:30,093 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server listener on 60020: readAndProcess threw exception java.io.IOException: Connection reset by peer. Count of bytes read: 0
> java.io.IOException: Connection reset by peer
>     at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>     at sun.nio.ch.IOUtil.read(IOUtil.java:197)
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelRead(HBaseServer.java:1798)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1181)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:750)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:541)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:516)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> 2014-03-03 17:21:11,413 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=17.28 MB, free=1.98 GB, max=2.00 GB, blocks=4, accesses=88870, hits=3112, hitRatio=3.50%, , cachingAccesses=3117, cachingHits=3108, cachingHitsRatio=99.71%, , evictions=0, evicted=5, evictedPerRun=Infinity
> 2014-03-03 17:21:45,112 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block BP-1043055049-192.168.11.11-1382442676609:blk_-716939259337565008_4210841
> java.io.EOFException: Premature EOF: no length prefix available
>     at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:171)
>     at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:114)
>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:695)
> 2014-03-03 17:21:45,116 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block BP-1043055049-192.168.11.11-1382442676609:blk_-716939259337565008_4210841 in pipeline 192.168.11.14:50010, 192.168.11.10:50010, 192.168.11.15:50010: bad datanode 192.168.11.14:50010
> 2014-03-03 17:24:58,114 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":36837,"call":"next(-2524485469510465096, 100), rpc version=1, client version=29, methodsFingerPrint=-1368823753","client":"192.168.11.174:39642","starttimems":1393838661274,"queuetimems":0,"class":"HRegionServer","responsesize":6,"method":"next"}
> 2014-03-03 17:24:58,117 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":36880,"call":"next(6510031569997476480, 100), rpc version=1, client version=29, methodsFingerPrint=-1368823753","client":"192.168.11.174:39642","starttimems":1393838661234,"queuetimems":1,"class":"HRegionServer","responsesize":6,"method":"next"}
> 2014-03-03 17:24:58,117 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":36880,"call":"next(-8080468273710364924, 100), rpc version=1, client version=29, methodsFingerPrint=-1368823753","client":"192.168.11.174:39642","starttimems":1393838661234,"queuetimems":1,"class":"HRegionServer","responsesize":6,"method":"next"}
> 2014-03-03 17:24:58,118 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":36882,"call":"next(-1838307716001367158, 100), rpc version=1, client version=29, methodsFingerPrint=-1368823753","client":"192.168.11.174:39642","starttimems":1393838661234,"queuetimems":1,"class":"HRegionServer","responsesize":6,"method":"next"}
> 2014-03-03 17:24:58,119 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 38421ms for sessionid 0x441fb1d01a1759, closing socket connection and attempting reconnect
> 2014-03-03 17:24:58,119 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 43040ms for sessionid 0x1441fb1d0171783, closing socket connection and attempting reconnect
> 2014-03-03 17:24:58,119 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block BP-1043055049-192.168.11.11-1382442676609:blk_-716939259337565008_4210898
> java.io.EOFException: Premature EOF: no length prefix available
>     at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:171)
>     at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:114)
>     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:695)
> 2014-03-03 17:24:58,121 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":36885,"call":"next(-8505949350781203523, 100), rpc version=1, client version=29, methodsFingerPrint=-1368823753","client":"192.168.11.174:39642","starttimems":1393838661234,"queuetimems":1,"class":"HRegionServer","responsesize":6,"method":"next"}
> 2014-03-03 17:24:58,121 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block BP-1043055049-192.168.11.11-1382442676609:blk_-716939259337565008_4210898 in pipeline 192.168.11.10:50010, 192.168.11.15:50010: bad datanode 192.168.11.10:50010
> 2014-03-03 17:24:58,123 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 36938ms instead of 3000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> 2014-03-03 17:24:58,123 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":36883,"call":"next(-8147092390919583636, 100), rpc version=1, client version=29, methodsFingerPrint=-1368823753","client":"192.168.11.174:39642","starttimems":1393838661234,"queuetimems":0,"class":"HRegionServer","responsesize":6,"method":"next"}
> 2014-03-03 17:24:58,164 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server ch14,60020,1393466171284: Unhandled exception: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing ch14,60020,1393466171284 as dead server
>     at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:254)
>     at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:172)
>     at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:1010)
>     at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)
> org.apache.hadoop.hbase.YouAreDeadException: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing ch14,60020,1393466171284 as dead server
>     at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:254)
>     at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:172)
>     at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:1010)
>     at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
>     at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:913)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:770)
>     at java.lang.Thread.run(Thread.java:744)
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.YouAreDeadException): org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing ch14,60020,1393466171284 as dead server
>     at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:254)
>     at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:172)
>     at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:1010)
>     at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)
>     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1004)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
>     at com.sun.proxy.$Proxy10.regionServerReport(Unknown Source)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:910)
>     ... 2 more
> 2014-03-03 17:24:58,165 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.AggregateImplementation]
> 2014-03-03 17:24:58,167 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: requestsPerSecond=0, numberOfOnlineRegions=6, numberOfStores=10, numberOfStorefiles=12, storefileIndexSizeMB=0, rootIndexSizeKB=13, totalStaticIndexSizeKB=9510, totalStaticBloomSizeKB=19592, memstoreSizeMB=236, mbInMemoryWithoutWAL=0, numberOfPutsWithoutWAL=0, readRequestsCount=1019709, writeRequestsCount=6186, compactionQueueSize=0, flushQueueSize=0, usedHeapMB=655, maxHeapMB=10227, blockCacheSizeMB=21.16, blockCacheFreeMB=2024.29, blockCacheCount=38, blockCacheHitCount=43956, blockCacheMissCount=126666, blockCacheEvictedCount=5, blockCacheHitRatio=25%, blockCacheHitCachingRatio=99%, hdfsBlocksLocalityIndex=77, slowHLogAppendCount=0, fsReadLatencyHistogramMean=1057721.77, fsReadLatencyHistogramCount=78137.00, fsReadLatencyHistogramMedian=560103.50, fsReadLatencyHistogram75th=635759.00, fsReadLatencyHistogram95th=922971.60, fsReadLatencyHistogram99th=21089466.09, fsReadLatencyHistogram999th=46033801.03, fsPreadLatencyHistogramMean=2579298.00, fsPreadLatencyHistogramCount=1.00, fsPreadLatencyHistogramMedian=2579298.00, fsPreadLatencyHistogram75th=2579298.00, fsPreadLatencyHistogram95th=2579298.00, fsPreadLatencyHistogram99th=2579298.00, fsPreadLatencyHistogram999th=2579298.00, fsWriteLatencyHistogramMean=135732.90, fsWriteLatencyHistogramCount=60564.00, fsWriteLatencyHistogramMedian=114990.50, fsWriteLatencyHistogram75th=120632.50, fsWriteLatencyHistogram95th=129889.75, fsWriteLatencyHistogram99th=138500.74, fsWriteLatencyHistogram999th=12146321.04
> 2014-03-03 17:24:58,168 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Unhandled exception: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing ch14,60020,1393466171284 as dead server
>     at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:254)
>     at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:172)
>     at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:1010)
>     at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)
> 2014-03-03 17:24:58,168 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60020
> 2014-03-03 17:24:58,171 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server ch14,60020,1393466171284 not running, aborting
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:3395)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.close(HRegionServer.java:2558)
>     at sun.reflect.GeneratedMethodAccessor124.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)
> 2014-03-03 17:24:58,193 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 72 on 60020: exiting
> 2014-03-03 17:24:58,193 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 97 on 60020: exiting
> 2014-03-03 17:24:58,207 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 191 on 60020: exiting
> 2014-03-03 17:24:58,193 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 4 on 60020: exiting
> 2014-03-03 17:24:58,207 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 162 on 60020: exiting
> 2014-03-03 17:24:58,193 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 1 on 60020: exiting
> 2014-03-03 17:24:58,193 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 71 on 60020: exiting
> 2014-03-03 17:24:58,207 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 164 on 60020: exiting
> 2014-03-03 17:24:58,193 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 96 on 60020: exiting
> 2014-03-03 17:24:58,193 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 69 on 60020: exiting
> 2014-03-03 17:24:58,207 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 168 on 60020: exiting
> 2014-03-03 17:24:58,207 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 170 on 60020: exiting
> 2014-03-03 17:24:58,193 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 95 on 60020: exiting
> 2014-03-03 17:24:58,193 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 68 on 60020: exiting
> 2014-03-03 17:24:58,207 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 215 on 60020: exiting
> 2014-03-03 17:24:58,193 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 94 on 60020: exiting
> 2014-03-03 17:24:58,207 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 175 on 60020: exiting
> 2014-03-03 17:24:58,193 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 66 on 60020: exiting
> 2014-03-03 17:24:58,192 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 93 on 60020: exiting
>
> And here is the log I found on the HBase master node at the same time:
>
> 2014-03-03 17:24:56,001 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs for ch14,60020,1393466171284
> 2014-03-03 17:24:56,032 DEBUG org.apache.hadoop.hbase.master.MasterFileSystem: Renamed region directory: hdfs://product/hbase/.logs/ch14,60020,1393466171284-splitting
> 2014-03-03 17:24:56,032 INFO org.apache.hadoop.hbase.master.SplitLogManager: dead splitlog workers [ch14,60020,1393466171284]
> 2014-03-03 17:24:56,033 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: Scheduling batch of logs to split
> 2014-03-03 17:24:56,033 INFO org.apache.hadoop.hbase.master.SplitLogManager: started splitting logs in [hdfs://product/hbase/.logs/ch14,60020,1393466171284-splitting]
> 2014-03-03 17:24:56,051 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up splitlog task at znode /hbase/splitlog/hdfs%3A%2F%2Fproduct%2Fhbase%2F.logs%2Fch14%2C60020%2C1393466171284-splitting%2Fch14%252C60020%252C1393466171284.1393838154291
> 2014-03-03 17:24:56,052 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up splitlog task at znode /hbase/splitlog/hdfs%3A%2F%2Fproduct%2Fhbase%2F.logs%2Fch14%2C60020%2C1393466171284-splitting%2Fch14%252C60020%252C1393466171284.1393838170672
> 2014-03-03 17:24:56,052 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fproduct%2Fhbase%2F.logs%2Fch14%2C60020%2C1393466171284-splitting%2Fch14%252C60020%252C1393466171284.1393838154291 ver = 0
> 2014-03-03 17:24:56,054 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fproduct%2Fhbase%2F.logs%2Fch14%2C60020%2C1393466171284-splitting%2Fch14%252C60020%252C1393466171284.1393838170672 ver = 0
> 2014-03-03 17:24:56,062 INFO org.apache.hadoop.hbase.master.SplitLogManager: task /hbase/splitlog/hdfs%3A%2F%2Fproduct%2Fhbase%2F.logs%2Fch14%2C60020%2C1393466171284-splitting%2Fch14%252C60020%252C1393466171284.1393838170672 acquired by ch15,60020,1393466103201
> 2014-03-03 17:24:56,088 INFO org.apache.hadoop.hbase.master.SplitLogManager: task /hbase/splitlog/hdfs%3A%2F%2Fproduct%2Fhbase%2F.logs%2Fch14%2C60020%2C1393466171284-splitting%2Fch14%252C60020%252C1393466171284.1393838154291 acquired by ch12,60020,1393497841084
> 2014-03-03 17:24:56,725 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 2 unassigned = 0
> 2014-03-03 17:24:57,725 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 2 unassigned = 0
> 2014-03-03 17:24:58,156 DEBUG org.apache.hadoop.hbase.master.ServerManager: Server REPORT rejected; currently processing ch14,60020,1393466171284 as dead server
> 2014-03-03 17:24:58,169 ERROR org.apache.hadoop.hbase.master.HMaster: Region server ^@^@ch14,60020,1393466171284 reported a fatal error:
> ABORTING region server ch14,60020,1393466171284: Unhandled exception: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing ch14,60020,1393466171284 as dead server
>     at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:254)
>     at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:172)
>     at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:1010)
>     at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)
> Cause:
> org.apache.hadoop.hbase.YouAreDeadException: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing ch14,60020,1393466171284 as dead server
>     at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:254)
>     at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:172)
>     at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:1010)
>     at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
>     at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:913)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:770)
>     at java.lang.Thread.run(Thread.java:744)
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.YouAreDeadException): org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing ch14,60020,1393466171284 as dead server
>     at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:254)
>     at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:172)
>     at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:1010)
>     at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)
>     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1004)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
>     at com.sun.proxy.$Proxy10.regionServerReport(Unknown Source)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:910)
>     ... 2 more
> 2014-03-03 17:24:58,439 ERROR org.apache.hadoop.hbase.master.HMaster: Region server ^@^@ch14,60020,1393466171284 reported a fatal error:
> ABORTING region server ch14,60020,1393466171284: regionserver:60020-0x1441fb1d0171783 regionserver:60020-0x1441fb1d0171783 received expired from ZooKeeper, aborting
> Cause:
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:363)
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:282)
>     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> 2014-03-03 17:24:58,726 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 2 unassigned = 0
> 2014-03-03 17:24:59,726 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 2 unassigned = 0
> 2014-03-03 17:25:00,726 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 2 unassigned = 0
> 2014-03-03 17:25:01,726 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 2 unassigned = 0
> 2014-03-03 17:25:02,726 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 2 unassigned = 0

Regards,
--
Marcos Ortiz Valmaseda
http://about.me/marcosortiz
