Our HBase master server shut down with the following messages. HBase is running in distributed mode on a single node. I checked that GC completed in a very short time around the time the WARN was logged. In addition, another system running on the same architecture does not output the following WARN message and works well. So I think this is not due to a long GC pause.
Do you have any idea about the problem?

2013-01-30 03:07:48,582 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 28970ms instead of 1000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
2013-01-30 03:07:48,583 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 36902ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
2013-01-30 03:07:48,585 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 39989ms for sessionid 0x13c84cebfce0000, closing socket connection and attempting reconnect
2013-01-30 03:07:48,586 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 39987ms for sessionid 0x13c84cebfce0001, closing socket connection and attempting reconnect
2013-01-30 03:07:52,779 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server VM_11/192.168.152.1:2181
2013-01-30 03:07:52,789 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to VM_11/192.168.152.1:2181, initiating session
2013-01-30 03:07:52,777 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server VM_11/192.168.152.1:2181
2013-01-30 03:07:52,793 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to VM_11/192.168.152.1:2181, initiating session
2013-01-30 03:07:52,794 INFO org.apache.zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x13c84cebfce0001 has expired, closing socket connection
2013-01-30 03:07:52,794 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: This client just lost it's session with ZooKeeper, trying to reconnect.
2013-01-30 03:07:52,794 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Trying to reconnect to zookeeper.
2013-01-30 03:07:52,795 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=VM_11:2181 sessionTimeout=180000 watcher=hconnection
2013-01-30 03:07:52,812 INFO org.apache.zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x13c84cebfce0000 has expired, closing socket connection
2013-01-30 03:07:52,813 FATAL org.apache.hadoop.hbase.master.HMaster: master:60000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000 master:60000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000 received expired from ZooKeeper, aborting
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:361)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:279)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:526)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
2013-01-30 03:07:52,813 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
2013-01-30 03:07:52,813 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2013-01-30 03:07:52,813 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server VM_11/192.168.152.1:2181
2013-01-30 03:07:52,814 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to VM_11/192.168.152.1:2181, initiating session
2013-01-30 03:07:52,815 ERROR org.apache.hadoop.hbase.master.HMaster: Region server serverName=VM_11,60020,1359437833300, load=(requests=0, regions=3, usedHeap=45, maxHeap=997) reported a fatal error:
ABORTING region server serverName=VM_11,60020,1359437833300, load=(requests=0, regions=3, usedHeap=45, maxHeap=997): regionserver:60020-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002 regionserver:60020-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002 received expired from ZooKeeper, aborting
Cause:
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:361)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:279)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:526)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
2013-01-30 03:07:52,820 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server VM_11/192.168.152.1:2181, sessionid = 0x13c84cebfce0005, negotiated timeout = 40000
2013-01-30 03:07:52,841 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Reconnected successfully. This disconnect could have been caused by a network partition or a long-running GC pause, either way it's recommended that you verify your environment.
2013-01-30 03:07:52,841 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2013-01-30 03:07:53,614 INFO org.apache.hadoop.hbase.master.LogCleaner: master-VM_11:60000.oldLogCleaner exiting
2013-01-30 03:07:54,251 INFO org.apache.hadoop.hbase.master.HMaster$2: VM_11:60000-BalancerChore exiting
2013-01-30 03:07:54,251 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service threads
2013-01-30 03:07:54,251 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60000
2013-01-30 03:07:54,252 INFO org.apache.hadoop.hbase.master.HMaster: Stopping infoServer
2013-01-30 03:07:54,325 INFO org.mortbay.log: Stopped [email protected]:60010
2013-01-30 03:07:54,326 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60000: exiting
2013-01-30 03:07:54,326 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server listener on 60000
2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60000: exiting
2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60000: exiting
2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60000: exiting
2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60000: exiting
2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 60000: exiting
2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60000: exiting
2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60000: exiting
2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60000: exiting
2013-01-30 03:07:54,327 INFO org.apache.hadoop.hbase.master.CatalogJanitor: VM_11:60000-CatalogJanitor exiting
2013-01-30 03:07:54,328 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60000: exiting
2013-01-30 03:07:54,328 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
2013-01-30 03:07:54,337 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000 Unable to get data of znode /hbase/master
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:577)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:554)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:648)
    at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:202)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:318)
2013-01-30 03:07:54,337 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000 Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:577)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:554)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:648)
    at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:202)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:318)
2013-01-30 03:07:54,337 ERROR org.apache.hadoop.hbase.master.ActiveMasterManager: master:60000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000 Error deleting our own master address node
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:577)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:554)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:648)
    at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:202)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:318)
2013-01-30 03:07:54,337 DEBUG org.apache.hadoop.hbase.catalog.CatalogTracker: Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker@4743bf3d
2013-01-30 03:07:54,337 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: The connection to hconnection-0x13c84cebfce0005 has been closed.
2013-01-30 03:07:54,338 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x13c84cebfce0005
2013-01-30 03:07:54,339 INFO org.apache.zookeeper.ZooKeeper: Session: 0x13c84cebfce0005 closed
2013-01-30 03:07:54,339 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: The connection to null has been closed.
2013-01-30 03:07:54,339 INFO org.apache.hadoop.hbase.master.HMaster: HMaster main thread exiting
2013-01-30 03:07:54,339 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2013-01-30 03:07:54,339 INFO org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor: VM_11:60000.timeoutMonitor exiting

--
View this message in context: http://apache-hbase.679495.n3.nabble.com/hbase-master-server-slept-tp4038192.html
Sent from the HBase User mailing list archive at Nabble.com.
