What ZooKeeper version are you using? Is the ensemble managed by HBase? Can you check the ZooKeeper log on 192.168.152.1? Use pastebin to show us the log if necessary.
Thanks

On Fri, Feb 8, 2013 at 12:55 AM, So Hibino <[email protected]> wrote:

> Our hbase-master-server was shut down with the following message.
> HBase is running in distributed mode on a single node.
> I checked that GC completed in a very short time around the time the WARN
> was output.
> In addition, another system running on the same architecture
> doesn't output the following WARN message and works well.
> So I think that this is not due to a long GC pause.
>
> Do you have any idea about the problem?
>
> 2013-01-30 03:07:48,582 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 28970ms instead of 1000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> 2013-01-30 03:07:48,583 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 36902ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> 2013-01-30 03:07:48,585 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 39989ms for sessionid 0x13c84cebfce0000, closing socket connection and attempting reconnect
> 2013-01-30 03:07:48,586 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 39987ms for sessionid 0x13c84cebfce0001, closing socket connection and attempting reconnect
> 2013-01-30 03:07:52,779 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server VM_11/192.168.152.1:2181
> 2013-01-30 03:07:52,789 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to VM_11/192.168.152.1:2181, initiating session
> 2013-01-30 03:07:52,777 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server VM_11/192.168.152.1:2181
> 2013-01-30 03:07:52,793 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to VM_11/192.168.152.1:2181, initiating session
> 2013-01-30 03:07:52,794 INFO org.apache.zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x13c84cebfce0001 has expired, closing socket connection
> 2013-01-30 03:07:52,794 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: This client just lost it's session with ZooKeeper, trying to reconnect.
> 2013-01-30 03:07:52,794 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Trying to reconnect to zookeeper.
> 2013-01-30 03:07:52,795 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=VM_11:2181 sessionTimeout=180000 watcher=hconnection
> 2013-01-30 03:07:52,812 INFO org.apache.zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x13c84cebfce0000 has expired, closing socket connection
> 2013-01-30 03:07:52,813 FATAL org.apache.hadoop.hbase.master.HMaster:
> master:60000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000
> master:60000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000
> received expired from ZooKeeper, aborting
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:361)
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:279)
>     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:526)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
> 2013-01-30 03:07:52,813 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2013-01-30 03:07:52,813 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
> 2013-01-30 03:07:52,813 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server VM_11/192.168.152.1:2181
> 2013-01-30 03:07:52,814 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to VM_11/192.168.152.1:2181, initiating session
> 2013-01-30 03:07:52,815 ERROR org.apache.hadoop.hbase.master.HMaster: Region server serverName=VM_11,60020,1359437833300, load=(requests=0, regions=3, usedHeap=45, maxHeap=997) reported a fatal error:
> ABORTING region server serverName=VM_11,60020,1359437833300, load=(requests=0, regions=3, usedHeap=45, maxHeap=997):
> regionserver:60020-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002
> regionserver:60020-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002-0x13c84cebfce0002
> received expired from ZooKeeper, aborting
> Cause:
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:361)
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:279)
>     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:526)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
>
> 2013-01-30 03:07:52,820 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server VM_11/192.168.152.1:2181, sessionid = 0x13c84cebfce0005, negotiated timeout = 40000
> 2013-01-30 03:07:52,841 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Reconnected successfully. This disconnect could have been caused by a network partition or a long-running GC pause, either way it's recommended that you verify your environment.
> 2013-01-30 03:07:52,841 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
> 2013-01-30 03:07:53,614 INFO org.apache.hadoop.hbase.master.LogCleaner: master-VM_11:60000.oldLogCleaner exiting
> 2013-01-30 03:07:54,251 INFO org.apache.hadoop.hbase.master.HMaster$2: VM_11:60000-BalancerChore exiting
> 2013-01-30 03:07:54,251 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service threads
> 2013-01-30 03:07:54,251 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60000
> 2013-01-30 03:07:54,252 INFO org.apache.hadoop.hbase.master.HMaster: Stopping infoServer
> 2013-01-30 03:07:54,325 INFO org.mortbay.log: Stopped [email protected]:60010
> 2013-01-30 03:07:54,326 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60000: exiting
> 2013-01-30 03:07:54,326 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server listener on 60000
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60000: exiting
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60000: exiting
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60000: exiting
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60000: exiting
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 60000: exiting
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60000: exiting
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60000: exiting
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60000: exiting
> 2013-01-30 03:07:54,327 INFO org.apache.hadoop.hbase.master.CatalogJanitor: VM_11:60000-CatalogJanitor exiting
> 2013-01-30 03:07:54,328 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60000: exiting
> 2013-01-30 03:07:54,328 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
> 2013-01-30 03:07:54,337 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil:
> master:60000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000
> Unable to get data of znode /hbase/master
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/master
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:577)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:554)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:648)
>     at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:202)
>     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:318)
> 2013-01-30 03:07:54,337 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher:
> master:60000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000
> Received unexpected KeeperException, re-throwing exception
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/master
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:577)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:554)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:648)
>     at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:202)
>     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:318)
> 2013-01-30 03:07:54,337 ERROR org.apache.hadoop.hbase.master.ActiveMasterManager:
> master:60000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000-0x13c84cebfce0000
> Error deleting our own master address node
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/master
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:577)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:554)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:648)
>     at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:202)
>     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:318)
> 2013-01-30 03:07:54,337 DEBUG org.apache.hadoop.hbase.catalog.CatalogTracker: Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker@4743bf3d
> 2013-01-30 03:07:54,337 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: The connection to hconnection-0x13c84cebfce0005 has been closed.
> 2013-01-30 03:07:54,338 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x13c84cebfce0005
> 2013-01-30 03:07:54,339 INFO org.apache.zookeeper.ZooKeeper: Session: 0x13c84cebfce0005 closed
> 2013-01-30 03:07:54,339 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: The connection to null has been closed.
> 2013-01-30 03:07:54,339 INFO org.apache.hadoop.hbase.master.HMaster: HMaster main thread exiting
> 2013-01-30 03:07:54,339 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
> 2013-01-30 03:07:54,339 INFO org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor: VM_11:60000.timeoutMonitor exiting
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/hbase-master-server-slept-tp4038192.html
> Sent from the HBase User mailing list archive at Nabble.com.

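When reading a log like the one quoted above, it can help to pull out the Sleeper overshoots and the negotiated ZooKeeper session timeout and look at them side by side. The following is only a minimal sketch: the function name `find_long_pauses` and the embedded sample text are illustrative assumptions, and the regular expressions match only the two message formats that appear in the quoted log.

```python
import re

# Patterns for the two message formats seen in the quoted log.
SLEPT = re.compile(r"We slept (\d+)ms instead of (\d+)ms")
NEGOTIATED = re.compile(r"negotiated timeout = (\d+)")


def find_long_pauses(log_text):
    """Return (negotiated_timeout_ms, [(slept_ms, expected_ms, overshoot_ms), ...]).

    Hypothetical helper: drops standalone email quote markers (">") and
    collapses whitespace so that entries wrapped across lines still match.
    """
    flat = " ".join(tok for tok in log_text.split() if tok != ">")
    match = NEGOTIATED.search(flat)
    timeout = int(match.group(1)) if match else None
    pauses = [(int(s), int(e), int(s) - int(e)) for s, e in SLEPT.findall(flat)]
    return timeout, pauses


# Sample lines copied from the quoted message, including the quote markers.
sample = """
> 2013-01-30 03:07:48,582 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
> 28970ms instead of 1000ms, this is likely due to a long garbage collecting
> 2013-01-30 03:07:48,583 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
> 36902ms instead of 10000ms, this is likely due to a long garbage collecting
> 2013-01-30 03:07:52,820 INFO org.apache.zookeeper.ClientCnxn: Session
> establishment complete on server VM_11/192.168.152.1:2181, sessionid =
> 0x13c84cebfce0005, negotiated timeout = 40000
"""

timeout, pauses = find_long_pauses(sample)
for slept, expected, overshoot in pauses:
    print(f"slept {slept}ms, expected {expected}ms, overshoot {overshoot}ms")
print(f"negotiated timeout: {timeout}ms")
```

In the quoted log the client asked for sessionTimeout=180000 but the server negotiated 40000, and both expired sessions had not heard from the server for just under 40000ms. ZooKeeper caps session timeouts at its maxSessionTimeout setting, which defaults to 20 times tickTime, so the server-side configuration may be worth checking along with the ZooKeeper log.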