In the first case there is clearly a pause of about 13 seconds (13336ms), and in the second case the log shows a 60-second gap (60003ms) during which the master's ZooKeeper client wasn't able to talk to the ZooKeeper server. As far as I can tell, something weird is going on in your environment (network issues, maybe?).
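
If you want to pull these timeouts out of a larger master log rather than eyeballing them, a rough sketch along these lines may help. It assumes the log uses the same "have not heard from server in Nms for sessionid ..." wording as the lines you pasted; the names `find_session_timeouts` and `SAMPLE` are only illustrative and are not part of HBase or ZooKeeper.

```python
import re

# Matches the ZooKeeper client timeout message quoted in this thread.
# Assumption: the message wording is the same throughout the log being scanned.
TIMEOUT_RE = re.compile(
    r"have not heard from server in (\d+)ms for sessionid (0x[0-9a-fA-F]+)"
)


def find_session_timeouts(log_text):
    """Return a list of (silence_ms, session_id) for each timeout message."""
    return [(int(ms), sid) for ms, sid in TIMEOUT_RE.findall(log_text)]


# The two timeout lines from the quoted logs, unwrapped onto single lines.
SAMPLE = (
    "2011-03-21 13:26:53,035 INFO org.apache.zookeeper.ClientCnxn: Client session "
    "timed out, have not heard from server in 13336ms for sessionid "
    "0x22e8e6ee15f0046, closing socket connection and attempting reconnect\n"
    "2011-03-21 14:44:27,586 INFO org.apache.zookeeper.ClientCnxn: Client session "
    "timed out, have not heard from server in 60003ms for sessionid 0x0, closing "
    "socket connection and attempting reconnect\n"
)

if __name__ == "__main__":
    for ms, sid in find_session_timeouts(SAMPLE):
        print(f"session {sid}: no response for {ms / 1000:.1f}s")
```

Running the same function over the ZooKeeper server logs' time window (or over each region server's log) would show whether the silences line up across machines, which is usually a hint that the network or the ZooKeeper quorum itself is the problem.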

J-D

On Thu, Apr 14, 2011 at 12:51 AM, bijieshan <[email protected]> wrote:
> Hi,
> I found this problem when the HBase cluster was running,here the logs
> information:
> ------------------------------------------------------------------------
> 2011-03-21 13:26:39,697 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server t1/157.5.111.11:2181
> 2011-03-21 13:26:39,698 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to t1/157.5.111.11:2181, initiating session
> 2011-03-21 13:26:53,035 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 13336ms for sessionid 0x22e8e6ee15f0046, closing socket connection and attempting reconnect
> 2011-03-21 13:26:53,135 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x22e8e6ee15f0046 Unable to get data of znode /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:932)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>     at org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:739)
>     at org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
> 2011-03-21 13:26:53,137 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x22e8e6ee15f0046 Received unexpected KeeperException, re-throwing exception
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:932)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>     at org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:739)
>     at org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
> 2011-03-21 13:26:53,138 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception reading unassigned node data
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:932)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>     at org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:739)
>     at org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
> 2011-03-21 13:26:53,138 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> ------------------------------------------------------------------------
> When I restart the cluster,the problem is still exist(Due to the unnormally
> Zookeeper process):
> ------------------------------------------------------------------------
> 11-03-21 14:43:27,565 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=t2:2181,t1:2181,t0:2181 sessionTimeout=180000 watcher=master:60000
> 2011-03-21 14:43:27,573 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server t1/157.5.111.11:2181
> 2011-03-21 14:43:27,582 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to t1/157.5.111.11:2181, initiating session
> 2011-03-21 14:44:27,586 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 60003ms for sessionid 0x0, closing socket connection and attempting reconnect
> 2011-03-21 14:44:27,699 ERROR org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
> java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
>     at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1071)
>     at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142)
>     at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:102)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
>     at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1085)
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>     at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:648)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:902)
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:133)
>     at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:219)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>     at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1066)
>     ... 5 more
> ------------------------------------------------------------------------
> This problem is most similar to the phenomenon described in the issue of:
> https://issues.apache.org/jira/browse/HBASE-3062
> And the bug has been fixed in the version of HBase 0.90.1.
> Please help to analysis the problem.Thank you.
> Expecting to the response.
>
> Regards,
> Jieshan
