In the first case there is clearly a pause of about 13 seconds, and in the
second case the log shows a 60-second stretch during which the master's
ZooKeeper client wasn't able to talk to the ZooKeeper server. As far
as I can tell there's something weird going on in your environment
(network issues maybe?).
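
For what it's worth, the pauses can be read straight off the log timestamps: take the "Socket connection established" line and the "Client session timed out" line that follows it, and subtract. Here is a minimal Python sketch of that arithmetic. The helper names (parse_ts, pause_seconds) are made up for illustration, and only the four timestamps are copied from the quoted logs.

```python
from datetime import datetime

# log4j-style timestamp prefix used in the quoted logs: "YYYY-MM-DD HH:MM:SS,mmm"
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_ts(text):
    """Parse a log4j-style timestamp such as '2011-03-21 13:26:39,698'."""
    return datetime.strptime(text, TS_FORMAT)


def pause_seconds(established, timed_out):
    """Seconds between 'connection established' and 'session timed out'."""
    return (parse_ts(timed_out) - parse_ts(established)).total_seconds()


# First case: the running master loses its session.
first = pause_seconds("2011-03-21 13:26:39,698", "2011-03-21 13:26:53,035")

# Second case: the master fails to start after the restart.
second = pause_seconds("2011-03-21 14:43:27,582", "2011-03-21 14:44:27,586")

print(f"first case:  {first:.3f} s")
print(f"second case: {second:.3f} s")
```

Both values line up with the "have not heard from server in ...ms" figures that the ZooKeeper client reports (13336ms and 60003ms), so the silence is on the connection to t1 rather than an artifact of log buffering.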

J-D

On Thu, Apr 14, 2011 at 12:51 AM, bijieshan <[email protected]> wrote:
> Hi,
>   I found this problem while the HBase cluster was running. Here is the log 
> information:
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 2011-03-21 13:26:39,697 INFO org.apache.zookeeper.ClientCnxn: Opening socket 
> connection to server t1/157.5.111.11:2181
> 2011-03-21 13:26:39,698 INFO org.apache.zookeeper.ClientCnxn: Socket 
> connection established to t1/157.5.111.11:2181, initiating session
> 2011-03-21 13:26:53,035 INFO org.apache.zookeeper.ClientCnxn: Client session 
> timed out, have not heard from server in 13336ms for sessionid 
> 0x22e8e6ee15f0046, closing socket connection and attempting reconnect
> 2011-03-21 13:26:53,135 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: 
> master:60000-0x22e8e6ee15f0046 Unable to get data of znode 
> /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:932)
>         at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>         at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:739)
>         at 
> org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
>         at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
>         at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
> 2011-03-21 13:26:53,137 ERROR 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: 
> master:60000-0x22e8e6ee15f0046 Received unexpected KeeperException, 
> re-throwing exception
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:932)
>         at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>         at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:739)
>         at 
> org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
>         at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
>         at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
> 2011-03-21 13:26:53,138 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unexpected ZK exception reading unassigned node data
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/unassigned/59ba25120921011b7d9ed4025d30c105
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:932)
>         at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>         at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.getData(ZKAssign.java:739)
>         at 
> org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
>         at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
>         at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:501)
> 2011-03-21 13:26:53,138 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> When I restarted the cluster, the problem still existed (due to the abnormal 
> ZooKeeper process):
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 11-03-21 14:43:27,565 INFO org.apache.zookeeper.ZooKeeper: Initiating client 
> connection, connectString=t2:2181,t1:2181,t0:2181 sessionTimeout=180000 
> watcher=master:60000
> 2011-03-21 14:43:27,573 INFO org.apache.zookeeper.ClientCnxn: Opening socket 
> connection to server t1/157.5.111.11:2181
> 2011-03-21 14:43:27,582 INFO org.apache.zookeeper.ClientCnxn: Socket 
> connection established to t1/157.5.111.11:2181, initiating session
> 2011-03-21 14:44:27,586 INFO org.apache.zookeeper.ClientCnxn: Client session 
> timed out, have not heard from server in 60003ms for sessionid 0x0, closing 
> socket connection and attempting reconnect
> 2011-03-21 14:44:27,699 ERROR 
> org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
> java.lang.RuntimeException: Failed construction of Master: class 
> org.apache.hadoop.hbase.master.HMaster
>         at 
> org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1071)
>         at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142)
>         at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:102)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
>         at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1085)
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: 
> KeeperErrorCode = ConnectionLoss for /hbase
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:648)
>         at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:902)
>         at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:133)
>         at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:219)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
>         at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>         at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>         at 
> org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1066)
>         ... 5 more
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> This problem looks most similar to the one described in this issue:
> https://issues.apache.org/jira/browse/HBASE-3062
> That bug was fixed in HBase 0.90.1.
> Please help analyze the problem. Thank you.
> I look forward to your response.
>
> Regards,
> Jieshan
>
>
