Hi
I've updated our dev environment from HBase 0.90.0 (ASF+CDH3b3), which was very stable, to HBase 0.90.1 (CDH3b4), and since then the HMaster dies regularly. The issue seems to be related to the connection to ZooKeeper. Even if I use a standby HMaster, that one also dies from the same cause:


2011-03-04 15:05:54,699 FATAL org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Unexpected exception handling nodeDeleted event
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:809)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:232)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.nodeDeleted(ZooKeeperNodeTracker.java:165)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:261)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
2011-03-04 15:05:54,704 INFO org.apache.zookeeper.ZooKeeper: Session: 0x22e80dcc2350001 closed
2011-03-04 15:05:54,704 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2011-03-04 15:05:54,718 INFO org.apache.zookeeper.ZooKeeper: Session: 0x22e80dcc2350000 closed
2011-03-04 15:05:54,718 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2011-03-04 15:05:54,718 INFO org.apache.hadoop.hbase.master.HMaster: HMaster main thread exiting


Just before this one there is another exception:

2011-03-04 15:07:00,611 FATAL org.apache.hadoop.hbase.master.HMaster: Failed assignment of regions to serverName=desktop,60020,1299242075991, load=(requests=0, regions=0, usedHeap=34, maxHeap=996)
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to /172.28.124.148:60020 after attempts=1
    at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:355)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:954)
    at org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManager.java:606)
    at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:560)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:776)
    at org.apache.hadoop.hbase.master.AssignmentManager$SingleServerBulkAssigner.run(AssignmentManager.java:1310)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328)
    at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
    at $Proxy6.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
    ... 8 more
2011-03-04 15:07:00,615 INFO org.apache.hadoop.hbase.master.HMaster: Aborting

Any hints as to what could be wrong here?

Thanks
Daniel
