The first exception is a lost connection to the ZooKeeper ensemble.  The
second is a failure to establish a connection to the regionserver at
'desktop,60020'.  Later in the log we refer to desktop by IP
(172.28.124.148).  Is your DNS set up so that forward and reverse
lookups give the same answer?  HBase 0.90.x is finicky in this regard
(to be fixed).
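
One quick way to test the DNS question is to resolve the hostname, reverse-resolve the resulting IP, and compare the two names. The snippet below is only a rough sketch using Python's standard socket module; the function name check_dns and the strict name comparison are my own assumptions, not anything HBase does internally.

```python
import socket


def check_dns(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, reverse-resolve that IP, and compare names.

    `forward` and `reverse` default to the real resolver calls; they are
    parameters only so the check can be exercised with fake resolvers.
    """
    ip = forward(hostname)
    # gethostbyaddr returns (primary_name, alias_list, ip_list)
    reverse_name = reverse(ip)[0]

    def normalize(name):
        # Ignore case and a trailing dot when comparing names.
        return name.rstrip(".").lower()

    return {
        "hostname": hostname,
        "ip": ip,
        "reverse_name": reverse_name,
        "consistent": normalize(hostname) == normalize(reverse_name),
    }


if __name__ == "__main__":
    # Checks the local machine's own name; pass e.g. "desktop" to check another host.
    try:
        print(check_dns(socket.gethostname()))
    except OSError as err:
        print("lookup failed:", err)
```

If "consistent" comes back False for the host running the regionserver (for example, the reverse lookup returns localhost or a fully qualified name while the forward lookup used a short name), that mismatch is worth fixing in DNS or /etc/hosts before looking further.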
St.Ack

On Fri, Mar 4, 2011 at 6:39 AM, Daniel Iancu <[email protected]> wrote:
> Hi
> I've updated our dev environment from HBase 0.90.0 (ASF+CDH3b3), which
> was very stable, to HBase 0.90.1 (CDH3B4), and since then the HMaster dies
> regularly. The issue seems to be related to the connection to ZooKeeper. Even
> if I use a standby HMaster, it also dies from the same cause:
>
>
> 2011-03-04 15:05:54,699 FATAL
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
> Unexpected exception handling nodeDeleted event
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss for /hbase/master
>    at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:809)
>    at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:232)
>    at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.nodeDeleted(ZooKeeperNodeTracker.java:165)
>    at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:261)
>    at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
>    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
> 2011-03-04 15:05:54,704 INFO org.apache.zookeeper.ZooKeeper: Session:
> 0x22e80dcc2350001 closed
> 2011-03-04 15:05:54,704 INFO org.apache.zookeeper.ClientCnxn: EventThread
> shut down
> 2011-03-04 15:05:54,718 INFO org.apache.zookeeper.ZooKeeper: Session:
> 0x22e80dcc2350000 closed
> 2011-03-04 15:05:54,718 INFO org.apache.zookeeper.ClientCnxn: EventThread
> shut down
> 2011-03-04 15:05:54,718 INFO org.apache.hadoop.hbase.master.HMaster: HMaster
> main thread exiting
>
>
> Just before this one there is another exception:
>
> 2011-03-04 15:07:00,611 FATAL org.apache.hadoop.hbase.master.HMaster: Failed
> assignment of regions to serverName=desktop,60020,1299242075991,
> load=(requests=0, regions=0, usedHeap=34, maxHeap=996)
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up
> proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to
> /172.28.124.148:60020 after attempts=1
>    at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:355)
>    at
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:954)
>    at
> org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManager.java:606)
>    at
> org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:560)
>    at
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:776)
>    at
> org.apache.hadoop.hbase.master.AssignmentManager$SingleServerBulkAssigner.run(AssignmentManager.java:1310)
>    at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>    at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>    at java.lang.Thread.run(Thread.java:619)
> Caused by: java.net.ConnectException: Connection refused
>    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>    at
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
>    at
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328)
>    at
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883)
>    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>    at $Proxy6.getProtocolVersion(Unknown Source)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
>    ... 8 more
> 2011-03-04 15:07:00,615 INFO org.apache.hadoop.hbase.master.HMaster:
> Aborting
>
> Any hints as to what could be wrong here?
>
> Thanks
> Daniel
>
