Thanks Stack, that was indeed the cause: the machines were not properly registered in DNS.
I've corrected that and now it is fine.
Regards
Daniel

On 03/04/2011 08:15 PM, Stack wrote:
The first exception is about lost connection to the zk ensemble.  The
second is a failure establishing connection to 'desktop,60020'.  Later
we refer to an IP for desktop.  Is your DNS set up so that forward and
reverse lookups give the same answer?  HBase 0.90.x is finicky in this
regard (to be fixed).
St.Ack
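
For anyone who wants to run the check described above, here is a minimal sketch in plain Java using only java.net.InetAddress. The class name DnsRoundTrip and its helper methods are made up for illustration and are not part of HBase. It assumes that "same answer" means the reverse-resolved name equals the name you started with (ignoring case and a trailing dot), so a short name that reverse-resolves to a fully qualified name is reported as a mismatch.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;

// Hypothetical helper, not part of HBase: checks whether a host name
// survives a forward lookup followed by a reverse lookup.
class DnsRoundTrip {

    // Lower-case the name and drop one trailing dot, if present.
    static String normalize(String name) {
        String n = name.trim().toLowerCase(Locale.ROOT);
        return n.endsWith(".") ? n.substring(0, n.length() - 1) : n;
    }

    // Assumption: two names "match" when they are equal after normalize().
    static boolean namesMatch(String a, String b) {
        return normalize(a).equals(normalize(b));
    }

    // Forward-resolve the host, then reverse-resolve the resulting address.
    static boolean roundTrips(String host) throws UnknownHostException {
        InetAddress forward = InetAddress.getByName(host);
        // Building the address from raw bytes carries no host name, so
        // getCanonicalHostName() has to ask the resolver for a reverse lookup.
        // If that lookup fails it returns the textual IP address instead.
        InetAddress byIp = InetAddress.getByAddress(forward.getAddress());
        String reverse = byIp.getCanonicalHostName();
        System.out.println(host + " -> " + forward.getHostAddress() + " -> " + reverse);
        return namesMatch(host, reverse);
    }

    public static void main(String[] args) {
        try {
            // Default to this machine's own name when no argument is given.
            String host = args.length > 0 ? args[0] : InetAddress.getLocalHost().getHostName();
            System.out.println(roundTrips(host) ? "consistent" : "MISMATCH");
        } catch (UnknownHostException e) {
            System.out.println("forward lookup failed: " + e.getMessage());
        }
    }
}
```

Run it on the master and on each region server, passing the host name that the other machines use for it (for example `java DnsRoundTrip desktop`). A MISMATCH or a failed forward lookup points at the kind of DNS registration problem discussed in this thread.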

On Fri, Mar 4, 2011 at 6:39 AM, Daniel Iancu <[email protected]> wrote:
Hi
I've updated our dev environment from HBase 0.90.0 (ASF+CDH3b3), which
was very stable, to HBase 0.90.1 (CDH3b4), and since then the HMaster dies
regularly. The issue seems to be related to the connection to ZooKeeper. Even
if I use a standby HMaster, that one also dies from the same cause:


2011-03-04 15:05:54,699 FATAL org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Unexpected exception handling nodeDeleted event
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:809)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:232)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.nodeDeleted(ZooKeeperNodeTracker.java:165)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:261)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
2011-03-04 15:05:54,704 INFO org.apache.zookeeper.ZooKeeper: Session: 0x22e80dcc2350001 closed
2011-03-04 15:05:54,704 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2011-03-04 15:05:54,718 INFO org.apache.zookeeper.ZooKeeper: Session: 0x22e80dcc2350000 closed
2011-03-04 15:05:54,718 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2011-03-04 15:05:54,718 INFO org.apache.hadoop.hbase.master.HMaster: HMaster main thread exiting


Just before this one there is another exception:

2011-03-04 15:07:00,611 FATAL org.apache.hadoop.hbase.master.HMaster: Failed assignment of regions to serverName=desktop,60020,1299242075991, load=(requests=0, regions=0, usedHeap=34, maxHeap=996)
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to /172.28.124.148:60020 after attempts=1
    at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:355)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:954)
    at org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManager.java:606)
    at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:560)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:776)
    at org.apache.hadoop.hbase.master.AssignmentManager$SingleServerBulkAssigner.run(AssignmentManager.java:1310)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328)
    at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
    at $Proxy6.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
    ... 8 more
2011-03-04 15:07:00,615 INFO org.apache.hadoop.hbase.master.HMaster: Aborting

Any hint as to what could be wrong there?

Thanks
Daniel


--
Daniel Iancu
Java Developer,Web Components Romania
1&1 Internet Development srl.
18 Mircea Eliade St
Sect 1, Bucharest
RO Bucharest, 012015
www.1and1.ro
Phone:+40-031-223-9081
Email:[email protected]
IM:[email protected]

