I started an HBase cluster of 8 nodes, and two regionservers died before I
had even started any Map job writing data into it. The log contains several
interesting exceptions, and I would really appreciate any help identifying
the culprit and ways to fix it. BTW, I restarted these regionservers
manually and they came up fine.
Thanks in advance.
2010-07-21 13:08:24,338 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x129f24e134a0005 to sun.nio.ch.selectionkeyi...@356f144c
java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
        at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:701)
2010-07-21 13:08:33,915 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Attempt=10
org.apache.hadoop.hbase.Leases$LeaseStillHeldException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:94)
        at org.apache.hadoop.hbase.RemoteExceptionHandler.checkThrowable(RemoteExceptionHandler.java:48)
        at org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException(RemoteExceptionHandler.java:66)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:549)
        at java.lang.Thread.run(Thread.java:619)
2010-07-21 13:12:38,159 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to master for 244244 milliseconds - retrying
2010-07-21 13:12:38,159 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event, state: Disconnected, type: None, path: null
2010-07-21 13:12:38,196 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGIONSERVER_STOP
2010-07-21 13:12:38,304 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase/.logs/grv-hadoopc06.local,60020,1279670492974/hlog.dat.1279674093440 File does not exist. Holder DFSClient_-1256503141 does not have any open files.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1343)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1334)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1262)