All,

We ran a load test, and after about 3 hours our application stopped. Checking the logs, I see this in the hbase-master log:
2012-05-20 08:08:17,251 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been OFFLINE for too long, reassigning -ROOT-,,0.70236052 to a random server
2012-05-20 08:08:17,252 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=.META.,,1.1028785192 state=OFFLINE, ts=1337497517243
2012-05-20 08:08:17,252 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=-ROOT-,,0.70236052 state=OFFLINE, ts=1337497517243
2012-05-20 08:10:10,309 INFO org.apache.zookeeper.server.NIOServerCnxn: Accepted socket connection from /0:0:0:0:0:0:0:1%0:62747
2012-05-20 08:10:10,315 INFO org.apache.zookeeper.server.NIOServerCnxn: Client attempting to establish new session at /0:0:0:0:0:0:0:1%0:62747
2012-05-20 08:10:10,316 INFO org.apache.zookeeper.server.NIOServerCnxn: Established session 0x137653a0e8e02fa with negotiated timeout 40000 for client /0:0:0:0:0:0:0:1%0:62747
2012-05-20 08:10:10,316 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x137653a0e8e02fa type:create cxid:0x1 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/hbase Error:KeeperErrorCode = NodeExists for /hbase
2012-05-20 08:10:10,329 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x137653a0e8e02fa type:create cxid:0x2 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/hbase/unassigned Error:KeeperErrorCode = NodeExists for /hbase/unassigned
2012-05-20 08:10:10,329 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x137653a0e8e02fa type:create cxid:0x3 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/hbase/rs Error:KeeperErrorCode = NodeExists for /hbase/rs
2012-05-20 08:10:10,330 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x137653a0e8e02fa type:create cxid:0x4 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/hbase/table Error:KeeperErrorCode = NodeExists for /hbase/table

Hadoop seems to be up and running. The last entries in the datanode log are:

12/05/20 06:15:25 INFO datanode.DataBlockScanner: Verification succeeded for blk_-3639294708473848144_3329
12/05/20 06:26:20 INFO datanode.DataBlockScanner: Verification succeeded for blk_2502932128500788221_3413
12/05/20 06:26:20 INFO datanode.DataBlockScanner: Verification succeeded for blk_3390059684225099859_3440
12/05/20 06:59:32 INFO datanode.DataNode: BlockReport of 157 blocks took 19 msec to generate and 3 msecs for RPC and NN processing
12/05/20 07:24:51 INFO datanode.DataBlockScanner: Verification succeeded for blk_8954400942867609419_3363
12/05/20 07:55:51 INFO datanode.DataBlockScanner: Verification succeeded for blk_-3650918785526360502_3387
12/05/20 07:59:33 INFO datanode.DataNode: BlockReport of 157 blocks took 20 msec to generate and 3 msecs for RPC and NN processing
12/05/20 08:07:25 INFO datanode.DataBlockScanner: Verification succeeded for blk_786514597978592338_3336

I tried using hbase-explorer to view the tables, but they all seem to be down.
