Thanks, I will try upgrading ZooKeeper and the UAT cluster to see if it changes anything.

On 20 May 2012, at 17:12, Ben Cuthbert wrote:

> So hbase and hadoop are running fine, but we wanted to test our application
> performance. So we ran some test cases for about 7 hours sending in events
> every 200ms to generate some load.
> After the 7 hours the application server could not connect to zookeeper, and
> when I checked the logs this is what I saw. So the application functions, just
> not when we ran the test.
>
> Config is
> hadoop: 0.20.203.0
> hbase: 0.90.3
>
> So I am just trying to upgrade to
>
> hadoop: 1.0.3
> hbase: 0.92.1
>
> Then going to run the same test again.
>
>
> On 20 May 2012, at 16:56, Ben Cuthbert wrote:
>
>> I will try again as I did not run that. I just saw this error when trying to
>> use hbase-explorer to connect.
>>
>>
>> On 20 May 2012, at 16:02, Michael Segel wrote:
>>
>>> What did you see when you ran the HBase shell's status?
>>> Did you run status with higher details?
>>> (see status help)
>>>
>>>
>>> On May 20, 2012, at 2:12 AM, Ben Cuthbert wrote:
>>>
>>>> All
>>>>
>>>> We ran a load test and after about 3 hours our application stopped.
>>>> Checking the logs, I see this in the hbase-master log
>>>>
>>>> 2012-05-20 08:08:17,251 INFO
>>>> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OFFLINE
>>>> for too long, reassigning -ROOT-,,0.70236052 to a random server
>>>> 2012-05-20 08:08:17,252 DEBUG
>>>> org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE;
>>>> was=.META.,,1.1028785192 state=OFFLINE, ts=1337497517243
>>>> 2012-05-20 08:08:17,252 DEBUG
>>>> org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE;
>>>> was=-ROOT-,,0.70236052 state=OFFLINE, ts=1337497517243
>>>> 2012-05-20 08:10:10,309 INFO org.apache.zookeeper.server.NIOServerCnxn:
>>>> Accepted socket connection from /0:0:0:0:0:0:0:1%0:62747
>>>> 2012-05-20 08:10:10,315 INFO org.apache.zookeeper.server.NIOServerCnxn:
>>>> Client attempting to establish new session at /0:0:0:0:0:0:0:1%0:62747
>>>> 2012-05-20 08:10:10,316 INFO org.apache.zookeeper.server.NIOServerCnxn:
>>>> Established session 0x137653a0e8e02fa with negotiated timeout 40000 for
>>>> client /0:0:0:0:0:0:0:1%0:62747
>>>> 2012-05-20 08:10:10,316 INFO
>>>> org.apache.zookeeper.server.PrepRequestProcessor: Got user-level
>>>> KeeperException when processing sessionid:0x137653a0e8e02fa type:create
>>>> cxid:0x1 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error
>>>> Path:/hbase Error:KeeperErrorCode = NodeExists for /hbase
>>>> 2012-05-20 08:10:10,329 INFO
>>>> org.apache.zookeeper.server.PrepRequestProcessor: Got user-level
>>>> KeeperException when processing sessionid:0x137653a0e8e02fa type:create
>>>> cxid:0x2 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error
>>>> Path:/hbase/unassigned Error:KeeperErrorCode = NodeExists for
>>>> /hbase/unassigned
>>>> 2012-05-20 08:10:10,329 INFO
>>>> org.apache.zookeeper.server.PrepRequestProcessor: Got user-level
>>>> KeeperException when processing sessionid:0x137653a0e8e02fa type:create
>>>> cxid:0x3 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error
>>>> Path:/hbase/rs Error:KeeperErrorCode = NodeExists for /hbase/rs
>>>> 2012-05-20 08:10:10,330 INFO
>>>> org.apache.zookeeper.server.PrepRequestProcessor: Got user-level
>>>> KeeperException when processing sessionid:0x137653a0e8e02fa type:create
>>>> cxid:0x4 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error
>>>> Path:/hbase/table Error:KeeperErrorCode = NodeExists for /hbase/table
>>>>
>>>>
>>>> Hadoop seems to be up and running.
>>>>
>>>> last log in the datanode is
>>>>
>>>> 12/05/20 06:15:25 INFO datanode.DataBlockScanner: Verification succeeded
>>>> for blk_-3639294708473848144_3329
>>>> 12/05/20 06:26:20 INFO datanode.DataBlockScanner: Verification succeeded
>>>> for blk_2502932128500788221_3413
>>>> 12/05/20 06:26:20 INFO datanode.DataBlockScanner: Verification succeeded
>>>> for blk_3390059684225099859_3440
>>>> 12/05/20 06:59:32 INFO datanode.DataNode: BlockReport of 157 blocks took
>>>> 19 msec to generate and 3 msecs for RPC and NN processing
>>>> 12/05/20 07:24:51 INFO datanode.DataBlockScanner: Verification succeeded
>>>> for blk_8954400942867609419_3363
>>>> 12/05/20 07:55:51 INFO datanode.DataBlockScanner: Verification succeeded
>>>> for blk_-3650918785526360502_3387
>>>> 12/05/20 07:59:33 INFO datanode.DataNode: BlockReport of 157 blocks took
>>>> 20 msec to generate and 3 msecs for RPC and NN processing
>>>> 12/05/20 08:07:25 INFO datanode.DataBlockScanner: Verification succeeded
>>>> for blk_786514597978592338_3336
>>>>
>>>> I tried using hbase-explorer to view the tables but they all seem to be down.
>>>
>>
>
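
Since the symptom in the thread is that the application server could not connect to zookeeper after the load test, one quick check before and after the upgrade is to probe ZooKeeper directly with its "ruok" four-letter command, which answers "imok" when the server is running. The following is only a minimal sketch, not something taken from the thread: the helper names `zk_four_letter` and `zk_is_ok` are hypothetical, and the host and port (localhost, 2181) are assumed defaults that may not match this cluster. Newer ZooKeeper releases may also require the command to be whitelisted in the server config.

```python
import socket


def zk_four_letter(host, port, command, timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw text reply.

    Hypothetical helper for illustration; host and port are whatever the
    ZooKeeper client port is on your cluster.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def zk_is_ok(host="localhost", port=2181, timeout=5.0):
    """Return True if the server answers 'imok' to 'ruok', else False.

    localhost:2181 is an assumed default, not a value from the thread.
    Connection errors and timeouts are reported as False.
    """
    try:
        return zk_four_letter(host, port, "ruok", timeout).strip() == "imok"
    except OSError:
        return False
```

Calling `zk_is_ok()` periodically during the load test would show roughly when ZooKeeper stops answering, which can then be lined up against the hbase-master log timestamps. It only tells you the server process responds; it says nothing about session limits or region assignment.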
