Region stuck in transition after RS failed while opening
--------------------------------------------------------
Key: HBASE-3406
URL: https://issues.apache.org/jira/browse/HBASE-3406
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.0
Reporter: Todd Lipcon
Priority: Critical
Fix For: 0.90.0
Apparently, an RS failed due to a GC pause while it was in the midst of opening
a region. This left the region stuck in the following repeating sequence in
the master log:
2011-01-03 17:24:33,884 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Region has been OPENING for too long, reassigning
region=usertable,user991629466,1293747979500.c6a54b4d07a44e113b3a4d2ab22daa70.
2011-01-03 17:24:33,885 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil:
master:60000-0x12ce26f6c0600e3 Retrieved 113 byte(s) of data from znode
/hbase/unassigned/c6a54b4d07a44e113b3a4d2ab22daa70;
data=region=usertable,user991629466,1293747979500.c6a54b4d07a44e113b3a4d2ab22daa70.,
server=haus03.sf.cloudera.com:60000, state=M_ZK_REGION_OFFLINE
2011-01-03 17:24:43,886 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Regions in transition timed out:
usertable,user991629466,1293747979500.c6a54b4d07a44e113b3a4d2ab22daa70.
state=OPENING, ts=1293840977790
2011-01-03 17:24:43,886 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Region has been OPENING for too long, reassigning
region=usertable,user991629466,1293747979500.c6a54b4d07a44e113b3a4d2ab22daa70.
2011-01-03 17:24:43,887 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil:
master:60000-0x12ce26f6c0600e3 Retrieved 113 byte(s) of data from znode
/hbase/unassigned/c6a54b4d07a44e113b3a4d2ab22daa70;
data=region=usertable,user991629466,1293747979500.c6a54b4d07a44e113b3a4d2ab22daa70.,
server=haus03.sf.cloudera.com:60000, state=M_ZK_REGION_OFFLINE
...and so on, repeating every 10 seconds. Eventually I ran hbck -fix, which
forced the region to OFFLINE in ZK, after which it was reassigned just fine.
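
To spot this pattern in a master log without scrolling through it by hand, something like the following might help. It is only a rough sketch, not part of HBase: the function names are made up for illustration, the regex assumes the exact AssignmentManager wording quoted above (other versions may log something different), and the threshold of 2 is an arbitrary choice.

```python
import re
from collections import Counter

# Assumption: the message text matches the AssignmentManager lines quoted in
# this report. \s+ tolerates the line wrap between "reassigning" and "region=".
OPENING_TIMEOUT_RE = re.compile(
    r"Region has been OPENING for too long, reassigning\s+region=(\S+)"
)


def count_opening_timeouts(log_text):
    """Map each region name to the number of OPENING timeouts logged for it."""
    return Counter(OPENING_TIMEOUT_RE.findall(log_text))


def stuck_regions(log_text, threshold=2):
    """Return regions reported as OPENING too long at least `threshold` times.

    The threshold is a hypothetical cutoff; a single timeout can be a normal
    reassignment, while repeated ones suggest the loop described above.
    """
    counts = count_opening_timeouts(log_text)
    return sorted(name for name, hits in counts.items() if hits >= threshold)
```

Reading the master log into a string and calling stuck_regions(text) should list any region that keeps cycling through the timeout, which would be a candidate for the hbck -fix workaround.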