Regions stuck in transition after rolling restart, perpetual timeout handling
but nothing happens
-------------------------------------------------------------------------------------------------
Key: HBASE-3147
URL: https://issues.apache.org/jira/browse/HBASE-3147
Project: HBase
Issue Type: Bug
Reporter: stack
Fix For: 0.90.0
The rolling restart script is great for bringing on the weird stuff. On my
little loaded cluster if I run it, it horks the cluster and it doesn't recover.
I notice two issues that need fixing:
1. We'll miss noticing that a server was carrying .META. and it never gets
assigned -- the shutdown handlers get stuck in perpetual wait on a .META.
assign that will never happen.
2. Perpetual cycling of the this sequence per region not succesfully assigned:
{code}
2010-10-23 21:37:57,404 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Regions in transition timed out:
usertable,user510588360,1287547556587.7f2d92497d2d03917afd574ea2aca55b.
state=PENDING_OPEN, ts=1287869814294 45154 2010-10-23
21:37:57,404 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has
been PENDING_OPEN or OPENING for too long, reassigning
region=usertable,user510588360,1287547556587.
7f2d92497d2d03917afd574ea2aca55b. 45155 2010-10-23 21:37:57,404 DEBUG
org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x2bd57d1475046a
Attempting to transition node 7f2d92497d2d03917afd574ea2aca55b from
RS_ZK_REGION_OPENING to M_ZK_REGION_OFFLINE 45156 2010-10-23 21:37:57,404 WARN
org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x2bd57d1475046a
Attempt to transition the unassigned node for 7f2d92497d2d03917afd574ea2aca55b
from RS_ZK_REGION_OPENING to M_ZK_REGION_OFFLINE failed, the
node existed but was in the state M_ZK_REGION_OFFLINE 45157 2010-10-23
21:37:57,404 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region
transitioned OPENING to OFFLINE so skipping timeout,
region=usertable,user510588360,1287547556587.7f2d92497d2d03917afd574ea2aca55b.
,,,
{code}
Timeout period again elapses an then same sequence.
This is what I've been working on.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.