[jira] Commented: (HBASE-3147) Regions stuck in transition after rolling restart, perpetual timeout handling but nothing happens

stack (JIRA) Mon, 25 Oct 2010 09:11:43 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924602#action_12924602
 ]


stack commented on HBASE-3147:
------------------------------

I got this when I tried running the patch:

{code}
java.lang.IllegalAccessError: tried to access method org.apache.hadoop.hbase.zookeeper.ZKAssign.getNodeName(Lorg/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher;Ljava/lang/String;)Ljava/lang/String; from class org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor
    at org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor.chore(AssignmentManager.java:1457)
    at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
2010-10-25 16:07:44,354 INFO org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor: sv2borg180:60000.timeoutMonitor exiting
{code}

Let me try to fix it.

> Regions stuck in transition after rolling restart, perpetual timeout handling 
> but nothing happens
> -------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3147
>                 URL: https://issues.apache.org/jira/browse/HBASE-3147
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.90.0
>
>
> The rolling restart script is great for bringing on the weird stuff.  On my 
> little loaded cluster if I run it, it horks the cluster and it doesn't 
> recover.  I notice two issues that need fixing:
> 1. We'll miss noticing that a server was carrying .META. and it never gets 
> assigned -- the shutdown handlers get stuck in perpetual wait on a .META. 
> assign that will never happen.
> 2. Perpetual cycling of this sequence for each region not successfully assigned:
> {code}
> 2010-10-23 21:37:57,404 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  usertable,user510588360,1287547556587.7f2d92497d2d03917afd574ea2aca55b. state=PENDING_OPEN, ts=1287869814294
> 2010-10-23 21:37:57,404 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_OPEN or OPENING for too long, reassigning region=usertable,user510588360,1287547556587.7f2d92497d2d03917afd574ea2aca55b.
> 2010-10-23 21:37:57,404 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x2bd57d1475046a Attempting to transition node 7f2d92497d2d03917afd574ea2aca55b from RS_ZK_REGION_OPENING to M_ZK_REGION_OFFLINE
> 2010-10-23 21:37:57,404 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x2bd57d1475046a Attempt to transition the unassigned node for 7f2d92497d2d03917afd574ea2aca55b from RS_ZK_REGION_OPENING to M_ZK_REGION_OFFLINE failed, the node existed but was in the state M_ZK_REGION_OFFLINE
> 2010-10-23 21:37:57,404 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region transitioned OPENING to OFFLINE so skipping timeout, region=usertable,user510588360,1287547556587.7f2d92497d2d03917afd574ea2aca55b.
> ,,,
> {code}
> The timeout period elapses again and then the same sequence repeats.
> This is what I've been working on.
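
The cycle in item 2 can be sketched roughly as below. This is a simplified model and not the actual AssignmentManager or ZKAssign code: the class name, the `transition` helper, and the `runTimeoutChecks` method are all hypothetical, and the only thing taken from the log above is the sequence of states. The assumption is that the timeout handler expects the unassigned node to be in RS_ZK_REGION_OPENING, finds it already in M_ZK_REGION_OFFLINE, and then skips the timeout without reassigning, so every later timeout sees the same state.

```java
import java.util.ArrayList;
import java.util.List;

class TimeoutCycleSketch {
    // States named after those in the log; everything else here is illustrative.
    enum ZkState { M_ZK_REGION_OFFLINE, RS_ZK_REGION_OPENING }
    enum RegionState { PENDING_OPEN, OPENING }

    // Hypothetical stand-in for a compare-and-set style transition on the
    // unassigned node: succeeds only if the node is in the expected state.
    static boolean transition(ZkState[] node, ZkState expected, ZkState target) {
        if (node[0] != expected) {
            return false;
        }
        node[0] = target;
        return true;
    }

    // Runs the timeout check the given number of times and returns what a
    // log of the events might look like.
    static List<String> runTimeoutChecks(int rounds) {
        ZkState[] znode = { ZkState.M_ZK_REGION_OFFLINE }; // node is already OFFLINE
        RegionState inMemory = RegionState.PENDING_OPEN;   // master still thinks PENDING_OPEN
        List<String> log = new ArrayList<>();
        for (int i = 0; i < rounds; i++) {
            log.add("timed out: state=" + inMemory);
            boolean moved = transition(znode,
                    ZkState.RS_ZK_REGION_OPENING, ZkState.M_ZK_REGION_OFFLINE);
            if (!moved) {
                // Nothing changes the node or the in-memory state here,
                // so the next round is identical to this one.
                log.add("transition failed, node was " + znode[0] + "; skipping timeout");
                continue;
            }
            log.add("reassigning");
        }
        return log;
    }

    public static void main(String[] args) {
        for (String line : runTimeoutChecks(3)) {
            System.out.println(line);
        }
    }
}
```

In this model no round ever reaches the reassigning branch, which matches the "perpetual timeout handling but nothing happens" symptom in the issue title.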

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
