[ 
https://issues.apache.org/jira/browse/HBASE-6880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464865#comment-13464865
 ] 

Jimmy Xiang edited comment on HBASE-6880 at 9/28/12 3:45 AM:
-------------------------------------------------------------

@Ram, are you going to fix this in HBASE-6698?  If so, we can close this as a 
duplicate.

HBASE-6881 only partially fixes this issue, by making it happen a little less 
often.

I was thinking we should let assignRoot return something to indicate whether it 
succeeded.  If it did not, there is no point in waiting for it any longer.  We 
can retry several times.  If it still doesn't work, then abort the master 
instead of hanging there forever.  Failing fast with no retry is also OK with 
me, and may be cleaner in some sense.

Even if assignRoot does return something saying the assignment is in progress, 
it may still not succeed.  So we also need to make sure the timeout monitor can 
fix it.
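
To make the retry-then-abort idea concrete, here is a rough sketch.  It is not 
taken from the HBase code base: the class RootAssignRetry, the method 
assignWithRetry, and all parameter names are hypothetical, and assignRoot is 
stood in for by a BooleanSupplier that reports whether one attempt succeeded.  
Passing maxAttempts = 1 gives the fail-fast variant.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper; none of these names come from the HBase code base. */
final class RootAssignRetry {

  private RootAssignRetry() {}

  /**
   * Runs the assignment attempt up to maxAttempts times.
   * Returns true on the first success; otherwise runs the abort
   * callback once and returns false instead of waiting forever.
   */
  static boolean assignWithRetry(BooleanSupplier assignAttempt, int maxAttempts,
      long retryDelayMillis, Runnable abort) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (assignAttempt.getAsBoolean()) {
        return true;
      }
      if (attempt < maxAttempts) {
        try {
          // Pause before the next attempt; no pause after the last one.
          Thread.sleep(retryDelayMillis);
        } catch (InterruptedException e) {
          // Preserve the interrupt flag and give up early.
          Thread.currentThread().interrupt();
          break;
        }
      }
    }
    abort.run();
    return false;
  }
}
```

This only covers the case where the attempt itself reports failure.  A "true" 
here would mean the assignment was started, not that it completed, so the 
timeout monitor still has to handle an assignment that starts and then stalls.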


                
                  
> Failure in assigning root causes system hang
> --------------------------------------------
>
>                 Key: HBASE-6880
>                 URL: https://issues.apache.org/jira/browse/HBASE-6880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jimmy Xiang
>
> While looking into a TestReplication failure, I found that assignRoot can 
> sometimes fail, for example when the RS is not serving traffic yet.  In this 
> case, the master keeps waiting for root to become available, which may never 
> happen.
>  
> We need to gracefully terminate the master if root is not assigned properly.
