[jira] [Commented] (HBASE-8150) the code that handles RAITE on master in 0.94 should not always use the same plan

ramkrishna.s.vasudevan (JIRA) Tue, 19 Mar 2013 20:31:21 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607216#comment-13607216
 ]


ramkrishna.s.vasudevan commented on HBASE-8150:
-----------------------------------------------

bq. Agree with adding some sleep time,
This may be needed, but retrying on the same server is going to prevent multi-assign.
Same as what Chunhui says.
                
> the code that handles RAITE on master in 0.94 should not always use the same 
> plan
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-8150
>                 URL: https://issues.apache.org/jira/browse/HBASE-8150
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Priority: Minor
>
> The code in the 0.94 AssignmentManager (AM) sets the region plan to point to
> the same server when retrying the assignment due to
> RegionAlreadyInTransitionException (RAITE).
> {code}
> LOG.warn("Failed assignment of "
>             + state.getRegion().getRegionNameAsString()
>             + " to "
>             + plan.getDestination()
>             + ", trying to assign "
>             + (regionAlreadyInTransitionException
>                 ? "to the same region server"
>                   + " because of RegionAlreadyInTransitionException;"
>                 : "elsewhere instead; ")
>             + "retry=" + i, t);
> {code}
> However, there's no wait time in the loop that retries the assignment. If the
> region is being marked as failed to open, which may take some time, the master
> can easily exhaust its retries in less than half a second, going to the same
> server every time and getting the same exception (unfortunately I no longer
> have logs); the region will then be stuck.
> Do you think this is worth fixing (for example, by not using the same server
> here after a few retries, or by adding timed backoff in such cases)?
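
For illustration only, the "timed backoff" option from the description could look roughly like the sketch below. It is not the 0.94 AssignmentManager code and not a proposed patch: AssignRetrySketch, backoffMillis, assignWithBackoff, and the base/max delay values are all hypothetical names and numbers. The attempt is modelled as a predicate on the retry index, and the sleep is injected so the loop can be exercised without real waiting. The destination server is assumed to stay the same across retries, in line with the multi-assign concern in the comment above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.LongConsumer;

// Hypothetical sketch; none of these names exist in HBase.
class AssignRetrySketch {

    // Capped exponential backoff: base, 2*base, 4*base, ... never above maxMillis.
    static long backoffMillis(int retry, long baseMillis, long maxMillis) {
        int shift = Math.min(retry, 20); // bound the shift; safe for small base values
        return Math.min(baseMillis << shift, maxMillis);
    }

    // Runs the attempt up to maxRetries times, waiting between failed attempts.
    // Returns the retry index that succeeded, or -1 if every attempt failed.
    static int assignWithBackoff(IntPredicate attempt, int maxRetries,
                                 long baseMillis, long maxMillis,
                                 LongConsumer sleeper) {
        for (int i = 0; i < maxRetries; i++) {
            if (attempt.test(i)) {
                return i;
            }
            if (i < maxRetries - 1) {
                // Give the region server time to finish marking the region
                // as failed to open before asking it again.
                sleeper.accept(backoffMillis(i, baseMillis, maxMillis));
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Long> waits = new ArrayList<>();
        // Simulated server: rejects the first three attempts, then accepts.
        int succeededAt = assignWithBackoff(i -> i >= 3, 10, 100, 1000,
                                            ms -> waits.add(ms));
        System.out.println("succeeded at retry=" + succeededAt + ", waits=" + waits);
    }
}
```

In a real retry loop the sleeper would wrap Thread.sleep and handle InterruptedException. With the assumed 100 ms base delay, ten retries are spread over several seconds instead of being used up in under half a second.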

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
