[ 
https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454929#comment-13454929
 ] 

Ted Yu commented on HBASE-6438:
-------------------------------

I think separating the fix would make discussion easier.

Thanks
                
> RegionAlreadyInTransitionException needs to give more info to avoid 
> assignment inconsistencies
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6438
>                 URL: https://issues.apache.org/jira/browse/HBASE-6438
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: rajeshbabu
>             Fix For: 0.96.0, 0.92.3, 0.94.3
>
>         Attachments: HBASE-6438_2.patch, HBASE-6438_94.patch, 
> HBASE-6438_trunk.patch
>
>
> Looking at some of the recent region assignment issues, 
> RegionAlreadyInTransitionException (RAITE) is one case after which the 
> region assignment may or may not happen (in the sense that we need to wait 
> for the TM to assign).
> In HBASE-6317 we hit one problem caused by 
> RegionAlreadyInTransitionException on master restart.
> Consider the following case: for some reason, such as a master restart or 
> an external assign call, we try to assign a region that is already being 
> opened on an RS.
> The second call to assign has already changed the state of the znode, so 
> the assign currently in progress on the RS is affected and fails.  The 
> second assignment also fails, with an RAITE.  In the end neither 
> assignment proceeds.  The idea is to find out whether such an RAITE can be 
> retried or not.
> Here again there are several cases:
> -> The znode is yet to be transitioned from OFFLINE to OPENING on the RS.
> -> The RS may be in the openRegion step.
> -> The RS may be trying to transition the znode from OPENING to OPENED.
> -> The RS is yet to add the region to its online regions.
> In openRegion() and updateMeta(), any failure moves the znode to 
> FAILED_OPEN, so getting an RAITE in these cases should be OK.  But in the 
> other cases the assignment is stopped.
> The idea is simply to record the current state of the region assignment in 
> the RIT map on the RS side; using that info we can determine whether the 
> assignment can be retried on getting an RAITE.
> Considering the current work going on in AM, please share whether this is 
> needed at least in the 0.92/0.94 versions.
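
The proposal in the description above (record the current open stage in the RS-side RIT map and use it to decide whether an assign can be retried after an RAITE) could be sketched roughly as follows. This is only an illustration and is not taken from the attached patches: OpenStage, RegionOpenTracker, StagedRegionInTransitionException and AssignRetryPolicy are hypothetical names, none of them is HBase API, and the mapping from stage to "retry or not" is an assumption based on the statement that failures in openRegion() and updateMeta() move the znode to FAILED_OPEN.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stages of a region open on the RS, following the cases in the description. */
enum OpenStage {
    OFFLINE_TO_OPENING(false),  // znode not yet moved from OFFLINE to OPENING
    OPENING_REGION(true),       // inside openRegion()
    UPDATING_META(true),        // inside updateMeta()
    OPENING_TO_OPENED(false),   // moving the znode from OPENING to OPENED
    ADDING_TO_ONLINE(false);    // about to be added to the online regions

    /** Assumption: true where a failure is said to move the znode to FAILED_OPEN. */
    final boolean failureMovesZnodeToFailedOpen;

    OpenStage(boolean failureMovesZnodeToFailedOpen) {
        this.failureMovesZnodeToFailedOpen = failureMovesZnodeToFailedOpen;
    }
}

/** Hypothetical stand-in for an RAITE that also reports the current stage. */
class StagedRegionInTransitionException extends RuntimeException {
    private final OpenStage stage;

    StagedRegionInTransitionException(String region, OpenStage stage) {
        super("Region " + region + " is already in transition, stage=" + stage);
        this.stage = stage;
    }

    OpenStage getStage() {
        return stage;
    }
}

/** Hypothetical RS-side RIT map that stores the stage instead of a bare flag. */
class RegionOpenTracker {
    private final Map<String, OpenStage> regionsInTransition = new ConcurrentHashMap<>();

    /** Registers a new open; a duplicate request fails and reports the stage reached so far. */
    void beginOpen(String region) {
        OpenStage existing = regionsInTransition.putIfAbsent(region, OpenStage.OFFLINE_TO_OPENING);
        if (existing != null) {
            throw new StagedRegionInTransitionException(region, existing);
        }
    }

    /** Updates the stage of an open that is already registered; no-op otherwise. */
    void advance(String region, OpenStage stage) {
        regionsInTransition.replace(region, stage);
    }

    /** Removes the region once the open has completed or failed. */
    void finish(String region) {
        regionsInTransition.remove(region);
    }
}

/** Hypothetical master-side decision made when the exception is received. */
final class AssignRetryPolicy {
    private AssignRetryPolicy() {
    }

    /**
     * Assumption: if a failure at this stage ends in FAILED_OPEN, the master is
     * notified through the znode and need not retry; otherwise the assignment
     * could stall, so the caller should retry.
     */
    static boolean shouldRetry(OpenStage stage) {
        return !stage.failureMovesZnodeToFailedOpen;
    }
}
```

With this sketch, a second beginOpen() for the same region throws an exception whose getStage() tells the caller how far the first open had progressed, and AssignRetryPolicy.shouldRetry() turns that into a retry decision. Whether the real fix should retry or simply wait at each stage is exactly the question raised in the description.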

