[
https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448843#comment-13448843
]
ramkrishna.s.vasudevan commented on HBASE-6438:
-----------------------------------------------
@Stack
Sorry for missing this review comment all these days.
We would like to get HBASE-6299 in along with this patch. As you
suggested, can we provide a single patch for 0.94 and 0.92 that combines both?
We hit HBASE-6299 recently in one of our tests. Both fixes should be useful.
> RegionAlreadyInTransitionException needs to give more info to avoid
> assignment inconsistencies
> ----------------------------------------------------------------------------------------------
>
> Key: HBASE-6438
> URL: https://issues.apache.org/jira/browse/HBASE-6438
> Project: HBase
> Issue Type: Bug
> Reporter: ramkrishna.s.vasudevan
> Assignee: rajeshbabu
> Attachments: HBASE-6438_trunk.patch
>
>
> Looking at some of the recent issues in region assignment,
> RegionAlreadyInTransitionException (RAITE) is one cause after which the
> region assignment may or may not happen (in the sense that we need to wait
> for the TM to assign).
> In HBASE-6317 we hit one such problem, caused by
> RegionAlreadyInTransitionException on master restart.
> Consider the following case: for some reason, such as a master restart or an
> external assign call, we try to assign a region that is already being
> opened on an RS.
> The second assign call has already changed the state of the znode, so the
> assignment in progress on the RS is affected and fails. The second
> assignment also fails, with an RAITE. In the end neither assignment
> proceeds. The idea is to determine whether such an RAITE can be retried or
> not.
> Here again we have the following cases:
> -> The znode is yet to be transitioned from OFFLINE to OPENING on the RS.
> -> The RS may be in the openRegion step.
> -> The RS may be trying to transition the znode from OPENING to OPENED.
> -> The region is yet to be added to the online regions on the RS side.
> On any failure in openRegion() or updateMeta() we move the znode to
> FAILED_OPEN, so in these cases getting an RAITE should be OK. But in the
> other cases the assignment is stopped.
> The idea is simply to record the current state of the region assignment in
> the RIT map on the RS side; using that info we can determine whether the
> assignment can be retried on getting an RAITE.
> Considering the current work going on in AM, please do share whether this
> is needed at least in the 0.92/0.94 versions.
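
A minimal sketch of the idea in the quoted description, for illustration only. It assumes the RS keeps a per-region stage in its regions-in-transition (RIT) map and attaches that stage to the exception it throws. All names here (RitStateSketch, OpenStage, RaiteSketch, shouldRetry) are hypothetical and are not the HBase API or the attached patch. The retry rule is one reading of the description: if the open is inside openRegion() or updateMeta(), a failure ends in FAILED_OPEN, so the caller does not need to retry; in the other stages it may.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: a region server that records the stage of each open. */
final class RitStateSketch {

    /** Hypothetical stages, mirroring the cases listed in the description. */
    enum OpenStage {
        OFFLINE_TO_OPENING,  // znode not yet moved from OFFLINE to OPENING
        OPENING_REGION,      // inside openRegion()
        UPDATING_META,       // inside updateMeta()
        OPENING_TO_OPENED,   // znode moving from OPENING to OPENED
        ADDING_TO_ONLINE     // not yet added to the online regions
    }

    /** Hypothetical stand-in for RAITE that also carries the current stage. */
    static final class RaiteSketch extends RuntimeException {
        final OpenStage stage;

        RaiteSketch(String region, OpenStage stage) {
            super("Region " + region + " already in transition, stage=" + stage);
            this.stage = stage;
        }
    }

    // RIT map on the RS side: region name -> stage of the open in progress.
    private final Map<String, OpenStage> regionsInTransition =
        new ConcurrentHashMap<String, OpenStage>();

    /** Starts an open; rejects a second open and reports the current stage. */
    void open(String region) {
        OpenStage existing =
            regionsInTransition.putIfAbsent(region, OpenStage.OFFLINE_TO_OPENING);
        if (existing != null) {
            throw new RaiteSketch(region, existing);
        }
    }

    /** Records progress of an open that is already in the RIT map. */
    void advance(String region, OpenStage stage) {
        regionsInTransition.replace(region, stage);
    }

    /** Removes the region from the RIT map once the open finishes or fails. */
    void done(String region) {
        regionsInTransition.remove(region);
    }

    /**
     * Assumed rule: failures in openRegion()/updateMeta() end in FAILED_OPEN,
     * so the caller can wait for that; other stages are treated as retryable.
     */
    static boolean shouldRetry(OpenStage stage) {
        switch (stage) {
            case OPENING_REGION:
            case UPDATING_META:
                return false;
            default:
                return true;
        }
    }
}
```

With something like this, the caller that receives the exception could check shouldRetry(e.stage) instead of waiting for the TM in every case. Which stages are actually safe to retry would still need to be decided in the real patch.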