[ 
https://issues.apache.org/jira/browse/HBASE-25059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17197919#comment-17197919
 ] 

Nick Dimiduk commented on HBASE-25059:
--------------------------------------

So the only way we can abort the meta OPEN on the target RS is if that RS 
aborts? And the point of no return is setting the meta location znode? I think 
if we delay registering the target RS as the host of meta in ZK until after the 
OPEN completes and the master can confirm meta is on the intended host, we 
could be more nimble here.
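
To make the idea concrete, here is a rough sketch of the ordering I have in 
mind. It is not HBase code: {{assign}}, {{openRegion}}, and the server names 
are hypothetical stand-ins, and the real procedure framework and ZK 
interaction are left out. The only point is that the location is published 
after the OPEN is confirmed, so a hung target can be abandoned on timeout and 
the next candidate tried.

```java
import java.time.Duration;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

public class AssignWithTimeoutSketch {

  /**
   * Tries each candidate server in turn and returns the first one that confirms
   * the OPEN within the timeout. openRegion is a hypothetical stand-in for
   * "dispatch OPEN to this server and wait for its report".
   */
  static Optional<String> assign(List<String> candidates,
      Function<String, CompletableFuture<Boolean>> openRegion,
      Duration timeout) throws InterruptedException {
    for (String server : candidates) {
      CompletableFuture<Boolean> open = openRegion.apply(server);
      try {
        if (open.get(timeout.toMillis(), TimeUnit.MILLISECONDS)) {
          // Only at this point would the location be published (assumption:
          // this is where the meta location znode would be written).
          return Optional.of(server);
        }
      } catch (TimeoutException | ExecutionException e) {
        // Roll back: give up on this target and fall through to the next one.
        open.cancel(true);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) throws InterruptedException {
    // "rs1" simulates a locked-up server whose OPEN never completes.
    Function<String, CompletableFuture<Boolean>> opener = server ->
        server.equals("rs1")
            ? new CompletableFuture<Boolean>()
            : CompletableFuture.completedFuture(true);
    Optional<String> host =
        assign(List.of("rs1", "rs2"), opener, Duration.ofMillis(100));
    System.out.println(host.orElse("unassigned")); // prints "rs2"
  }
}
```

With this ordering nothing has been registered when the timeout fires, so 
there is no published location to undo before moving to the next target.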

> TransitionRegionStateProcedure should timeout, rollback, retry instead of 
> waiting infinitely on CONFIRMED_OPEN
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-25059
>                 URL: https://issues.apache.org/jira/browse/HBASE-25059
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>    Affects Versions: 2.3.2
>            Reporter: Nick Dimiduk
>            Priority: Major
>
> Testing 2.3.2RC1 with ITBLL. The region server assigned to open meta locked 
> up due to HBASE-24896. Meanwhile, the master waits indefinitely on a 
> procedure {{pid=176583, ppid=176532, 
> state=WAITING:REGION_STATE_TRANSITION_CONFIRM_OPENED; 
> TransitRegionStateProcedure table=hbase:meta, region=1588230740, ASSIGN}}.
> AssignmentManager needs a way to rescind assignment when a RS fails to 
> complete within a reasonable timeout window, roll back the procedure, and try 
> again with a new target.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
