[
https://issues.apache.org/jira/browse/HBASE-20881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17778758#comment-17778758
]
Viraj Jasani commented on HBASE-20881:
--------------------------------------
[~zhangduo] IIUC, the only reason we had to introduce the ABNORMALLY_CLOSED
state is this: when a region is already in RIT and the target server that it is
assigned to, or is being assigned to, crashes, SCP has to interrupt the old TRSP
and create new TRSPs to assign all regions that were previously hosted by that
server. Any region that was already in transition might then require manual
intervention, because SCP cannot be certain at which step of the previous TRSP
the region was stuck while it was in RIT.
For SCP, any RIT on a dead server is a complex state to deal with, because it
cannot know for certain whether the region was stuck in a coproc hook on the
host or while making an RPC call to a remote server, or what the outcome of
that RPC call was.
Does this seem correct? We were thinking of digging in a bit more to see whether
there are any cases in which we can convert the region state to CLOSED rather
than ABNORMALLY_CLOSED and thereby avoid operator intervention, but I fear we
might introduce double assignment of regions if this is not done carefully.
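To make the concern concrete, here is a minimal toy sketch of the conservative
rule described above. It is not HBase code: the class, the fields
(inTransition, closeConfirmed) and the methods are hypothetical names made up
for illustration, and it assumes the master could somehow know that a close
was confirmed before the server died. Only in that case is CLOSED chosen;
every uncertain case stays ABNORMALLY_CLOSED so that a region which may still
be open somewhere is never treated as cleanly closed.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; names are hypothetical and do not mirror HBase classes. */
public class CrashHandlingSketch {
    enum State { CLOSED, ABNORMALLY_CLOSED }

    /** Hypothetical summary of what the master knows about one region. */
    static final class Region {
        final String name;
        final boolean inTransition;   // an older procedure was still running
        final boolean closeConfirmed; // assumption: a clean close was reported

        Region(String name, boolean inTransition, boolean closeConfirmed) {
            this.name = name;
            this.inTransition = inTransition;
            this.closeConfirmed = closeConfirmed;
        }
    }

    /** Pick CLOSED only when the outcome is known; otherwise stay conservative. */
    static State stateAfterCrash(Region r) {
        return r.closeConfirmed ? State.CLOSED : State.ABNORMALLY_CLOSED;
    }

    /** A region that was in transition with an unknown outcome may need a human. */
    static boolean mayNeedOperator(Region r) {
        return r.inTransition && !r.closeConfirmed;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("r1", true, false));  // stuck mid-transition
        regions.add(new Region("r2", true, true));   // close confirmed before crash
        regions.add(new Region("r3", false, false)); // simply hosted on the dead server
        for (Region r : regions) {
            System.out.println(r.name + " -> " + stateAfterCrash(r)
                + ", may need operator: " + mayNeedOperator(r));
        }
    }
}
```

The open question in this comment is exactly whether something like
closeConfirmed can be known reliably; if it cannot, relaxing the rule risks
double assignment.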
> Introduce a region transition procedure to handle all the state transition
> for a region
> ---------------------------------------------------------------------------------------
>
> Key: HBASE-20881
> URL: https://issues.apache.org/jira/browse/HBASE-20881
> Project: HBase
> Issue Type: Sub-task
> Components: amv2, proc-v2
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Fix For: 3.0.0-alpha-1, 2.2.0
>
> Attachments: HBASE-20881-branch-2-v1.patch,
> HBASE-20881-branch-2-v2.patch, HBASE-20881-branch-2.patch,
> HBASE-20881-v1.patch, HBASE-20881-v10.patch, HBASE-20881-v11.patch,
> HBASE-20881-v12.patch, HBASE-20881-v13.patch, HBASE-20881-v13.patch,
> HBASE-20881-v14.patch, HBASE-20881-v14.patch, HBASE-20881-v15.patch,
> HBASE-20881-v16.patch, HBASE-20881-v2.patch, HBASE-20881-v3.patch,
> HBASE-20881-v4.patch, HBASE-20881-v4.patch, HBASE-20881-v5.patch,
> HBASE-20881-v6.patch, HBASE-20881-v7.patch, HBASE-20881-v7.patch,
> HBASE-20881-v8.patch, HBASE-20881-v9.patch, HBASE-20881.patch
>
>
> Now we have an AssignProcedure, an UnassignProcedure, and also a
> MoveRegionProcedure which schedules an AssignProcedure and an
> UnassignProcedure to move a region. This makes the logic a bit complicated, as
> MRP is not a RIT, so when SCP can not interrupt it directly...
--
This message was sent by Atlassian Jira
(v8.20.10#820010)