[
https://issues.apache.org/jira/browse/HBASE-24255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094943#comment-17094943
]
Andrey Elenskiy commented on HBASE-24255:
-----------------------------------------
I don't really see how that addresses the issue in the description. The problem
I was trying to describe can happen if I run HBCK2's addMissingRegionsInMeta,
which ends up re-adding the parents of a merged region to hbase:meta and
assigning them to a RegionServer. Then, when GCRegionProcedure runs, it removes
the region from hbase:meta and the FS, but doesn't unassign it from the
regionserver, so the region ends up under "Orphan Regions on RegionServer".
Hence, I'd like GCRegionProcedure to actually make sure that the region is not
assigned on any regionserver.
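
To make the sequence concrete, here is a rough toy model in Java. It is not
HBase code: the class GcRegionSketch, its fields, and both gc methods are
hypothetical stand-ins for hbase:meta, the regions open on regionservers, and
the cleanup step. It only illustrates why removing the meta row without
closing the region leaves an orphan.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model only; none of these types or methods are HBase classes. */
public class GcRegionSketch {
    /** Stand-in for the regions listed in hbase:meta. */
    final Set<String> meta = new HashSet<>();
    /** Region name -> hosting server; stand-in for regions open on regionservers. */
    final Map<String, String> openOn = new HashMap<>();

    /** Behaviour described in the report: the meta row goes away, the region stays open. */
    void gcWithoutUnassign(String region) {
        meta.remove(region);
    }

    /** Requested behaviour: close the region wherever it is open, then remove the meta row. */
    void gcWithUnassign(String region) {
        openOn.remove(region);
        meta.remove(region);
    }

    /** Regions that are open on some server but no longer present in meta. */
    Set<String> orphans() {
        Set<String> result = new HashSet<>(openOn.keySet());
        result.removeAll(meta);
        return result;
    }

    public static void main(String[] args) {
        GcRegionSketch first = new GcRegionSketch();
        first.meta.add("parent-a");
        first.openOn.put("parent-a", "rs1");
        first.gcWithoutUnassign("parent-a");
        System.out.println("without unassign: " + first.orphans());

        GcRegionSketch second = new GcRegionSketch();
        second.meta.add("parent-a");
        second.openOn.put("parent-a", "rs1");
        second.gcWithUnassign("parent-a");
        System.out.println("with unassign: " + second.orphans());
    }
}
```

In this model the first run reports parent-a as an orphan and the second
reports none, which is the difference I'd expect a fix to make.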
> GCRegionProcedure doesn't assign region from RegionServer leading to orphans
> ----------------------------------------------------------------------------
>
> Key: HBASE-24255
> URL: https://issues.apache.org/jira/browse/HBASE-24255
> Project: HBase
> Issue Type: Bug
> Components: proc-v2, Region Assignment, regionserver
> Affects Versions: 2.2.4
> Environment: hbase 2.2.4
> hadoop 3.1.3
> Reporter: Andrey Elenskiy
> Assignee: niuyulin
> Priority: Major
>
> We've found ourselves in a situation where parents of merged or split regions
> needed to be opened again on a regionserver due to having to recover from a
> cluster meltdown (HBCK2's fixMeta kicks off GCMultipleMergedRegionsProcedure,
> which requires all regions to be merged to be open). Then, when a
> GCProcedure is kicked off by GCMultipleMergedRegionsProcedure to clean up a
> parent region, it ends up deleting it from hbase:meta, but doesn't unassign
> it from the RegionServer, which causes it to show up in "Orphan Regions on
> RegionServer" in the hbck tab of the HBase Master. Also, the hbase client
> doesn't detect that the region is closed either, because it's still
> technically open on a regionserver (it doesn't reread hbase:meta all the
> time). The only way to recover from this is to restart the regionserver, which
> isn't ideal as it can lead to other issues in clusters with region
> inconsistencies.