[
https://issues.apache.org/jira/browse/HBASE-26884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17513293#comment-17513293
]
Zheng Wang commented on HBASE-26884:
------------------------------------
Seen this in 2.0 (CDH 6.0.1, about a year ago) and recently in 2.2.0; not sure
whether it could happen without misoperation by the user. [~anoop.hbase]
> Find unavailable regions by the startcode checking on hmaster start up and
> reassign them
> ----------------------------------------------------------------------------------------
>
> Key: HBASE-26884
> URL: https://issues.apache.org/jira/browse/HBASE-26884
> Project: HBase
> Issue Type: Improvement
> Components: master
> Reporter: Zheng Wang
> Assignee: Zheng Wang
> Priority: Major
>
> Sometimes we have seen regions in open or opening state that are not
> deployed on any rs and have no procs for them, and after checking the
> meta table, we found that their startcodes had expired.
> It is not easy to reproduce, and may be caused by a corner-case bug or user
> misoperation.
> My approach is to add a check on hmaster start up: if the startcode of the
> regionLocation has expired, and there is neither a TRSP on the region nor an
> SCP on the regionserver, then we should reassign the region. That way we can
> resolve it easily just by restarting hmaster.
> HBCK2 may also be useful in some of these cases, but it is not easy for a
> common user to use, especially when the number of these regions is not small
> and they need to be recovered quickly.
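
The check described above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the actual patch: RegionLocation,
findRegionsToReassign and the plain maps and sets are hypothetical stand-ins
for the master's in-memory state, and "expired" is assumed to mean that the
host:port recorded in meta is either not live or is live with a different
startcode.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class StaleRegionCheck {

    /** Hypothetical stand-in for a region's location as recorded in meta. */
    static final class RegionLocation {
        final String regionName;
        final String hostPort;
        final long startCode;

        RegionLocation(String regionName, String hostPort, long startCode) {
            this.regionName = regionName;
            this.hostPort = hostPort;
            this.startCode = startCode;
        }

        /** host:port plus startcode, identifying one regionserver instance. */
        String serverName() {
            return hostPort + "," + startCode;
        }
    }

    /**
     * Returns the regions whose recorded startcode is expired and that have
     * neither a TRSP on the region nor an SCP on the recorded regionserver.
     * All four inputs are assumptions made for this sketch.
     */
    static List<String> findRegionsToReassign(
            List<RegionLocation> locations,
            Map<String, Long> liveStartCodes,   // host:port -> current startcode
            Set<String> regionsWithTrsp,        // region names with a pending TRSP
            Set<String> serversWithScp) {       // server names with a pending SCP
        List<String> toReassign = new ArrayList<>();
        for (RegionLocation loc : locations) {
            Long live = liveStartCodes.get(loc.hostPort);
            boolean expired = live == null || live.longValue() != loc.startCode;
            if (!expired) {
                continue; // still hosted by a live regionserver instance
            }
            if (regionsWithTrsp.contains(loc.regionName)) {
                continue; // a TRSP will already move this region
            }
            if (serversWithScp.contains(loc.serverName())) {
                continue; // an SCP will already handle this server's regions
            }
            toReassign.add(loc.regionName);
        }
        return toReassign;
    }
}
```

Both skip conditions matter: reassigning a region that already has a TRSP or
whose server already has an SCP could race with that procedure, so only
regions that nothing else will recover are returned.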
--
This message was sent by Atlassian Jira
(v8.20.1#820001)