[
https://issues.apache.org/jira/browse/HBASE-5719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247865#comment-13247865
]
Jonathan Hsieh commented on HBASE-5719:
---------------------------------------
More context:
We ran into a corrupted cluster that had encountered HBASE-4238 and had several
generations of "grandparent" regions lingering in HDFS. If you looked at a
region map, we had overlapping regions that looked like this:
[A-I], [A-E], [E-H], [A-C], [A-B], [B-C] ...
The HBASE-5128 version of hbck would see that all these regions fit inside of
A-I and then attempt to merge them all into one mega region. This is
technically correct, but it could result in merging all the regions in an
overlap group into one region that is significantly larger than all the others
(in the worst case, all regions of a table could get combined into one region).
HBASE-5128 includes some safeguards to prevent these "mega merges". To fix
these situations, we sidelined (close, offline, move to a different dir) the
largest grandparent regions, which overlapped with the most other regions.
This left us with many small groups of overlapping regions instead of a
single large set of overlapping regions. These smaller groups could be safely
repaired automatically via merges, and any data from the sidelined grandparent
regions could be restored via a bulk load later on.
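To illustrate why the merge turns into a "mega merge", here is a minimal
sketch in Python (not hbck's actual Java code). It assumes regions are
(start, end) pairs of single-letter keys with an exclusive end key, and it
ignores the empty start/end keys that real tables use. The names
overlap_groups and merged_range are hypothetical.

```python
def overlap_groups(regions):
    """Group (start, end) key ranges that overlap; end keys are exclusive."""
    groups = []
    group_end = None
    for start, end in sorted(regions):
        if group_end is not None and start < group_end:
            # Starts before the current group ends, so it joins that group.
            groups[-1].append((start, end))
            group_end = max(group_end, end)
        else:
            # No overlap with the current group, so start a new one.
            groups.append([(start, end)])
            group_end = end
    return groups


def merged_range(group):
    """Key range of the single region produced by merging a whole group."""
    return (min(start for start, _ in group), max(end for _, end in group))


regions = [("A", "I"), ("A", "E"), ("E", "H"),
           ("A", "C"), ("A", "B"), ("B", "C")]
groups = overlap_groups(regions)
print(len(groups), merged_range(groups[0]))
```

With the six regions listed above, everything lands in one overlap group, so
a plain merge would produce a single region covering A through I.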
So in the example above, the [A-I], [A-E], [E-H] grandparent regions would get
sidelined, leaving us with [A-C], [A-B], [B-C]. These smaller regions could
get safely merged automatically into a single [A-C]' region. We'd then bulk
load the [A-I], [A-E], and [E-H] regions back in afterwards to restore their
data.
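One way to picture the sidelining step is the greedy sketch below, again in
Python and not the logic of the attached patch. It assumes (start, end) pairs
with exclusive end keys, a hypothetical max_overlaps threshold, and an
arbitrary tie-break (the first region in list order wins). The function name
sideline_most_overlapped is made up for illustration.

```python
def overlaps(a, b):
    """True if two (start, end) key ranges intersect; end keys are exclusive."""
    return a[0] < b[1] and b[0] < a[1]


def sideline_most_overlapped(regions, max_overlaps):
    """Repeatedly sideline the region that overlaps the most other regions
    until no remaining region overlaps more than max_overlaps others."""
    kept = list(regions)
    sidelined = []
    while True:
        counts = [
            sum(overlaps(r, o) for j, o in enumerate(kept) if j != i)
            for i, r in enumerate(kept)
        ]
        if not counts or max(counts) <= max_overlaps:
            return kept, sidelined
        # Ties go to the first region in list order (an arbitrary choice).
        worst = counts.index(max(counts))
        sidelined.append(kept.pop(worst))


regions = [("A", "I"), ("A", "E"), ("E", "H"),
           ("A", "C"), ("A", "B"), ("B", "C")]
kept, sidelined = sideline_most_overlapped(regions, max_overlaps=2)
print("sidelined:", sidelined)
print("kept:", kept)
```

On just the six regions listed, this sidelines [A-I] and [A-E] and keeps
[A-C], [A-B], [B-C] for merging. It also keeps [E-H], because nothing left in
the truncated list overlaps it; in the real cluster [E-H] was a grandparent
with overlapping descendants (the "..." above), which is why it was sidelined
too.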
The goal of this patch is to automatically identify and sideline overlapping
grandparent regions.
> Enhance hbck to sideline overlapped mega regions
> ------------------------------------------------
>
> Key: HBASE-5719
> URL: https://issues.apache.org/jira/browse/HBASE-5719
> Project: HBase
> Issue Type: New Feature
> Components: hbck
> Affects Versions: 0.94.0, 0.96.0
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
> Fix For: 0.96.0
>
> Attachments: hbase-5719.patch
>
>
> If there are too many regions in one overlapped group (by default, more than
> 10), hbck currently doesn't merge them since it takes time.
> In this case, we can sideline some regions in the group and break the
> overlapping to fix the inconsistency. Later on, sidelined regions can be
> bulk loaded manually.