[
https://issues.apache.org/jira/browse/HBASE-24376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17159355#comment-17159355
]
Huaxiang Sun commented on HBASE-24376:
--------------------------------------
I think in hbase-1, the normalizer uses a force flag to do the merge, which
means that if the two regions are not next to each other, it will still merge
them.
Are you sure that the inconsistency is caused by the normalizer? I.e., the table
is consistent before the normalizer runs and inconsistent afterwards? If that
is the case, the issue is that the normalizer merges two non-adjacent regions,
which will cause overlaps.
There is one such issue in hbase-2, but I checked the code and hbase-1 seems OK.
You can go over the master log, dump the meta table, and check the
inconsistency report from hbck to see if that is the case.
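
As a rough aid for that check, the sketch below tests whether two regions taken from a meta dump share a boundary key. It is a minimal illustration and not HBase code: the class AdjacencySketch and its method are hypothetical, keys are plain byte arrays, and an empty key is assumed to mark the start or end of the table. The sample keys are shortened from the log in the quoted description.

```java
import java.util.Arrays;

// Hypothetical helper, not part of HBase: checks whether two key ranges touch.
final class AdjacencySketch {

    // True if region A [startA, endA) and region B [startB, endB) share a boundary.
    // An empty end key means "end of table", so it can never be a shared boundary.
    static boolean areAdjacent(byte[] startA, byte[] endA, byte[] startB, byte[] endB) {
        boolean aThenB = endA.length > 0 && Arrays.equals(endA, startB);
        boolean bThenA = endB.length > 0 && Arrays.equals(endB, startA);
        return aThenB || bThenA;
    }

    public static void main(String[] args) {
        // First region from the log (end key shortened for readability).
        byte[] start1 = {3, 1, 5, 1, 4, 2};
        byte[] end1 = {3, 1, 5, 1, 4, 2, 1, 2, 2};
        // Second region from the log.
        byte[] start2 = {3, 1, 5, 2, 1, 1};
        byte[] end2 = {3, 1, 5, 2, 1, 2};

        // The end key of the first region differs from the start key of the
        // second, so a merge of these two would skip over other regions.
        System.out.println(areAdjacent(start1, end1, start2, end2));
    }
}
```

If a pair reported in a MergeNormalizationPlan log line fails a check like this, the plan merged non-adjacent regions.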
> MergeNormalizer is merging non-adjacent regions and causing region
> overlaps/holes.
> ----------------------------------------------------------------------------------
>
> Key: HBASE-24376
> URL: https://issues.apache.org/jira/browse/HBASE-24376
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 2.3.0
> Reporter: Huaxiang Sun
> Assignee: Huaxiang Sun
> Priority: Critical
> Fix For: 3.0.0-alpha-1, 2.3.0
>
>
> Currently, we found that the normalizer was merging non-adjacent regions,
> which causes inconsistencies in the cluster.
> {code:java}
> 439055 2020-05-08 17:47:09,814 INFO
> org.apache.hadoop.hbase.master.normalizer.MergeNormalizationPlan: Executing
> merging normalization plan: MergeNormalizationPlan{firstRegion={ENCODED =>
> 47fe236a5e3649ded95cb64ad0c08492, NAME =>
> 'TABLE,\x03\x01\x05\x01\x04\x02,1554838974870.47fe236a5e3649ded95cb64ad0c08492.',
> STARTKEY => '\x03\x01\x05\x01\x04\x02', ENDKEY =>
> '\x03\x01\x05\x01\x04\x02\x01\x02\x02201904082200\x00\x00\x03Mac\x00\x00\x00\x00\x00\x00\x00\x00\x00iMac13,1\x00\x00\x00\x00\x00\x049.3-14E260\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x05'},
> secondRegion={ENCODED => 0c0f2aa67f4329d5c48ba0320f173d31, NAME =>
> 'TABLE,\x03\x01\x05\x02\x01\x01,1554830735526.0c0f2aa67f4329d5c48ba0320f173d31.',
> STARTKEY => '\x03\x01\x05\x02\x01\x01', ENDKEY =>
> '\x03\x01\x05\x02\x01\x02'}}
> 439056 2020-05-08 17:47:11,438 INFO org.apache.hadoop.hbase.ScheduledChore:
> CatalogJanitor-*****:16000 average execution time: 1676219193 ns.
> 439057 2020-05-08 17:47:11,730 INFO org.apache.hadoop.hbase.master.HMaster:
> Client=null/null merge regions [47fe236a5e3649ded95cb64ad0c08492],
> [0c0f2aa67f4329d5c48ba0320f173d31]
> {code}
>
> The root cause is that getMergeNormalizationPlan() uses a list of regionInfo
> which is ordered by regionName. regionName does not necessarily guarantee the
> order of STARTKEY (let's say 'aa1', 'aa1!': in order of regionName, it will
> be 'aa1!' followed by 'aa1'). This will result in the normalizer merging
> non-adjacent regions into one and creating overlaps. This is not an issue in
> branch-1 as the list is already ordered by RegionInfo.COMPARATOR in
> normalizer.
>
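
The 'aa1' / 'aa1!' example in the description can be reproduced with a small standalone sketch. It assumes a simplified region name of the form "table,startKey,id" compared byte by byte; the class RegionOrderSketch and its methods are hypothetical and do not use the HBase API. The orders differ because the ',' delimiter (0x2C) sorts after '!' (0x21).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not HBase code.
final class RegionOrderSketch {

    // Lexicographic comparison of unsigned bytes; a shorter prefix sorts first.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Simplified region name (assumption): "<table>,<startKey>,<id>".
    static String regionName(String table, String startKey) {
        return table + "," + startKey + ",1";
    }

    // Start keys listed in the order of their region names.
    static List<String> orderByRegionName(String table, List<String> startKeys) {
        List<String> out = new ArrayList<>(startKeys);
        out.sort((x, y) -> compareBytes(bytes(regionName(table, x)), bytes(regionName(table, y))));
        return out;
    }

    // Start keys listed in plain start-key order.
    static List<String> orderByStartKey(List<String> startKeys) {
        List<String> out = new ArrayList<>(startKeys);
        out.sort((x, y) -> compareBytes(bytes(x), bytes(y)));
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("aa1", "aa1!");
        System.out.println(orderByRegionName("T", keys)); // prints [aa1!, aa1]
        System.out.println(orderByStartKey(keys));        // prints [aa1, aa1!]
    }
}
```

Treating neighbours in the first list as adjacent is what leads to a merge across other regions; sorting by start key first avoids that.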