Stephen Yuan Jiang created HBASE-13576:
------------------------------------------
Summary: HBCK enhancement: Failure in checking one region should
not fail the entire HBCK operation.
Key: HBASE-13576
URL: https://issues.apache.org/jira/browse/HBASE-13576
Project: HBase
Issue Type: Bug
Components: hbck
Affects Versions: 2.0.0, 1.1.0, 1.2.0
Reporter: Stephen Yuan Jiang
Assignee: Stephen Yuan Jiang
HBaseFsck#checkRegionConsistency() checks region consistency and repairs the
corruption if requested. However, this function can throw exceptions in some
expected situations. For example, as part of region repair it calls
HBaseFsckRepair#waitUntilAssigned(); if a region stays in transition for more
than 120 seconds (the default value of the "hbase.hbck.assign.timeout"
configuration), an IOException is thrown.
The problem is that a single exception in checkRegionConsistency() kills the
entire hbck operation, because the exception propagates up to the caller.
The proposal is that if the region is not the META region (or a system table
region, if we prefer), we skip the region when
HBaseFsck#checkRegionConsistency() fails. We could print the skipped regions
in the summary section so that users know to either re-run hbck or
investigate a potential issue with those regions.
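
A minimal sketch of the proposed behavior follows. It does not use the real
HBaseFsck code: the class SkipFailedRegionsSketch, the RegionChecker interface,
and the checkAll() method are hypothetical names chosen for illustration, and
regions are represented as plain strings. It only shows the control flow: a
failure on an ordinary region is recorded and skipped, while a failure on the
META region still aborts the run.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of HBase.
final class SkipFailedRegionsSketch {

    /** Hypothetical stand-in for a per-region consistency check. */
    interface RegionChecker {
        void check(String regionName) throws IOException;
    }

    /**
     * Runs the checker on every region. An IOException on a non-META region
     * is caught and the region is added to the returned "skipped" list; an
     * IOException on the META region is rethrown and ends the whole run.
     */
    static List<String> checkAll(List<String> regions,
                                 String metaRegion,
                                 RegionChecker checker) throws IOException {
        List<String> skipped = new ArrayList<>();
        for (String region : regions) {
            try {
                checker.check(region);
            } catch (IOException e) {
                if (region.equals(metaRegion)) {
                    throw e; // META must be consistent; do not continue
                }
                skipped.add(region); // remember it for the summary section
            }
        }
        return skipped;
    }
}
```

The returned list is what the summary section would print, so that a user can
re-run hbck or look into each skipped region individually.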