Huaxiang Sun created HBASE-24370:
------------------------------------
Summary: Avoid aggressive MergeRegion and
GCMultipleMergedRegionsProcedure
Key: HBASE-24370
URL: https://issues.apache.org/jira/browse/HBASE-24370
Project: HBase
Issue Type: Bug
Components: master
Reporter: Huaxiang Sun
Assignee: Huaxiang Sun
In prepareMergeRegion
[https://github.com/apache/hbase/blob/a40a0322a73add68d9cb0579abacdd6a2e41e8fb/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MergeTableRegionsProcedure.java#L478],
the procedure checks whether one of the merge parent regions is itself a merged
child region that has not been GCed yet. If that region is ready for GC, it
kicks off a GCMultipleMergedRegionsProcedure and also starts the
MergeRegionProcedure. There is a race condition here. If MergeRegionProcedure
finishes first, it deletes the meta row for the merged child region. When
GCMultipleMergedRegionsProcedure then runs, the newly added check makes it
conclude that GC has already been done, so it won't schedule GCRegionProcedure
to clean up those merged parent regions. The end result is that these merged
parent regions are left as orphans on the filesystem.
[https://github.com/apache/hbase/blob/a40a0322a73add68d9cb0579abacdd6a2e41e8fb/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/GCMultipleMergedRegionsProcedure.java#L105]
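To make the ordering problem concrete, here is a minimal single-threaded sketch of the race described above. It is a toy model, not HBase code: the class MergeGcRaceSketch, its fields, and its methods are all hypothetical stand-ins, with a map playing the role of hbase:meta and a set playing the role of region directories on the filesystem.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model only; none of these names are HBase classes or methods.
class MergeGcRaceSketch {
    // Stand-in for hbase:meta: merged child -> parents that still await GC.
    final Map<String, Set<String>> metaMergeParents = new HashMap<>();
    // Stand-in for region directories on the filesystem.
    final Set<String> regionDirs = new HashSet<>();

    // Regions A and B have been merged into AB; A and B are not GCed yet.
    static MergeGcRaceSketch setup() {
        MergeGcRaceSketch s = new MergeGcRaceSketch();
        s.regionDirs.addAll(Arrays.asList("A", "B", "AB", "C"));
        s.metaMergeParents.put("AB", new HashSet<>(Arrays.asList("A", "B")));
        return s;
    }

    // Models the second merge finishing: the meta row of the merged child
    // that served as a merge parent is deleted.
    void finishMerge(String mergedChildUsedAsParent) {
        metaMergeParents.remove(mergedChildUsedAsParent);
    }

    // Models the GC step with the check: if meta no longer lists parents,
    // assume GC already happened and do nothing.
    boolean runGc(String mergedChild) {
        Set<String> parents = metaMergeParents.get(mergedChild);
        if (parents == null) {
            return false; // parent directories are never removed
        }
        regionDirs.removeAll(parents);
        metaMergeParents.remove(mergedChild);
        return true;
    }

    public static void main(String[] args) {
        MergeGcRaceSketch racy = setup();
        racy.finishMerge("AB");      // merge wins the race
        racy.runGc("AB");            // GC sees no meta row and skips cleanup
        System.out.println("dirs after merge-then-GC: " + racy.regionDirs);

        MergeGcRaceSketch ordered = setup();
        ordered.runGc("AB");         // GC runs first and cleans up A and B
        ordered.finishMerge("AB");
        System.out.println("dirs after GC-then-merge: " + ordered.regionDirs);
    }
}
```

In the merge-first ordering the directories for A and B remain in the set, which corresponds to the orphaned parent regions; in the GC-first ordering they are removed.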
The proposed solution is to be less aggressive: if prepareMergeRegion needs to
kick off GCMultipleMergedRegionsProcedure, it aborts MergeRegionProcedure, and
the user can retry MergeRegionProcedure later.
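The proposed behaviour can be sketched the same way. The following is only an illustration under assumed names (MergeGuardSketch, Outcome, prepareMerge and the rest are hypothetical, not the actual patch or any HBase API): when the prepare step finds that GC is still needed for a region, it schedules the GC and refuses the merge, so the two can no longer run concurrently.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model only; none of these names are HBase classes or methods.
class MergeGuardSketch {
    enum Outcome { PROCEED, ABORT_RETRY_LATER }

    // Merged children whose parents have not been GCed yet.
    private final Set<String> pendingGc = new HashSet<>();
    // Regions for which a GC has been requested but has not run yet.
    private final List<String> scheduledGc = new ArrayList<>();

    void markPendingGc(String mergedChild) {
        pendingGc.add(mergedChild);
    }

    // Prepare step: schedule GC where needed, and refuse to merge in that case.
    Outcome prepareMerge(String... regionsToMerge) {
        boolean gcNeeded = false;
        for (String region : regionsToMerge) {
            if (pendingGc.contains(region)) {
                if (!scheduledGc.contains(region)) {
                    scheduledGc.add(region);
                }
                gcNeeded = true;
            }
        }
        return gcNeeded ? Outcome.ABORT_RETRY_LATER : Outcome.PROCEED;
    }

    // Models the scheduled GC completing.
    void runScheduledGc() {
        pendingGc.removeAll(scheduledGc);
        scheduledGc.clear();
    }

    public static void main(String[] args) {
        MergeGuardSketch sketch = new MergeGuardSketch();
        sketch.markPendingGc("AB");
        System.out.println(sketch.prepareMerge("AB", "C")); // first attempt
        sketch.runScheduledGc();
        System.out.println(sketch.prepareMerge("AB", "C")); // retry after GC
    }
}
```

The trade-off is that the user has to issue the merge a second time, in exchange for never deleting the meta row while a GC of the same region is outstanding.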
--
This message was sent by Atlassian Jira
(v8.3.4#803005)