[ https://issues.apache.org/jira/browse/HBASE-30230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
terrytlu updated HBASE-30230:
-----------------------------
Description:
Summary
-------
When `truncate_preserve` runs on a table that has overlapping regions
(regions sharing the same startKey), the procedure gets stuck indefinitely.
During truncate, the old regions are removed and their replacements are all
created at the same timestamp, so regions that share a startKey end up with
identical encodedNames (a hash of tableName + startKey + regionId). The
duplicate encodedNames cause race conditions in subsequent procedure steps.
Environment
-----------
- HBase version: 2.4.5
- Reproduction confirmed on internal test cluster
Symptoms
--------
{noformat}
# Multiple truncate procedures stuck in `TRUNCATE_TABLE_CREATE_FS_LAYOUT` state
# Some truncate procedures stuck in `REGION_STATE_TRANSITION_CONFIRM_OPENED`
state
# Tables show RIT (Regions In Transition) after truncate_preserve
# The HMaster log shows a large number of FileNotFoundException errors for
.regioninfo files.
{noformat}
Root Cause
----------
When a table has pre-existing region overlap (multiple regions with the same
startKey), `truncate_preserve` deletes the old regions and creates all of the
replacement regions at once. Since regionId is based on the creation timestamp,
overlapping regions created at the same instant get the same regionId and
therefore identical encodedNames (hash of tableName + startKey + regionId).
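The collision can be illustrated with a small Python sketch. It is a simplified model, assuming the encodedName is an MD5 hex digest over `tableName,startKey,regionId` as described above; the exact byte layout used by HBase may differ, and the table name, keys, and timestamp are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyRegion:
    table: str
    start_key: str
    end_key: str      # not part of the hashed name
    region_id: int    # creation timestamp in milliseconds


def encoded_name(region: ToyRegion) -> str:
    """MD5 hex digest of 'table,startKey,regionId' (simplified model)."""
    prefix = f"{region.table},{region.start_key},{region.region_id}"
    return hashlib.md5(prefix.encode("utf-8")).hexdigest()


# Two overlapping regions: same start key, different end keys,
# recreated at the same (hypothetical) timestamp.
ts = 1700000000000
r1 = ToyRegion("t1", "row100", "row200", ts)
r2 = ToyRegion("t1", "row100", "row300", ts)
r3 = ToyRegion("t1", "row100", "row300", ts + 1)

same = encoded_name(r1) == encoded_name(r2)       # collision
different = encoded_name(r1) != encoded_name(r3)  # a distinct regionId avoids it
```

Because the end key does not feed into the hash, it cannot disambiguate the two regions; only a different regionId does.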
This leads to two failure scenarios:
*Scenario 1 - Stuck at TRUNCATE_TABLE_CREATE_FS_LAYOUT:*
Concurrent threads attempt to create region directories for regions with the
same encodedName. The race condition causes:
{noformat}
truncate_preserve
→ TRUNCATE_TABLE_CREATE_FS_LAYOUT
→ Thread A: delete dir → create dir → write .regioninfo → init region
→ Thread B: delete dir (race!) → destroys Thread A's .regioninfo
→ Thread A: fails on createRegion() → retry forever (STUCK)
{noformat}
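The interleaving above can be replayed deterministically against a toy in-memory filesystem. This is only a sketch: `FakeFs`, the directory path, and the step order are illustrative assumptions, not HBase code.

```python
class FakeFs:
    """Minimal in-memory stand-in for a filesystem (hypothetical)."""

    def __init__(self):
        self.files = set()

    def delete_dir(self, directory: str) -> None:
        # Remove every file under the directory.
        self.files = {f for f in self.files if not f.startswith(directory + "/")}

    def write(self, path: str) -> None:
        self.files.add(path)

    def read(self, path: str) -> None:
        if path not in self.files:
            raise FileNotFoundError(path)


def replay_race() -> str:
    fs = FakeFs()
    # Both threads target the same directory because the encodedName is identical.
    region_dir = "/hbase/data/default/t1/abc123"  # hypothetical path
    region_info = region_dir + "/.regioninfo"

    fs.delete_dir(region_dir)   # Thread A: delete dir
    fs.write(region_info)       # Thread A: create dir, write .regioninfo
    fs.delete_dir(region_dir)   # Thread B: delete dir (race)
    try:
        fs.read(region_info)    # Thread A: init region needs .regioninfo
        return "initialized"
    except FileNotFoundError:
        return "FileNotFoundError"


outcome = replay_race()
```

In this replay Thread A's initialization fails because Thread B removed the file it had just written, which matches the FileNotFoundException errors listed under Symptoms.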
*Scenario 2 - Stuck at REGION_STATE_TRANSITION_CONFIRM_OPENED:*
If both threads succeed in region initialization (Thread B deletes after Thread
A completes init), the procedure advances to TRUNCATE_TABLE_ASSIGN_REGIONS. The
Master sends two assign requests for the same encodedName. The RegionServer
opens the region on the first request but ignores the second (treating it as a
duplicate). The second sub-procedure waits forever for an RS report that never
comes, leaving the region in OPENING state.
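A toy model of this second scenario, assuming the RegionServer simply drops an open request for an encodedName it already has online and sends no report for it. The class and function names are hypothetical and do not correspond to HBase classes.

```python
class ToyRegionServer:
    """Tracks online regions by encodedName (illustrative only)."""

    def __init__(self):
        self.online = set()

    def open_region(self, encoded: str) -> bool:
        """Return True if an 'opened' report is sent, False if the request is ignored."""
        if encoded in self.online:
            return False  # duplicate request: no report goes back to the Master
        self.online.add(encoded)
        return True


def assign_all(encoded_names):
    """Run one assign sub-procedure per region and return each one's final state."""
    rs = ToyRegionServer()
    states = {}
    for proc_id, name in enumerate(encoded_names):
        # Without a report the sub-procedure never leaves OPENING.
        states[proc_id] = "OPEN" if rs.open_region(name) else "OPENING"
    return states


result = assign_all(["abc123", "abc123"])
```

With two identical encodedNames, the first sub-procedure completes and the second stays in OPENING, which is the stuck state described above.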
Proposed Fix
------------
Region overlaps cannot be ruled out in production environments (they can result
from interrupted split operations). If a user triggers `truncate_preserve` on
such a table, the procedure gets stuck indefinitely. Recovery requires
manual intervention with HBCK2 to manipulate metadata, which depends heavily on
the operator's understanding of the internal region state and therefore carries
significant repair effort and risk.
I suggest adding a pre-check in `truncate_preserve` to detect region overlaps
in the target table. If overlaps are detected, reject the operation with a
clear error message instead of proceeding into a stuck state that is hard to
recover from.
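One possible shape for such a pre-check, sketched in Python rather than as an HBase patch. It assumes regions are given as (startKey, endKey) byte pairs with an empty endKey meaning the end of the table; `find_overlaps` and `check_truncate_allowed` are hypothetical names.

```python
from typing import List, Tuple

Region = Tuple[bytes, bytes]  # (start_key, end_key); b"" as end_key = end of table


def find_overlaps(regions: List[Region]) -> List[Tuple[Region, Region]]:
    """Return pairs of regions whose key ranges overlap."""
    ordered = sorted(regions, key=lambda r: r[0])
    overlaps = []
    prev = None  # region whose range reaches furthest so far
    for cur in ordered:
        if prev is not None:
            prev_open_ended = prev[1] == b""
            if prev_open_ended or cur[0] < prev[1]:
                overlaps.append((prev, cur))
        # Keep whichever region extends furthest to the right.
        if prev is None or (prev[1] != b"" and (cur[1] == b"" or cur[1] > prev[1])):
            prev = cur
    return overlaps


def check_truncate_allowed(regions: List[Region]) -> None:
    """Reject the operation up front instead of letting the procedure hang."""
    overlaps = find_overlaps(regions)
    if overlaps:
        raise ValueError(f"table has {len(overlaps)} overlapping region pair(s)")


clean = [(b"", b"b"), (b"b", b"d"), (b"d", b"")]
broken = [(b"", b"b"), (b"b", b"d"), (b"b", b"c"), (b"d", b"")]
clean_overlaps = find_overlaps(clean)
broken_overlaps = find_overlaps(broken)
```

Sorting by startKey and comparing each region against the furthest-reaching predecessor keeps the check to a single pass, and it also catches overlaps where the start keys differ.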
> truncate_preserve gets stuck when table has overlapping regions with same
> startKey
> ----------------------------------------------------------------------------------
>
> Key: HBASE-30230
> URL: https://issues.apache.org/jira/browse/HBASE-30230
> Project: HBase
> Issue Type: Bug
> Components: proc-v2
> Reporter: terrytlu
> Priority: Major
--
This message was sent by Atlassian Jira
(v8.20.10#820010)