[ 
https://issues.apache.org/jira/browse/HBASE-30230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

terrytlu updated HBASE-30230:
-----------------------------
    Description: 
Summary
-------

When `truncate_preserve` runs on a table that has overlapping regions 
(regions sharing the same startKey), the procedure gets stuck indefinitely. 
During truncate, the regions are deleted and recreated at the same timestamp, 
so regions with the same startKey receive identical encodedNames (derived from 
tableName + startKey + regionId). The duplicate encodedNames cause race 
conditions in the subsequent procedure steps.

 

Environment
-----------
 - HBase version: 2.4.5

 - Reproduction confirmed on internal test cluster

 

Symptoms
--------

{noformat}
 # Multiple truncate procedures stuck in `TRUNCATE_TABLE_CREATE_FS_LAYOUT` state
 # Some truncate procedures stuck in `REGION_STATE_TRANSITION_CONFIRM_OPENED` state
 # Tables show RIT (Regions In Transition) after truncate_preserve
 # The HMaster log shows a large number of FileNotFoundException errors for .regioninfo files.
{noformat}

 

Root Cause
----------

When a table has pre-existing region overlap (multiple regions with the same 
startKey), `truncate_preserve` deletes old regions and creates new ones 
simultaneously. Since regionId is based on the creation timestamp, overlapping 
regions created at the same instant produce identical encodedNames (hash of 
tableName + startKey + regionId).
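
The collision can be illustrated with a minimal sketch. It is a simplified model, not HBase code: the class and method names are hypothetical, and it assumes the encodedName is an MD5 hex digest over `tableName,startKey,regionId`. The endKey is not part of the input, so two overlapping regions that share a startKey and a creation timestamp hash to the same value.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical, simplified model of how an encodedName is derived.
final class EncodedNameSketch {

  // Assumption: the name is MD5("table,startKey,regionId") rendered as hex.
  static String encodedName(String table, String startKey, long regionId) {
    String regionName = table + "," + startKey + "," + regionId;
    try {
      MessageDigest md5 = MessageDigest.getInstance("MD5");
      byte[] digest = md5.digest(regionName.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : digest) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // MD5 is required to be present on every Java platform.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    long sameTimestamp = 1700000000000L; // both regions recreated at this instant
    // Two overlapping regions: same startKey, different endKeys (endKey is not hashed).
    String regionA = encodedName("t1", "row-100", sameTimestamp);
    String regionB = encodedName("t1", "row-100", sameTimestamp);
    System.out.println("collision: " + regionA.equals(regionB));
  }
}
```

In this model, a regionId that differs by even one millisecond yields a different encodedName, which is why the problem only appears when the overlapping regions are recreated at the same instant.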

 

This leads to two failure scenarios:

 

*Scenario 1 - Stuck at TRUNCATE_TABLE_CREATE_FS_LAYOUT:*

Concurrent threads attempt to create region directories for regions with the 
same encodedName. The race condition causes:
{noformat}
truncate_preserve
 → TRUNCATE_TABLE_CREATE_FS_LAYOUT
 → Thread A: delete dir → create dir → write .regioninfo → init region
 → Thread B: delete dir (race!) → destroys Thread A's .regioninfo
 → Thread A: fails on createRegion() → retry forever (STUCK)
{noformat}
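
The interleaving above can be replayed deterministically on a local filesystem. This is only a sketch under stated assumptions: it uses `java.nio.file` instead of HDFS, runs the two "threads" sequentially in the problematic order, and the class name, directory name, and file contents are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical replay of the Scenario 1 interleaving on a local temp directory.
final class RegionDirRaceSketch {

  // Delete a directory tree, children before parents; no-op if it is absent.
  static void deleteRecursively(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return;
    }
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(dir)) {
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : paths) {
      Files.delete(p);
    }
  }

  // Returns true if Thread A's .regioninfo is still present when A needs it.
  static boolean regionInfoSurvives() {
    try {
      Path tableDir = Files.createTempDirectory("truncate-sketch");
      // Both regions map to the same directory because their encodedNames are equal.
      Path regionDir = tableDir.resolve("sameEncodedName");
      Path regionInfo = regionDir.resolve(".regioninfo");

      // Thread A: delete dir -> create dir -> write .regioninfo
      deleteRecursively(regionDir);
      Files.createDirectories(regionDir);
      Files.write(regionInfo, "region A".getBytes(StandardCharsets.UTF_8));

      // Thread B: delete dir, which removes Thread A's .regioninfo
      deleteRecursively(regionDir);

      // Thread A: region init would now look for .regioninfo
      boolean survived = Files.exists(regionInfo);
      deleteRecursively(tableDir); // clean up the temp directory
      return survived;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(".regioninfo survives: " + regionInfoSurvives());
  }
}
```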
 

*Scenario 2 - Stuck at REGION_STATE_TRANSITION_CONFIRM_OPENED:*

If both threads succeed in region initialization (Thread B deletes after Thread 
A completes init), the procedure advances to TRUNCATE_TABLE_ASSIGN_REGIONS. The 
Master sends two assign requests for the same encodedName. The RegionServer 
opens the region on the first request but ignores the second, treating it as a 
duplicate. The second sub-procedure waits forever for an RS report that never 
comes, leaving the region in OPENING state.

 

Proposed Fix
------------

Region overlaps cannot be fully avoided in production environments (they can 
result from interrupted split operations). If a user triggers truncate_preserve 
on such a table, the procedure gets stuck indefinitely. Recovery requires 
manual intervention with HBCK2 to manipulate metadata. That work depends 
heavily on the operator's understanding of the internal region state, so the 
repair is both costly and risky.

I suggest adding a pre-check in `truncate_preserve` to detect region overlaps 
in the target table. If overlaps are detected, reject the operation with a 
clear error message instead of proceeding into an unrecoverable stuck state.
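
A minimal sketch of such a pre-check follows. It is an illustration, not a patch: the `Region` class, the method names, and the error message are hypothetical, keys are modeled as Java strings compared with `compareTo` (HBase compares raw bytes), and an empty endKey is taken to mean "end of table".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical overlap pre-check over a table's region boundaries.
final class OverlapPreCheckSketch {

  // Simplified region: [startKey, endKey), empty endKey means "end of table".
  static final class Region {
    final String startKey;
    final String endKey;

    Region(String startKey, String endKey) {
      this.startKey = startKey;
      this.endKey = endKey;
    }
  }

  // Sort by startKey, then flag any region that starts before the furthest
  // endKey seen so far. Regions with the same startKey are always flagged.
  static boolean hasOverlap(List<Region> regions) {
    List<Region> sorted = new ArrayList<>(regions);
    sorted.sort(Comparator.comparing((Region r) -> r.startKey));
    String maxEnd = null; // null until the first region has been seen
    for (Region r : sorted) {
      if (maxEnd != null && (maxEnd.isEmpty() || r.startKey.compareTo(maxEnd) < 0)) {
        return true;
      }
      boolean extends_ = maxEnd == null
          || (!maxEnd.isEmpty() && (r.endKey.isEmpty() || r.endKey.compareTo(maxEnd) > 0));
      if (extends_) {
        maxEnd = r.endKey;
      }
    }
    return false;
  }

  // Reject the operation up front instead of letting the procedure get stuck.
  static void checkNoOverlap(String table, List<Region> regions) {
    if (hasOverlap(regions)) {
      throw new IllegalStateException(
          "Table " + table + " has overlapping regions; fix the overlaps before truncate_preserve");
    }
  }

  public static void main(String[] args) {
    List<Region> overlapping = Arrays.asList(
        new Region("", "b"), new Region("", "c"), new Region("c", ""));
    System.out.println("overlap detected: " + hasOverlap(overlapping));
  }
}
```

Failing fast here trades a rejected command for a procedure that never needs HBCK2 repair; the operator can resolve the overlap first and then retry the truncate.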


> truncate_preserve gets stuck when table has overlapping regions with same 
> startKey
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-30230
>                 URL: https://issues.apache.org/jira/browse/HBASE-30230
>             Project: HBase
>          Issue Type: Bug
>          Components: proc-v2
>            Reporter: terrytlu
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
