[ 
https://issues.apache.org/jira/browse/HBASE-28064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated HBASE-28064:
----------------------------------
    Description: 
One of our users has brought up a use-case where they need to truncate a region 
to delete data within a specific range. There are two scenarios to consider:

* In the first scenario, the region boundaries correspond to a time range 
defined through pre-splitting, and the user wants to efficiently clean up old 
data. If HBase can truncate the region directly on the file system, the user 
can then merge the empty region with adjacent regions to eliminate it, which 
is more efficient than deleting the data through the Delete API.

* In the second scenario, the HFile for that region has become corrupted for 
some reason, and the user wants to get rid of the HFile and reload the entire 
region to avoid consistency issues and ensure performance.

We can do this by taking the region offline and taking a write lock, which 
avoids having to consider race conditions involving Regions In Transition 
(RITs), region re-opening, and merge/split scenarios. 


  was:
One of our users has brought up a use-case where they need to truncate a region 
to delete data within a specific range. There are two scenarios to consider:

* In the first scenario, the region boundaries correspond to a time range 
defined through pre-splitting, and the user wants to efficiently clean up old 
data. If HBase can truncate the region directly on the file system, the user 
can then merge the empty region with adjacent regions to eliminate it, which 
is more efficient than deleting the data through the Delete API.

* In the second scenario, the HFile for that region has become corrupted for 
some reason, and the user wants to get rid of the HFile and reload the entire 
region to avoid consistency issues and ensure performance.

We can do this when the table is offline/disabled, which avoids having to 
consider race conditions involving Regions In Transition (RITs), region 
re-opening, and merge/split scenarios, since taking the region offline is 
necessary regardless.



> Implement truncate_region command to truncate region directly from FS
> ---------------------------------------------------------------------
>
>                 Key: HBASE-28064
>                 URL: https://issues.apache.org/jira/browse/HBASE-28064
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Ankit Singhal
>            Priority: Major
>
> One of our users has brought up a use-case where they need to truncate a 
> region to delete data within a specific range. There are two scenarios to 
> consider:
> * In the first scenario, the region boundaries correspond to a time range 
> defined through pre-splitting, and the user wants to efficiently clean up old 
> data. If HBase can truncate the region directly on the file system, the user 
> can then merge the empty region with adjacent regions to eliminate it, which 
> is more efficient than deleting the data through the Delete API.
> * In the second scenario, the HFile for that region has become corrupted for 
> some reason, and the user wants to get rid of the HFile and reload the entire 
> region to avoid consistency issues and ensure performance.
> We can do this by taking the region offline and taking a write lock, which 
> avoids having to consider race conditions involving Regions In Transition 
> (RITs), region re-opening, and merge/split scenarios. 
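
The sequence proposed in the description (take the region offline, hold an 
exclusive lock, drop the files, bring the region back) can be illustrated with 
a toy model. This is only a sketch under stated assumptions: ToyRegion and 
truncate_region below are hypothetical names invented for illustration, the 
in-memory list stands in for the region's HFiles on the file system, and a 
plain mutex stands in for the write lock. None of it is HBase code or the 
HBase API.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class ToyRegion:
    """Hypothetical stand-in for a region; not an HBase class."""
    name: str
    store_files: list = field(default_factory=list)  # stands in for HFiles on the FS
    online: bool = True
    lock: threading.Lock = field(default_factory=threading.Lock)


def truncate_region(region: ToyRegion) -> int:
    """Drop every store file of the region and return how many were removed.

    The exclusive lock models the write lock from the description: while it
    is held, no other caller of this function can touch the same region.
    """
    with region.lock:
        region.online = False  # take the region offline first
        try:
            removed = len(region.store_files)
            region.store_files.clear()  # the "truncate from the FS" step
        finally:
            region.online = True  # re-open the now-empty region
        return removed


# Usage: a region that holds two (made-up) store files.
region = ToyRegion("toy-region-1", ["hfile-a", "hfile-b"])
print(truncate_region(region), region.store_files, region.online)
```

In this model the region comes back online empty, which matches the first 
scenario, where the user then merges the empty region with its neighbours.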



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
