[jira] [Updated] (HBASE-29310) Handle Bulk Load Operations in Continuous Backup and PITR Workflow

asolomon (Jira) Mon, 04 Aug 2025 23:29:29 -0700


     [ https://issues.apache.org/jira/browse/HBASE-29310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


asolomon updated HBASE-29310:
-----------------------------
    Description: 
During Point-In-Time Recovery (PITR), efficiently restoring bulk-loaded files 
can be challenging if the restore is implemented via WALPlayer.

To address this, we propose the following guidelines for handling bulk loads in 
the context of continuous backup and PITR:
 * When a user performs a bulk load on any table under continuous backup and 
PITR, they *must take a full or incremental backup* afterward. An incremental 
backup is generally sufficient and faster.

*Required Changes*
 # *Documentation Update*:
 ## Add a note in the HBase Backup and Restore documentation explaining this 
rule and its importance.
 # *Logging Suggestion*:
 ## After a bulk load operation completes, log a message suggesting that the 
user perform a full or incremental backup.
 # *PITR Enhancements*:
 ## During PITR, check whether any bulk load operation occurred after the last 
successful backup.
 ## If so, no backup covers that bulk load: inform the user and fail the 
process.
 ## -If the user chooses to proceed (e.g., using a {{force}} flag), continue 
with a warning that the bulk-loaded files will not be part of the restored 
table.-

  was:
When bulk load operations are performed, the resulting files are backed up to 
the backup location via the continuous backup process.

However, during Point-In-Time Recovery (PITR), restoring these bulk-loaded 
files efficiently can be challenging.

To address this, we propose the following guidelines for handling bulk loads in 
the context of continuous backup and PITR:
 * When a user performs a bulk load on any table under continuous backup and 
PITR, they *must take a full or incremental backup* afterward. An incremental 
backup is generally sufficient and faster.

*Required Changes*
 # *Documentation Update*:
 ## Add a note in the HBase Backup and Restore documentation explaining this 
rule and its importance.
 # *Logging Suggestion*:
 ## After a bulk load operation completes, log a message suggesting the user 
perform a full or incremental backup.
 # *PITR Enhancements*:
 ## During PITR, check if any bulk load operation occurred after the last 
successful backup.
 ## If no such backup exists, inform the user and fail the process.
 ## -If the user chooses to proceed (e.g., using a {{force}} flag), continue 
with a warning that the bulk-loaded files will not be part of the restored 
table.-


> Handle Bulk Load Operations in Continuous Backup and PITR Workflow
> ------------------------------------------------------------------
>
>                 Key: HBASE-29310
>                 URL: https://issues.apache.org/jira/browse/HBASE-29310
>             Project: HBase
>          Issue Type: Task
>          Components: backup&restore
>    Affects Versions: HBASE-28957
>            Reporter: Vinayak Hegde
>            Assignee: asolomon
>            Priority: Major
>              Labels: HBASE-28957, pull-request-available
>             Fix For: HBASE-28957
>
>
> During Point-In-Time Recovery (PITR), efficiently restoring bulk-loaded files 
> can be challenging if the restore is implemented via WALPlayer.
> To address this, we propose the following guidelines for handling bulk loads 
> in the context of continuous backup and PITR:
>  * When a user performs a bulk load on any table under continuous backup and 
> PITR, they *must take a full or incremental backup* afterward. An incremental 
> backup is generally sufficient and faster.
> *Required Changes*
>  # *Documentation Update*:
>  ## Add a note in the HBase Backup and Restore documentation explaining this 
> rule and its importance.
>  # *Logging Suggestion*:
>  ## After a bulk load operation completes, log a message suggesting that the 
> user perform a full or incremental backup.
>  # *PITR Enhancements*:
>  ## During PITR, check whether any bulk load operation occurred after the last 
> successful backup.
>  ## If so, no backup covers that bulk load: inform the user and fail the 
> process.
>  ## -If the user chooses to proceed (e.g., using a {{force}} flag), continue 
> with a warning that the bulk-loaded files will not be part of the restored 
> table.-
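
The logging suggestion and the PITR check described in the issue could be 
sketched roughly as follows. This is only an illustration under stated 
assumptions, not the actual patch: PitrBulkLoadCheck, BulkLoadEvent, 
uncoveredBulkLoads, validatePitr and backupReminder are hypothetical names, not 
HBase APIs, and bulk loads and backups are assumed to be identified by 
millisecond timestamps. Limiting the check to bulk loads at or before the 
restore target time is also an assumption; the issue only mentions bulk loads 
after the last successful backup.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the proposed PITR bulk-load check; not HBase code. */
class PitrBulkLoadCheck {

    /** Hypothetical record of a completed bulk load; not an HBase class. */
    static final class BulkLoadEvent {
        final String table;
        final long timestampMs;

        BulkLoadEvent(String table, long timestampMs) {
            this.table = table;
            this.timestampMs = timestampMs;
        }
    }

    /**
     * Returns the bulk loads that happened after the last successful backup
     * and at or before the restore target time (the upper bound is an
     * assumption of this sketch).
     */
    static List<BulkLoadEvent> uncoveredBulkLoads(
            List<BulkLoadEvent> events, long lastBackupTsMs, long restoreTargetTsMs) {
        List<BulkLoadEvent> uncovered = new ArrayList<>();
        for (BulkLoadEvent e : events) {
            if (e.timestampMs > lastBackupTsMs && e.timestampMs <= restoreTargetTsMs) {
                uncovered.add(e);
            }
        }
        return uncovered;
    }

    /** Fails the PITR attempt when a bulk load is not covered by any backup. */
    static void validatePitr(
            List<BulkLoadEvent> events, long lastBackupTsMs, long restoreTargetTsMs) {
        List<BulkLoadEvent> uncovered =
            uncoveredBulkLoads(events, lastBackupTsMs, restoreTargetTsMs);
        if (!uncovered.isEmpty()) {
            throw new IllegalStateException(uncovered.size()
                + " bulk load(s) occurred after the last successful backup;"
                + " PITR cannot proceed because no backup covers them.");
        }
    }

    /** Builds the reminder to log once a bulk load completes. */
    static String backupReminder(String table) {
        return "Bulk load completed on table " + table
            + "; take a full or incremental backup so that PITR can restore the loaded files.";
    }
}
```

In this sketch, validatePitr would run before any restore work starts, so a 
missing backup surfaces as an early failure rather than as a restored table 
that silently lacks the bulk-loaded files.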



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
