[ https://issues.apache.org/jira/browse/HBASE-29310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinayak Hegde updated HBASE-29310:
----------------------------------
    Description: 
When bulk load operations are performed, the resulting files are backed up to the backup location via the continuous backup process.

However, during Point-In-Time Recovery (PITR), restoring these bulk-loaded files efficiently can be challenging.

To address this, we propose the following guidelines for handling bulk loads in the context of continuous backup and PITR:
 * When a user performs a bulk load on any table under continuous backup and PITR, they *must take a full or incremental backup* afterward. An incremental backup is generally sufficient and faster.

*Required Changes*
 # {*}Documentation Update{*}:
Add a note in the HBase Backup and Restore documentation explaining this rule and its importance.
 # {*}Logging Suggestion{*}:
After a bulk load operation completes, log a message suggesting the user perform a full or incremental backup.
 # {*}PITR Enhancements{*}:
 * 
 ** During PITR, check if any bulk load operation occurred after the last successful backup.
 * 
 ** If no such backup exists, inform the user and fail the process.
 * 
 ** If the user chooses to proceed (e.g., using a {{--force}} flag), continue with a warning that the bulk-loaded files will not be part of the restored table.

  was:
When bulk load operations are performed, the resulting files are backed up to the backup location via the continuous backup process.

However, during Point-In-Time Recovery (PITR), restoring these bulk-loaded files efficiently can be challenging.

To address this, we propose the following guidelines for handling bulk loads in the context of continuous backup and PITR:
 * When a user performs a bulk load on any table under continuous backup and PITR, they *must take a full or incremental backup* afterward. An incremental backup is generally sufficient and faster.

*Required Changes*
 # {*}Documentation Update{*}:
Add a note in the HBase Backup and Restore documentation explaining this rule and its importance.
 # {*}Logging Suggestion{*}:
After a bulk load operation completes, log a message suggesting the user perform a full or incremental backup.
 # {*}PITR Enhancements{*}:
 ** During PITR, check if any bulk load operation occurred after the last successful backup.
 ** If no such backup exists, inform the user and fail the process.
 ** If the user chooses to proceed (e.g., using a {{--force}} flag), continue with a warning that the bulk-loaded files will not be part of the restored table.


> Handle Bulk Load Operations in Continuous Backup and PITR Workflow
> ------------------------------------------------------------------
>
>                 Key: HBASE-29310
>                 URL: https://issues.apache.org/jira/browse/HBASE-29310
>             Project: HBase
>          Issue Type: Task
>          Components: backup&restore
>            Reporter: Vinayak Hegde
>            Priority: Major
>
> When bulk load operations are performed, the resulting files are backed up to
> the backup location via the continuous backup process.
> However, during Point-In-Time Recovery (PITR), restoring these bulk-loaded
> files efficiently can be challenging.
> To address this, we propose the following guidelines for handling bulk loads
> in the context of continuous backup and PITR:
> * When a user performs a bulk load on any table under continuous backup and
> PITR, they *must take a full or incremental backup* afterward. An incremental
> backup is generally sufficient and faster.
> *Required Changes*
> # {*}Documentation Update{*}:
> Add a note in the HBase Backup and Restore documentation explaining this rule
> and its importance.
> # {*}Logging Suggestion{*}:
> After a bulk load operation completes, log a message suggesting the user
> perform a full or incremental backup.
> # {*}PITR Enhancements{*}:
> *
> ** During PITR, check if any bulk load operation occurred after the last
> successful backup.
> *
> ** If no such backup exists, inform the user and fail the process.
> *
> ** If the user chooses to proceed (e.g., using a {{--force}} flag), continue
> with a warning that the bulk-loaded files will not be part of the restored
> table.
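
The PITR pre-check proposed in the issue could be sketched roughly as follows. This is only an illustration of the decision logic, not HBase code: the class `PitrBulkLoadCheck`, the `Decision` enum, the method `check`, and all of its parameters are hypothetical names, and limiting the check to bulk loads at or before the restore target time is an added assumption that the issue does not state.

```java
import java.util.List;

// Hypothetical sketch; none of these names exist in the HBase codebase.
public class PitrBulkLoadCheck {

    /** Possible outcomes of the pre-restore check. */
    public enum Decision { PROCEED, PROCEED_WITH_WARNING, FAIL }

    /**
     * Decides whether a PITR restore may run.
     *
     * @param lastBackupTs  completion time (ms) of the last successful
     *                      full or incremental backup
     * @param bulkLoadTimes timestamps (ms) of bulk loads recorded for the table
     * @param restoreToTs   PITR target time (ms); bounding by it is an assumption
     * @param force         true if the user supplied a --force style override
     */
    public static Decision check(long lastBackupTs, List<Long> bulkLoadTimes,
                                 long restoreToTs, boolean force) {
        // A bulk load is "uncovered" if it happened after the last backup
        // (and, by assumption, no later than the restore target).
        boolean uncovered = bulkLoadTimes.stream()
            .anyMatch(ts -> ts > lastBackupTs && ts <= restoreToTs);

        if (!uncovered) {
            return Decision.PROCEED;
        }
        // With the override, continue but warn that the bulk-loaded files
        // will be missing from the restored table; otherwise fail.
        return force ? Decision.PROCEED_WITH_WARNING : Decision.FAIL;
    }

    public static void main(String[] args) {
        // Bulk load at t=150 happened after the backup at t=100: fail without force.
        System.out.println(check(100L, List.of(50L, 150L), 200L, false));
        // Same situation with the override: proceed, with a warning.
        System.out.println(check(100L, List.of(50L, 150L), 200L, true));
    }
}
```

Returning a three-valued decision rather than throwing keeps the warning path explicit, so a caller can log that bulk-loaded files will be absent before continuing.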