[ 
https://issues.apache.org/jira/browse/HBASE-29310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17951807#comment-17951807
 ] 

Vinayak Hegde commented on HBASE-29310:
---------------------------------------

Regarding the PITR enhancements, before running the PITR command for a given 
point in time, we need to check whether any bulk load operations occurred 
between the last backup (chosen as the restore point) and the specified point 
in time.

To do this, we first determine the timestamp of the last backup and then verify 
whether any bulk load happened between that timestamp and the PITR target time.
If a bulk load is found within that range, we should throw an error and 
instruct the user to take a backup before proceeding. Otherwise, we can safely 
continue with the PITR operation.
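
As a rough illustration of the check described above, here is a minimal Java sketch. The class and method names are hypothetical, the bulk load timestamps are passed in as a plain list rather than read from HBase, and treating the window as exclusive of the backup time and inclusive of the PITR time is an assumption on my part.

```java
import java.util.List;

// Hypothetical helper, not an existing HBase class.
final class PitrBulkLoadCheck {

    /**
     * Fails if any bulk load timestamp falls after the last backup and at or
     * before the PITR target time. The boundary handling is an assumption.
     */
    static void ensureNoBulkLoad(long lastBackupTs, long pitrTargetTs,
                                 List<Long> bulkLoadTimestamps) {
        if (pitrTargetTs < lastBackupTs) {
            throw new IllegalArgumentException(
                "PITR target time precedes the chosen backup");
        }
        for (long ts : bulkLoadTimestamps) {
            if (ts > lastBackupTs && ts <= pitrTargetTs) {
                throw new IllegalStateException(
                    "Bulk load at " + ts + " happened after the backup taken at "
                    + lastBackupTs + "; take a backup before running PITR");
            }
        }
        // No bulk load in the window: PITR can proceed.
    }
}
```

How the list of bulk load timestamps is obtained is exactly what the two options below differ on.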

Currently, we are considering two possible solutions for this check:

*1. Scan WAL files for bulk load entries:*
We can read the WAL files between the last backup time and the PITR time and 
look for bulk load entries. This approach is straightforward since WALs will 
have entries corresponding to bulk loads.
However, it may be expensive if the time range spans several days, as it 
requires scanning all relevant WAL files.
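
To make the cost concrete, here is a minimal sketch of option 1. {{WalFile}} and {{WalEntry}} are hypothetical, simplified stand-ins that I made up for illustration; they are not the HBase WAL reader API. The per-file time range used to skip files is also an assumption.

```java
import java.util.List;

// Hypothetical model of WAL scanning; none of these types exist in HBase.
final class WalBulkLoadScan {

    /** Simplified WAL entry: its write time and whether it is a bulk load marker. */
    static final class WalEntry {
        final long writeTime;
        final boolean bulkLoadMarker;

        WalEntry(long writeTime, boolean bulkLoadMarker) {
            this.writeTime = writeTime;
            this.bulkLoadMarker = bulkLoadMarker;
        }
    }

    /** Simplified WAL file: the time range it covers plus its entries. */
    static final class WalFile {
        final long minTs;
        final long maxTs;
        final List<WalEntry> entries;

        WalFile(long minTs, long maxTs, List<WalEntry> entries) {
            this.minTs = minTs;
            this.maxTs = maxTs;
            this.entries = entries;
        }
    }

    /** Returns true if a bulk load marker falls in (lastBackupTs, pitrTargetTs]. */
    static boolean hasBulkLoad(List<WalFile> walFiles, long lastBackupTs, long pitrTargetTs) {
        for (WalFile file : walFiles) {
            // Skip files that cannot overlap the window.
            if (file.maxTs <= lastBackupTs || file.minTs > pitrTargetTs) {
                continue;
            }
            // Every entry of an overlapping file has to be read.
            for (WalEntry entry : file.entries) {
                if (entry.bulkLoadMarker
                        && entry.writeTime > lastBackupTs
                        && entry.writeTime <= pitrTargetTs) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Even with the file-level skip, the work grows with the number of WAL entries in the window, which is the concern for ranges spanning several days.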

*2. Store bulk load metadata in a system table:*
We can persist bulk load metadata (e.g., {{<filename, timestamp>}}) in a 
system table during the bulk load process. Later, we can simply query this 
table to check if any bulk loads occurred in the given time range.
This approach is faster and more efficient but requires a few considerations:
 * After a bulk load, if the corresponding table has continuous backup enabled, 
we must store the metadata in the backup system table.

 * We should ideally clean up these entries once the related WALs are deleted. 
That said, not cleaning them up wouldn’t cause major issues apart from some 
stale data in the system table, which typically doesn't take up much space.
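
A minimal sketch of option 2, using an in-memory sorted map as a stand-in for the system table. The class and its methods are hypothetical and do not reflect the actual backup system table schema; keying rows by timestamp is an assumption that makes both the range check and the cleanup simple.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the bulk load rows of a system table.
final class BulkLoadMetadataStore {

    // timestamp -> names of the files bulk loaded at that timestamp
    private final TreeMap<Long, List<String>> rows = new TreeMap<>();

    /** Called during a bulk load on a table with continuous backup enabled. */
    void record(String filename, long timestamp) {
        rows.computeIfAbsent(timestamp, k -> new ArrayList<>()).add(filename);
    }

    /** Returns true if any bulk load falls in (lastBackupTs, pitrTargetTs]. */
    boolean hasBulkLoadBetween(long lastBackupTs, long pitrTargetTs) {
        if (pitrTargetTs <= lastBackupTs) {
            return false;
        }
        return !rows.subMap(lastBackupTs, false, pitrTargetTs, true).isEmpty();
    }

    /** Optional cleanup: drops entries older than the oldest WAL still retained. */
    void cleanUpBefore(long oldestRetainedWalTs) {
        rows.headMap(oldestRetainedWalTs, false).clear();
    }
}
```

With a layout like this, the PITR check becomes a bounded range lookup instead of a scan over every WAL in the window.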

Let me know your thoughts on these approaches or if there’s a better 
alternative we should consider.

> Handle Bulk Load Operations in Continuous Backup and PITR Workflow
> ------------------------------------------------------------------
>
>                 Key: HBASE-29310
>                 URL: https://issues.apache.org/jira/browse/HBASE-29310
>             Project: HBase
>          Issue Type: Task
>          Components: backup&restore
>            Reporter: Vinayak Hegde
>            Priority: Major
>
> When bulk load operations are performed, the resulting files are backed up to 
> the backup location via the continuous backup process.
> However, during Point-In-Time Recovery (PITR), restoring these bulk-loaded 
> files efficiently can be challenging.
> To address this, we propose the following guidelines for handling bulk loads 
> in the context of continuous backup and PITR:
>  * When a user performs a bulk load on any table under continuous backup and 
> PITR, they *must take a full or incremental backup* afterward. An incremental 
> backup is generally sufficient and faster.
> *Required Changes*
>  # *Documentation Update*:
>  ## Add a note in the HBase Backup and Restore documentation explaining this 
> rule and its importance.
>  # *Logging Suggestion*:
>  ## After a bulk load operation completes, log a message suggesting the user 
> perform a full or incremental backup.
>  # *PITR Enhancements*:
>  * During PITR, check if any bulk load operation occurred after the last 
> successful backup.
>  * If a bulk load occurred and no later backup exists, inform the user and fail the process.
>  * If the user chooses to proceed (e.g., using a {{--force}} flag), continue 
> with a warning that the bulk-loaded files will not be part of the restored 
> table.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
