[
https://issues.apache.org/jira/browse/HBASE-28706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17875843#comment-17875843
]
Ray Mattingly commented on HBASE-28706:
---------------------------------------
How do we feel about removing this as a blocker for 2.6.1? I think we have made
substantial progress on stabilizing this experimental feature since 2.6.0,
particularly for single backup roots.
I can also take a look at HBASE-28705. Apologies for the delay; I'm just
getting back from a vacation.
> Tracking of bulk-loads for backup does not work for multi-root backups
> ----------------------------------------------------------------------
>
> Key: HBASE-28706
> URL: https://issues.apache.org/jira/browse/HBASE-28706
> Project: HBase
> Issue Type: Bug
> Components: backup&restore
> Affects Versions: 2.6.0, 3.0.0, 4.0.0-alpha-1
> Reporter: Dieter De Paepe
> Priority: Blocker
>
> Haven't been able to test this yet, but I highly suspect that
> IncrementalTableBackupClient#handleBulkLoad will delete the records of the
> files that were bulk loaded, even if those records are still needed for
> backups in other backup roots.
> I base this on the observation that the WALs to keep around, and backup
> metadata in general, are all tracked per individual backup root, but the
> tracking of bulk loads is not.
> The result would be data loss (i.e. loss of the bulk-loaded data) when taking
> backups across different backup roots.
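> To make the suspected failure mode concrete, here is a minimal,
> self-contained sketch (plain Java, deliberately not the actual HBase
> classes) of the invariant that appears to be violated: a bulk-load record
> may only be dropped once every backup root has taken a backup covering it.
> The RootAwareBulkLoadTracker name and its methods are hypothetical, purely
> for illustration.
> {code:java}
> import java.util.HashMap;
> import java.util.HashSet;
> import java.util.Iterator;
> import java.util.Map;
> import java.util.Set;
>
> /**
>  * Hypothetical model of root-aware bulk-load tracking. The current code
>  * effectively keeps one global set of bulk-load records; this sketch shows
>  * the per-root bookkeeping that would avoid the premature delete.
>  */
> public class RootAwareBulkLoadTracker {
>   // bulk-loaded HFile path -> set of backup roots that still need it
>   private final Map<String, Set<String>> pendingRootsByFile = new HashMap<>();
>   private final Set<String> allRoots;
>
>   public RootAwareBulkLoadTracker(Set<String> allRoots) {
>     this.allRoots = allRoots;
>   }
>
>   /** Called from the bulk-load observer: every root still needs this file. */
>   public void recordBulkLoad(String hfilePath) {
>     pendingRootsByFile.put(hfilePath, new HashSet<>(allRoots));
>   }
>
>   /**
>    * Called after an incremental backup of one root. Only files that no root
>    * still needs may be forgotten; forgetting them after the FIRST root's
>    * backup is the suspected bug.
>    */
>   public void onIncrementalBackupDone(String backupRoot) {
>     Iterator<Map.Entry<String, Set<String>>> it =
>         pendingRootsByFile.entrySet().iterator();
>     while (it.hasNext()) {
>       Map.Entry<String, Set<String>> e = it.next();
>       e.getValue().remove(backupRoot);   // this root has captured the file
>       if (e.getValue().isEmpty()) {
>         it.remove();                     // safe: every root has a copy now
>       }
>     }
>   }
>
>   public Set<String> filesStillNeeded() {
>     return pendingRootsByFile.keySet();
>   }
> }
> {code}
> With two roots registered, filesStillNeeded() only empties after both
> onIncrementalBackupDone calls; dropping the records after the first call is
> exactly the data loss described above.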
> Edit: This is a minimal test to reproduce the issue from the master branch:
> First, enable backups by adding this to hbase-site.xml
> {code:java}
> <property>
>   <name>hbase.backup.enable</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hbase.master.logcleaner.plugins</name>
>   <value>org.apache.hadoop.hbase.master.cleaner.TimeToLiveLogCleaner,org.apache.hadoop.hbase.master.cleaner.TimeToLiveProcedureWALCleaner,org.apache.hadoop.hbase.master.cleaner.TimeToLiveMasterLocalStoreWALCleaner,org.apache.hadoop.hbase.backup.master.BackupLogCleaner</value>
> </property>
> <property>
>   <name>hbase.procedure.master.classes</name>
>   <value>org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager</value>
> </property>
> <property>
>   <name>hbase.procedure.regionserver.classes</name>
>   <value>org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager</value>
> </property>
> <property>
>   <name>hbase.coprocessor.region.classes</name>
>   <value>org.apache.hadoop.hbase.backup.BackupObserver</value>
> </property>
> <property>
>   <name>hbase.fs.tmp.dir</name>
>   <value>file:/tmp/hbase-tmp</value>
> </property>
> {code}
> Next, execute:
> {code:java}
> # Create an HFile (in local storage)
> echo -e 'row1\tvalue1' > /tmp/hfile_data
> bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
>   -Dimporttsv.columns=HBASE_ROW_KEY,cf:q1 \
>   -Dimporttsv.bulk.output=/tmp/bulk-output table1 /tmp/hfile_data
>
> # Create a table, and 2 full backups (using different roots) of the empty table
> echo "create 'table1', 'cf'" | bin/hbase shell -n
> bin/hbase backup create full file:/tmp/backup1 -t table1
> bin/hbase backup create full file:/tmp/backup2 -t table1
>
> # Bulk load the HFile into the table; a scan confirms it is loaded
> bin/hbase completebulkload /tmp/bulk-output table1
> echo "scan 'table1'" | bin/hbase shell
>
> # Take an incremental backup for each backup root
> bin/hbase backup create incremental file:/tmp/backup1 -t table1
> export BACKUP_ID1=$(bin/hbase backup history | head -n1 | grep -o -P "backup_\d+")
> bin/hbase backup create incremental file:/tmp/backup2 -t table1
> export BACKUP_ID2=$(bin/hbase backup history | head -n1 | grep -o -P "backup_\d+")
>
> # Restore root 1: bulk loaded data is present
> bin/hbase restore file:/tmp/backup1 $BACKUP_ID1 -t "table1" -m "table1-backup1"
> echo "scan 'table1-backup1'" | bin/hbase shell
>
> # Restore root 2: bulk loaded data is missing
> bin/hbase restore file:/tmp/backup2 $BACKUP_ID2 -t "table1" -m "table1-backup2"
> echo "scan 'table1-backup2'" | bin/hbase shell
> {code}
> Output of the final commands for reference:
> {code:java}
> hbase:001:0> scan 'table1-backup1'
> ROW                 COLUMN+CELL
>  row1               column=cf:q1, timestamp=2024-08-02T14:43:24.403, value=value1
> 1 row(s)
>
> hbase:001:0> scan 'table1-backup2'
> ROW                 COLUMN+CELL
> 0 row(s)
> {code}
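> For anyone digging further: a quick way to watch the tracking rows disappear
> is to dump the backup system's bulk-load table between the two incremental
> backups. This is a hedged sketch using the standard HBase client API; the
> table name backup:system_bulk is an assumption (the name BackupSystemTable
> uses on recent master), so adjust it for your version.
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.TableName;
> import org.apache.hadoop.hbase.client.Connection;
> import org.apache.hadoop.hbase.client.ConnectionFactory;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.client.ResultScanner;
> import org.apache.hadoop.hbase.client.Scan;
> import org.apache.hadoop.hbase.client.Table;
> import org.apache.hadoop.hbase.util.Bytes;
>
> public class DumpBulkLoadRows {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = HBaseConfiguration.create();
>     // Assumption: the bulk-load tracking rows live in backup:system_bulk.
>     TableName bulkTable = TableName.valueOf("backup:system_bulk");
>     try (Connection conn = ConnectionFactory.createConnection(conf);
>          Table table = conn.getTable(bulkTable);
>          ResultScanner scanner = table.getScanner(new Scan())) {
>       for (Result r : scanner) {
>         System.out.println(Bytes.toStringBinary(r.getRow()));
>       }
>     }
>   }
> }
> {code}
> If the suspicion above is right, these rows vanish after the first root's
> incremental backup, leaving nothing for the second root's incremental backup
> to copy.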
--
This message was sent by Atlassian Jira
(v8.20.10#820010)