[ 
https://issues.apache.org/jira/browse/HBASE-29891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18060079#comment-18060079
 ] 

Vinayak Hegde commented on HBASE-29891:
---------------------------------------

Thanks [~kgeiszler]

I think we need to keep the generated HFiles until the copy phase is finished.

Currently, we iterate over all the tables included in the backup and, for each 
table, generate HFiles from the WALs accumulated since that table's last backup. 
The `previousBackupTs` can differ between tables because they may have been 
part of different backup sets.

For example:
 * B1 → table1, table2
 * B2 → table1
 * B3 → table1, table2

In this case, the `previousBackupTs` differs between tables: after B3, table1's 
timestamp comes from B2 while table2's comes from B1, so the two tables cover 
different WAL time ranges. This is why we cannot run a single WALPlayer job for 
all tables.
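
To make this concrete, here is a minimal sketch of the per-table conversion 
loop. It is illustrative only: the loop shape matches the description above, 
but `lookupPreviousBackupTs()` and `walDirsSince()` are hypothetical helpers 
standing in for however the branch resolves the per-table timestamp and WAL 
directories.

{code:java}
// Illustrative sketch -- not the exact code on the HBASE-28957 branch.
// Each table needs its own WALPlayer run because the start of its WAL time
// range (previousBackupTs) depends on which backup set it last belonged to.
for (TableName table : backupInfo.getTables()) {
  // Hypothetical lookup: timestamp of this table's most recent backup.
  long previousBackupTs = lookupPreviousBackupTs(table);

  // Replay only the WALs in [previousBackupTs, now) for this one table,
  // producing HFiles in a bulk-output directory.
  walToHFiles(walDirsSince(previousBackupTs),
    Collections.singletonList(table.getNameAsString()));
}
{code}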

In the next step, we copy all files from the temporary directory to the actual 
destination:

[https://github.com/apache/hbase/blob/HBASE-28957_rebased/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalTableBackupClient.java#L362-L364]
 

So we must retain the generated HFiles until `incrementalCopyHFiles()` is 
executed.
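
For reference, the call at those lines currently hands 
`incrementalCopyHFiles()` a single directory, roughly in this shape 
(reconstructed from the linked code, so treat it as a sketch):

{code:java}
// Current shape (sketch): one shared bulk-output dir for the whole backup is
// copied to the backup root; the generated HFiles must still exist here.
incrementalCopyHFiles(new String[] { getBulkOutputDir().toString() },
  backupInfo.getBackupRootDir());
{code}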

Since Hadoop MapReduce does not allow the output directory to already exist, we 
need to create a separate output directory per table. I believe we already have 
the required logic for this.

We can replace `getBulkOutputDir()` with `getBulkOutputDirForTable()` here:

[https://github.com/apache/hbase/blob/fc752f7ce5542666f5671008c205a668cf57af59/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalTableBackupClient.java#L529]
 

This will generate HFiles in separate directories for each table.
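
Concretely, where `walToHFiles()` wires the bulk-output path into the 
WALPlayer configuration, the per-table variant would be used instead. A sketch 
(the surrounding code is abridged, and `tableName` stands for whatever 
variable holds the current table there):

{code:java}
// Point WALPlayer's bulk output at a table-specific directory instead of the
// shared per-backup one, so a second table's job no longer fails on an
// already-existing output directory.
conf.set(WALPlayer.BULK_OUTPUT_CONF_KEY,
  getBulkOutputDirForTable(tableName).toString());
{code}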

Then, while consuming the generated files (here:
[https://github.com/apache/hbase/blob/fc752f7ce5542666f5671008c205a668cf57af59/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalTableBackupClient.java#L363-L364]),
we need to ensure that we pass all the table-specific output directories as a 
string array. We can collect these by calling `getBulkOutputDirForTable()` for 
each table, as in the sketch below.

Things can stay the same for the non-continuous backup path, which puts 
everything under one directory and consumes it from there.
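
A minimal sketch of that collection step, assuming the existing 
`incrementalCopyHFiles(String[] files, String backupDest)` shape (variable 
names illustrative):

{code:java}
// Gather every table-specific bulk-output dir into one array and hand them
// all to the copy phase. The non-continuous path keeps its single
// getBulkOutputDir() entry and is unaffected.
List<String> bulkOutputDirs = new ArrayList<>();
for (TableName table : backupInfo.getTables()) {
  bulkOutputDirs.add(getBulkOutputDirForTable(table).toString());
}
incrementalCopyHFiles(bulkOutputDirs.toArray(new String[0]),
  backupInfo.getBackupRootDir());
{code}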

Additionally, I noticed another issue here:

[https://github.com/apache/hbase/blob/fc752f7ce5542666f5671008c205a668cf57af59/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalTableBackupClient.java#L527]
 

The job name should be made unique. Currently, it is the same for all job runs 
across different tables within the same backup, which could cause confusion in 
job tracking.
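
One way to fix it (a sketch; the exact current name format on the branch may 
differ) is to fold the table into the job name that `walToHFiles()` sets via 
the "mapreduce.job.name" key:

{code:java}
// Suffix the per-table WALPlayer job name with the table so the runs within
// one backup are distinguishable in the MR job history. The name prefix here
// is illustrative.
String jobname = "Incremental_Backup-" + backupId + "-" + table.getNameAsString();
conf.set(JOB_NAME_CONF_KEY, jobname); // JOB_NAME_CONF_KEY = "mapreduce.job.name"
{code}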

> Multi-table continuous incremental backup is failing because output directory 
> already exists
> --------------------------------------------------------------------------------------------
>
>                 Key: HBASE-29891
>                 URL: https://issues.apache.org/jira/browse/HBASE-29891
>             Project: HBase
>          Issue Type: Bug
>          Components: backup&restore
>    Affects Versions: 2.6.0, 3.0.0-alpha-4
>            Reporter: Kevin Geiszler
>            Assignee: Kevin Geiszler
>            Priority: Major
>         Attachments: full-log.txt, unit-test.txt
>
>
> This was discovered while writing a Point-in-Time Restore integration test 
> for HBASE-28957.
> Running an incremental backup with continuous backup enabled on multiple 
> tables results in the following error:
> {noformat}
> Output directory hdfs://localhost:64120/backupUT/.tmp/backup_1770846846624 
> already exists{noformat}
> Here is the full error and stack trace:
> {code:java}
> 2026-02-11T13:54:17,945 ERROR [Time-limited test {}] 
> impl.TableBackupClient(232): Unexpected exception in incremental-backup: 
> incremental copy backup_1770846846624Output directory 
> hdfs://localhost:64120/backupUT/.tmp/backup_1770846846624 already exists
> org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory 
> hdfs://localhost:64120/backupUT/.tmp/backup_1770846846624 already exists
>  at 
> org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:164)
>  ~[hadoop-mapreduce-client-core-3.4.2.jar:?]
>  at 
> org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:278) 
> ~[hadoop-mapreduce-client-core-3.4.2.jar:?]
>  at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:142)
>  ~[hadoop-mapreduce-client-core-3.4.2.jar:?]
>  at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677) 
> ~[hadoop-mapreduce-client-core-3.4.2.jar:?]
>  at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674) 
> ~[hadoop-mapreduce-client-core-3.4.2.jar:?]
>  at java.security.AccessController.doPrivileged(AccessController.java:712) 
> ~[?:?]
>  at javax.security.auth.Subject.doAs(Subject.java:439) ~[?:?]
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1953)
>  ~[hadoop-common-3.4.2.jar:?]
>  at org.apache.hadoop.mapreduce.Job.submit(Job.java:1674) 
> ~[hadoop-mapreduce-client-core-3.4.2.jar:?]
>  at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1695) 
> ~[hadoop-mapreduce-client-core-3.4.2.jar:?]
>  at org.apache.hadoop.hbase.mapreduce.WALPlayer.run(WALPlayer.java:482) 
> ~[classes/:?]
>  at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.walToHFiles(IncrementalTableBackupClient.java:545)
>  ~[classes/:?]
>  at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.convertWALsToHFiles(IncrementalTableBackupClient.java:488)
>  ~[classes/:?]
>  at 
> org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:363)
>  ~[classes/:?]
>  at 
> org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:681)
>  ~[classes/:?]
>  at 
> org.apache.hadoop.hbase.backup.TestBackupBase.backupTables(TestBackupBase.java:445)
>  ~[test-classes/:?]
>  at 
> org.apache.hadoop.hbase.backup.TestIncrementalBackupWithContinuous.testMultiTableContinuousBackupWithIncrementalBackupSuccess(TestIncrementalBackupWithContinuous.java:192)
>  ~[test-classes/:?]
>  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:?]
>  at 
> jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  ~[?:?]
>  at 
> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:?]
>  at java.lang.reflect.Method.invoke(Method.java:569) ~[?:?]
>  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  ~[junit-4.13.2.jar:4.13.2]
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  ~[junit-4.13.2.jar:4.13.2]
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  ~[junit-4.13.2.jar:4.13.2]
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  ~[junit-4.13.2.jar:4.13.2]
>  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
> ~[junit-4.13.2.jar:4.13.2]
>  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
> ~[junit-4.13.2.jar:4.13.2]
>  at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) 
> ~[junit-4.13.2.jar:4.13.2]
>  at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>  ~[junit-4.13.2.jar:4.13.2]
>  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) 
> ~[junit-4.13.2.jar:4.13.2]
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>  ~[junit-4.13.2.jar:4.13.2]
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>  ~[junit-4.13.2.jar:4.13.2]
>  at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) 
> ~[junit-4.13.2.jar:4.13.2]
>  at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) 
> ~[junit-4.13.2.jar:4.13.2]
>  at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) 
> ~[junit-4.13.2.jar:4.13.2]
>  at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) 
> ~[junit-4.13.2.jar:4.13.2]
>  at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) 
> ~[junit-4.13.2.jar:4.13.2]
>  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
> ~[junit-4.13.2.jar:4.13.2]
>  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
> ~[junit-4.13.2.jar:4.13.2]
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>  ~[junit-4.13.2.jar:4.13.2]
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>  ~[junit-4.13.2.jar:4.13.2]
>  at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264) ~[?:?]
>  at java.util.concurrent.FutureTask.run(FutureTask.java) ~[?:?]
>  at java.lang.Thread.run(Thread.java:840) ~[?:?] {code}
> A full log of a unit test run that reproduces the error has been attached to 
> this ticket.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
