[ 
https://issues.apache.org/jira/browse/HBASE-28502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17843781#comment-17843781
 ] 

Dieter De Paepe commented on HBASE-28502:
-----------------------------------------

I was investigating another bug, but it lead me to this ticket as root cause.

The error in the manifest handling causes the backup validation to fail to 
detect a table is not part of a specific backup.

Eg:
{code:java}
$ echo "create 'table1', 'cf'" | bin/hbase shell -n
$ bin/hbase backup create full file:/tmp/hbasebackups -t table1
$ # Validation - will say everything is OK
$ bin/hbase restore file:/tmp/hbasebackups $(bin/hbase backup history | head 
-n1  | tail -n -1 | grep -o -P "backup_\d+") -o -t table2 -c
[...]
2024-05-06T17:53:56,072 INFO  [main {}] impl.BackupAdminImpl: Checking backup 
images: OK

$ bin/hbase restore file:/tmp/hbasebackups $(bin/hbase backup history | head 
-n1  | tail -n -1 | grep -o -P "backup_\d+") -o -t table2  
2024-05-06T17:54:09,320 ERROR [main {}] backup.RestoreDriver: Error while 
running restore backup
java.io.IOException: Table snapshot directory: 
file:/tmp/hbasebackups/backup_1715010823851/default/table2/.hbase-snapshot does 
not exist.
    at 
org.apache.hadoop.hbase.backup.util.RestoreTool.createAndRestoreTable(RestoreTool.java:316)
 ~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.util.RestoreTool.fullRestoreTable(RestoreTool.java:211)
 ~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.RestoreTablesClient.restoreImages(RestoreTablesClient.java:151)
 ~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.RestoreTablesClient.restore(RestoreTablesClient.java:229)
 ~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.RestoreTablesClient.execute(RestoreTablesClient.java:265)
 ~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.restore(BackupAdminImpl.java:518)
 ~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.RestoreDriver.parseAndRun(RestoreDriver.java:176)
 ~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.RestoreDriver.doWork(RestoreDriver.java:216) 
~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at org.apache.hadoop.hbase.backup.RestoreDriver.run(RestoreDriver.java:252) 
~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) 
~[hadoop-common-3.3.5.jar:?]
    at 
org.apache.hadoop.hbase.backup.RestoreDriver.main(RestoreDriver.java:224) 
~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]{code}

> Backup manifest of full backup contains incomplete table list
> -------------------------------------------------------------
>
>                 Key: HBASE-28502
>                 URL: https://issues.apache.org/jira/browse/HBASE-28502
>             Project: HBase
>          Issue Type: Bug
>          Components: backup&restore
>    Affects Versions: 2.6.0, 4.0.0-alpha-1
>            Reporter: thomassarens
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: TestBackupManifest.java
>
>
> Noticed that the {{BackupManifest#getTableNames}} is returning only a single 
> table instead of the complete list of tables that were requested for the 
> backup in case of a full backup, in case of an incremental backup the 
> manifest table list seem complete.
> Checking the {{TableBackupClient#addManifest}} method shows why:
>  * While looping over the included tables the manifest is stored per table 
> and the comment mentions something about storing the manifest with the table 
> directory:
> {code:java}
> // Since we have each table's backup in its own directory structure,
> // we'll store its manifest with the table directory.
> for (TableName table : backupInfo.getTables()) {
>   manifest = new BackupManifest(backupInfo, table);
>   ArrayList<BackupImage> ancestors = backupManager.getAncestors(backupInfo, 
> table);
>   for (BackupImage image : ancestors) {
>     manifest.addDependentImage(image);
>   }
>   if (type == BackupType.INCREMENTAL) {
>     // We'll store the log timestamps for this table only in its manifest.
>     Map<TableName, Map<String, Long>> tableTimestampMap = new HashMap<>();
>     tableTimestampMap.put(table, backupInfo.getIncrTimestampMap().get(table));
>     manifest.setIncrTimestampMap(tableTimestampMap);
>     ArrayList<BackupImage> ancestorss = 
> backupManager.getAncestors(backupInfo);
>     for (BackupImage image : ancestorss) {
>       manifest.addDependentImage(image);
>     }
>   }
>   manifest.store(conf);
> }{code}
>  * but the manifest path is based on the backup root dir and backup id, so it 
> is not on a table directory level: 
> {code:java}
> Path manifestFilePath = new 
> Path(HBackupFileSystem.getBackupPath(backupImage.getRootDir(), 
> backupImage.getBackupId()), MANIFEST_FILE_NAME);{code}
>  * so each call to {{manifest.store(conf)}} is just overwriting the same 
> manifest file
>  * for incremental backups the "complete" manifest is stored as well with 
> {{manifest.store(conf)}} and using the exact same path, so that explains why 
> it is correct for incremental backups:
> {code:java}
> // For incremental backup, we store a overall manifest in
> // <backup-root-dir>/WALs/<backup-id>
> // This is used when created the next incremental backup
> if (type == BackupType.INCREMENTAL) {
> manifest = new BackupManifest(backupInfo);
> // set the table region server start and end timestamps for incremental backup
> manifest.setIncrTimestampMap(backupInfo.getIncrTimestampMap());
> ArrayList<BackupImage> ancestors = backupManager.getAncestors(backupInfo);
> for (BackupImage image : ancestors) {
> manifest.addDependentImage(image);
> }
> manifest.store(conf);
> }{code}
>  * the comment related to the manifest path being 
> {{<backup-root-dir>/WALs/<backup-id>}} is incorrect
>  
> I created a simple test that verifies this issue [^TestBackupManifest.java] , 
> but no idea how to fix this. Perhaps only the overall manifest file should be 
> stored on \{{<backup-root-dir>/<backup-id> }}level, but that goes against the 
> comments here so not sure.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to