[
https://issues.apache.org/jira/browse/HBASE-28502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Dimiduk reassigned HBASE-28502:
------------------------------------
Assignee: Dieter De Paepe
> Backup manifest of full backup contains incomplete table list
> -------------------------------------------------------------
>
> Key: HBASE-28502
> URL: https://issues.apache.org/jira/browse/HBASE-28502
> Project: HBase
> Issue Type: Bug
> Components: backup&restore
> Affects Versions: 2.6.0, 4.0.0-alpha-1
> Reporter: thomassarens
> Assignee: Dieter De Paepe
> Priority: Major
> Labels: pull-request-available
> Attachments: TestBackupManifest.java
>
>
> Noticed that the {{BackupManifest#getTableNames}} is returning only a single
> table instead of the complete list of tables that were requested for the
> backup in case of a full backup, in case of an incremental backup the
> manifest table list seem complete.
> Checking the {{TableBackupClient#addManifest}} method shows why:
> * While looping over the included tables the manifest is stored per table
> and the comment mentions something about storing the manifest with the table
> directory:
> {code:java}
> // Since we have each table's backup in its own directory structure,
> // we'll store its manifest with the table directory.
> for (TableName table : backupInfo.getTables()) {
> manifest = new BackupManifest(backupInfo, table);
> ArrayList<BackupImage> ancestors = backupManager.getAncestors(backupInfo,
> table);
> for (BackupImage image : ancestors) {
> manifest.addDependentImage(image);
> }
> if (type == BackupType.INCREMENTAL) {
> // We'll store the log timestamps for this table only in its manifest.
> Map<TableName, Map<String, Long>> tableTimestampMap = new HashMap<>();
> tableTimestampMap.put(table, backupInfo.getIncrTimestampMap().get(table));
> manifest.setIncrTimestampMap(tableTimestampMap);
> ArrayList<BackupImage> ancestorss =
> backupManager.getAncestors(backupInfo);
> for (BackupImage image : ancestorss) {
> manifest.addDependentImage(image);
> }
> }
> manifest.store(conf);
> }{code}
> * but the manifest path is based on the backup root dir and backup id, so it
> is not on a table directory level:
> {code:java}
> Path manifestFilePath = new
> Path(HBackupFileSystem.getBackupPath(backupImage.getRootDir(),
> backupImage.getBackupId()), MANIFEST_FILE_NAME);{code}
> * so each call to {{manifest.store(conf)}} is just overwriting the same
> manifest file
> * for incremental backups the "complete" manifest is stored as well with
> {{manifest.store(conf)}} and using the exact same path, so that explains why
> it is correct for incremental backups:
> {code:java}
> // For incremental backup, we store a overall manifest in
> // <backup-root-dir>/WALs/<backup-id>
> // This is used when created the next incremental backup
> if (type == BackupType.INCREMENTAL) {
> manifest = new BackupManifest(backupInfo);
> // set the table region server start and end timestamps for incremental backup
> manifest.setIncrTimestampMap(backupInfo.getIncrTimestampMap());
> ArrayList<BackupImage> ancestors = backupManager.getAncestors(backupInfo);
> for (BackupImage image : ancestors) {
> manifest.addDependentImage(image);
> }
> manifest.store(conf);
> }{code}
> * the comment related to the manifest path being
> {{<backup-root-dir>/WALs/<backup-id>}} is incorrect
>
> I created a simple test that verifies this issue [^TestBackupManifest.java] ,
> but no idea how to fix this. Perhaps only the overall manifest file should be
> stored on \{{<backup-root-dir>/<backup-id> }}level, but that goes against the
> comments here so not sure.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)