[ https://issues.apache.org/jira/browse/HBASE-28502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17843666#comment-17843666 ]
Dieter De Paepe commented on HBASE-28502: ----------------------------------------- Based on older commits, I got the impression that the points mentioned here are leftovers from the backup-restore implementation before merging was implemented. Looks like back then, a manifest was created per table. I have created a PR with fix and testcases. > Backup manifest of full backup contains incomplete table list > ------------------------------------------------------------- > > Key: HBASE-28502 > URL: https://issues.apache.org/jira/browse/HBASE-28502 > Project: HBase > Issue Type: Bug > Components: backup&restore > Affects Versions: 2.6.0, 4.0.0-alpha-1 > Reporter: thomassarens > Priority: Major > Labels: pull-request-available > Attachments: TestBackupManifest.java > > > Noticed that the {{BackupManifest#getTableNames}} is returning only a single > table instead of the complete list of tables that were requested for the > backup in case of a full backup, in case of an incremental backup the > manifest table list seem complete. > Checking the {{TableBackupClient#addManifest}} method shows why: > * While looping over the included tables the manifest is stored per table > and the comment mentions something about storing the manifest with the table > directory: > {code:java} > // Since we have each table's backup in its own directory structure, > // we'll store its manifest with the table directory. > for (TableName table : backupInfo.getTables()) { > manifest = new BackupManifest(backupInfo, table); > ArrayList<BackupImage> ancestors = backupManager.getAncestors(backupInfo, > table); > for (BackupImage image : ancestors) { > manifest.addDependentImage(image); > } > if (type == BackupType.INCREMENTAL) { > // We'll store the log timestamps for this table only in its manifest. > Map<TableName, Map<String, Long>> tableTimestampMap = new HashMap<>(); > tableTimestampMap.put(table, backupInfo.getIncrTimestampMap().get(table)); > manifest.setIncrTimestampMap(tableTimestampMap); > ArrayList<BackupImage> ancestorss = > backupManager.getAncestors(backupInfo); > for (BackupImage image : ancestorss) { > manifest.addDependentImage(image); > } > } > manifest.store(conf); > }{code} > * but the manifest path is based on the backup root dir and backup id, so it > is not on a table directory level: > {code:java} > Path manifestFilePath = new > Path(HBackupFileSystem.getBackupPath(backupImage.getRootDir(), > backupImage.getBackupId()), MANIFEST_FILE_NAME);{code} > * so each call to {{manifest.store(conf)}} is just overwriting the same > manifest file > * for incremental backups the "complete" manifest is stored as well with > {{manifest.store(conf)}} and using the exact same path, so that explains why > it is correct for incremental backups: > {code:java} > // For incremental backup, we store a overall manifest in > // <backup-root-dir>/WALs/<backup-id> > // This is used when created the next incremental backup > if (type == BackupType.INCREMENTAL) { > manifest = new BackupManifest(backupInfo); > // set the table region server start and end timestamps for incremental backup > manifest.setIncrTimestampMap(backupInfo.getIncrTimestampMap()); > ArrayList<BackupImage> ancestors = backupManager.getAncestors(backupInfo); > for (BackupImage image : ancestors) { > manifest.addDependentImage(image); > } > manifest.store(conf); > }{code} > * the comment related to the manifest path being > {{<backup-root-dir>/WALs/<backup-id>}} is incorrect > > I created a simple test that verifies this issue [^TestBackupManifest.java] , > but no idea how to fix this. Perhaps only the overall manifest file should be > stored on \{{<backup-root-dir>/<backup-id> }}level, but that goes against the > comments here so not sure. -- This message was sent by Atlassian Jira (v8.20.10#820010)