[
https://issues.apache.org/jira/browse/HBASE-29826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18051976#comment-18051976
]
Kevin Geiszler commented on HBASE-29826:
----------------------------------------
I was able to confirm that the issue is with [fs.rename() in
MapReduceBackupMergeJob.java|https://github.com/apache/hbase/blob/HBASE-28957/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/mapreduce/MapReduceBackupMergeJob.java#L160-L164].
{code:java}
if (fs.exists(new Path(String.valueOf(tmpBackupDir)))) {
  LOG.info("Temporary backup directory already exists: {}. Moving contents of {} to this "
    + "directory instead of moving the entire directory", tmpBackupDir, backupDirPath);
  FileStatus[] backupDirContents = fs.listStatus(backupDirPath);
  for (FileStatus status : backupDirContents) {
    Path src = status.getPath();
    Path dst = new Path(tmpBackupDir, src.getName());
    if (!fs.rename(src, dst)) {
      throw new IOException("Failed to rename " + src + " to " + dst);
    } else {
      LOG.debug("Renamed {} to {}", src, dst);
    }
  }
} else {
  LOG.info("Temporary backup directory does not exists: {}. Moving backup directory to here",
    tmpBackupDir);
  if (!fs.rename(backupDirPath, tmpBackupDir)) {
    throw new IOException("Failed to rename " + backupDirPath + " to " + tmpBackupDir);
  } else {
    LOG.debug("Renamed {} to {}", backupDirPath, tmpBackupDir);
  }
}
{code}
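To illustrate how a doubled backup directory can come out of a rename, here is a minimal sketch. It assumes the rename rule described in the Hadoop FileSystem specification for HDFS: if the destination already exists as a directory, the source is moved underneath it and keeps its own name. The class RenameSketch, the method resolveRenameTarget, and the shortened paths are hypothetical and only model that rule with strings. Nothing here calls the real Hadoop API, and it is not a confirmed root cause.

```java
import java.util.Collections;
import java.util.Set;

public class RenameSketch {

  // Hypothetical model of where a rename lands under the ASSUMED HDFS rule:
  // if dst already exists as a directory, src ends up at dst/<name of src>;
  // otherwise src ends up at dst itself.
  static String resolveRenameTarget(String src, String dst, Set<String> existingDirs) {
    if (existingDirs.contains(dst)) {
      String name = src.substring(src.lastIndexOf('/') + 1);
      return dst + "/" + name;
    }
    return dst;
  }

  public static void main(String[] args) {
    // Illustrative, shortened paths (not the real test-data paths).
    String src = "/backupIT/backup_1768259498500";
    String dst = "/backupIT/.tmp/backup_1768259498500";

    // Destination absent: the directory is moved to dst as intended.
    System.out.println(resolveRenameTarget(src, dst, Collections.<String>emptySet()));

    // Destination already present: the directory is nested inside dst,
    // so the manifest would sit one level deeper than expected.
    System.out.println(resolveRenameTarget(src, dst, Collections.singleton(dst)));
  }
}
```

Under this assumption, the whole-directory rename in the else branch would produce the nested layout whenever tmpBackupDir already exists at the moment the rename runs.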
> Backup merge is failing because .backup.manifest cannot be found
> ----------------------------------------------------------------
>
> Key: HBASE-29826
> URL: https://issues.apache.org/jira/browse/HBASE-29826
> Project: HBase
> Issue Type: Bug
> Components: backup&restore
> Reporter: Kevin Geiszler
> Priority: Major
>
> {{IntegrationTestContinuousBackupRestore}} is failing during the {{merge()}}
> attempt because {{.backup.manifest}} is not in the expected location. The
> manifest is still inside the backup directory, but its actual path differs
> from the path the merge job looks up.
> For example, the expected path is something like the following:
> {noformat}
> hdfs://localhost:60834/user/kgeiszler/test-data/7db8c206-5859-1fde-49d1-5e9bcadd8a62/backupIT/.tmp/backup_1768259498500/.backup.manifest{noformat}
> However, the actual path contains the backup directory name twice, as in the
> following:
> {noformat}
> hdfs://localhost:60834/user/kgeiszler/test-data/7db8c206-5859-1fde-49d1-5e9bcadd8a62/backupIT/.tmp/backup_1768259498500/backup_1768259498500/.backup.manifest
> {noformat}
> This may be happening because of the {{fs.rename()}} operation in
> [MapReduceBackupMergeJob.java|https://github.com/apache/hbase/blob/HBASE-28957/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/mapreduce/MapReduceBackupMergeJob.java#L160].
> This issue causes the following error:
> {code:java}
> 2026-01-12T09:47:25,898 ERROR [Thread-223 {}] mapreduce.MapReduceBackupMergeJob(189): java.io.IOException: Could not find backup manifest .backup.manifest for backup_1768240000001. File hdfs://localhost:50226/user/kgeiszler/test-data/fc219c8d-9465-33de-85b7-e888fafd2a6d/backupIT/.tmp/backup_1768240000001/.backup.manifest does not exists. Did backup_1768240000001 correspond to previously taken backup ?
> java.io.IOException: Could not find backup manifest .backup.manifest for backup_1768240000001. File hdfs://localhost:50226/user/kgeiszler/test-data/fc219c8d-9465-33de-85b7-e888fafd2a6d/backupIT/.tmp/backup_1768240000001/.backup.manifest does not exists. Did backup_1768240000001 correspond to previously taken backup ?
>     at org.apache.hadoop.hbase.backup.HBackupFileSystem.getManifestPath(HBackupFileSystem.java:126) ~[classes/:?]
>     at org.apache.hadoop.hbase.backup.HBackupFileSystem.getManifestPath(HBackupFileSystem.java:111) ~[classes/:?]
>     at org.apache.hadoop.hbase.backup.HBackupFileSystem.getManifest(HBackupFileSystem.java:143) ~[classes/:?]
>     at org.apache.hadoop.hbase.backup.mapreduce.MapReduceBackupMergeJob.updateBackupManifest(MapReduceBackupMergeJob.java:314) ~[classes/:?]
>     at org.apache.hadoop.hbase.backup.mapreduce.MapReduceBackupMergeJob.run(MapReduceBackupMergeJob.java:171) ~[classes/:?]
>     at org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.mergeBackups(BackupAdminImpl.java:700) ~[classes/:?]
>     at org.apache.hadoop.hbase.backup.IntegrationTestBackupRestoreBase.merge(IntegrationTestBackupRestoreBase.java:396) ~[test-classes/:?]
>     at org.apache.hadoop.hbase.backup.IntegrationTestBackupRestoreBase.runTestSingle(IntegrationTestBackupRestoreBase.java:326) ~[test-classes/:?]
>     at org.apache.hadoop.hbase.backup.IntegrationTestBackupRestoreBase$BackupAndRestoreThread.run(IntegrationTestBackupRestoreBase.java:136) ~[test-classes/:?]
> {code}