d-c-manning commented on a change in pull request #4274:
URL: https://github.com/apache/hbase/pull/4274#discussion_r835710881
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/backup/HFileArchiver.java
##########
@@ -488,26 +488,46 @@ private static boolean resolveAndArchiveFile(Path
archiveDir, File currentFile,
Path archiveFile = new Path(archiveDir, filename);
FileSystem fs = currentFile.getFileSystem();
- // if the file already exists in the archive, move that one to a
timestamped backup. This is a
- // really, really unlikely situtation, where we get the same name for the
existing file, but
- // is included just for that 1 in trillion chance.
+ // An existing destination file in the archive is unexpected, but we
handle it here.
if (fs.exists(archiveFile)) {
- LOG.debug("{} already exists in archive, moving to timestamped backup
and " +
- "overwriting current.", archiveFile);
+ if (!fs.exists(currentFile.getPath())) {
+ // If the file already exists in the archive, and there is no current
file to archive, then
+ // assume that the file in archive is correct. This is an unexpected
situation, suggesting a
+ // race condition or split brain.
+ // In HBASE-26718 this was found when compaction incorrectly happened
during warmupRegion.
+ LOG.warn("{} exists in archive. Attempted to archive nonexistent file
{}.", archiveFile,
+ currentFile);
+ // We return success to match existing behavior in this method, where
FileNotFoundException
+ // in moveAndClose is ignored.
+ return true;
+ }
+ // There is a conflict between the current file and the already existing
archived file.
+ // Move the archived file to a timestamped backup. This is a really,
really unlikely
+ // situation, where we get the same name for the existing file, but is
included just for that
+ // 1 in trillion chance. We are potentially incurring data loss in the
archive directory if
+ // the files are not identical. The timestamped backup will be cleaned
by HFileCleaner as it
+ // has no references.
+ FileStatus curStatus = fs.getFileStatus(currentFile.getPath());
Review comment:
If we compare lengths and they are the same, and we return `true`,
should we also delete the source file? That is what the calling code would
expect.
Do you think that behavior is worth the change? I do like the idea of
throwing an exception in the case where we are overwriting a file with
different length. Of course, it is a logic change.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]