[GitHub] [hbase] apurtell commented on a change in pull request #4274: fix for HBASE-26718

GitBox Fri, 25 Mar 2022 10:33:19 -0700


apurtell commented on a change in pull request #4274:
URL: https://github.com/apache/hbase/pull/4274#discussion_r835476913




##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/backup/HFileArchiver.java
##########
@@ -488,26 +488,46 @@ private static boolean resolveAndArchiveFile(Path 
archiveDir, File currentFile,
     Path archiveFile = new Path(archiveDir, filename);
     FileSystem fs = currentFile.getFileSystem();
 
-    // if the file already exists in the archive, move that one to a 
timestamped backup. This is a
-    // really, really unlikely situtation, where we get the same name for the 
existing file, but
-    // is included just for that 1 in trillion chance.
+    // An existing destination file in the archive is unexpected, but we 
handle it here.
     if (fs.exists(archiveFile)) {
-      LOG.debug("{} already exists in archive, moving to timestamped backup 
and " +
-          "overwriting current.", archiveFile);
+      if (!fs.exists(currentFile.getPath())) {
+        // If the file already exists in the archive, and there is no current 
file to archive, then
+        // assume that the file in archive is correct. This is an unexpected 
situation, suggesting a
+        // race condition or split brain.
+        // In HBASE-26718 this was found when compaction incorrectly happened 
during warmupRegion.
+        LOG.warn("{} exists in archive. Attempted to archive nonexistent file 
{}.", archiveFile,
+          currentFile);
+        // We return success to match existing behavior in this method, where 
FileNotFoundException
+        // in moveAndClose is ignored.
+        return true;
+      }
+      // There is a conflict between the current file and the already existing 
archived file.
+      // Move the archived file to a timestamped backup. This is a really, 
really unlikely
+      // situation, where we get the same name for the existing file, but is 
included just for that
+      // 1 in trillion chance. We are potentially incurring data loss in the 
archive directory if
+      // the files are not identical. The timestamped backup will be cleaned 
by HFileCleaner as it
+      // has no references.
+      FileStatus curStatus = fs.getFileStatus(currentFile.getPath());

Review comment:
       I was thinking we could check file lengths, and if they are the same, 
then there is a high probability they are the same file. It would be the rare 
unfortunate outlier if the files were the exact same length but byte-by-byte 
different in some way. That said, it doesn't really change the risks in your 
proposed change. Even if same length we move the current file out of the way 
and drop the new one, you are only doing an existence check. 
   
   Where this might change the associated risks however is we could avoid 
moving the file and instead throw an exception if we detect a probable 
difference between already archived file and a new file we are being asked to 
archive to the exact same path. An existence check isn't sufficient for that. 
   
   Consider, we test for existence. Then we compare lengths. If lengths are the 
same, we consider it the same file and return 'true'. Otherwise we throw an 
exception. In some ways this maintains the same remote possibility that we 
incur data loss, if equivalent length does not mean same content. This does not 
change the risks fundamentally in that respect. So, by that I mean it is no 
worse. However we then **improve** the risks where an existing file is found 
and lengths are different, indicating divergence and data loss is possible. 
That should really be the 1 in trillion chance and deserves an exception 
instead of silent acceptance. 
   
   For your consideration. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [hbase] apurtell commented on a change in pull request #4274: fix for HBASE-26718

Reply via email to