rmdmattingly commented on code in PR #6040:
URL: https://github.com/apache/hbase/pull/6040#discussion_r1763439938


##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/master/BackupLogCleaner.java:
##########
@@ -81,39 +81,55 @@ public void init(Map<String, Object> params) {
     }
   }
 
-  private Map<Address, Long> getServerToNewestBackupTs(List<BackupInfo> backups)
+  /**
+   * Calculates the timestamp boundary up to which all backup roots have already included the WAL.
+   * I.e. WALs with a lower (= older) or equal timestamp are no longer needed for future incremental
+   * backups.
+   */
+  private Map<Address, Long> serverToPreservationBoundaryTs(List<BackupInfo> backups)

Review Comment:
   should this method have a precondition asserting that all BackupInfos in the list have an identical table set?
   
   Otherwise I think this could cause data loss in an incremental backup. Say we have one-table and another-table on our cluster, and we back up each one independently: first a backup of one-table, then a backup of another-table. If we then passed both corresponding BackupInfos into this method, it would yield preservation boundaries that remove WALs still needed for the next incremental backup of one-table.
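
   To illustrate, a minimal sketch of the precondition being suggested (the `BackupInfo` stand-in and the method name `assertIdenticalTableSets` are hypothetical, not the actual HBase classes; real `BackupInfo` objects expose their table set differently):
   
   ```java
   import java.util.List;
   import java.util.Objects;
   import java.util.Set;
   
   public class BackupInfoPreconditionSketch {
   
     // Minimal stand-in for org.apache.hadoop.hbase.backup.BackupInfo,
     // reduced to the table set that the precondition cares about.
     static class BackupInfo {
       private final Set<String> tables;
       BackupInfo(Set<String> tables) { this.tables = tables; }
       Set<String> getTables() { return tables; }
     }
   
     /**
      * Throws if the given backups do not all cover the same table set,
      * so boundary computation never mixes independent backup lineages.
      */
     static void assertIdenticalTableSets(List<BackupInfo> backups) {
       if (backups.isEmpty()) {
         return;
       }
       Set<String> expected = backups.get(0).getTables();
       for (BackupInfo info : backups) {
         if (!Objects.equals(expected, info.getTables())) {
           throw new IllegalArgumentException("Backups cover different table sets: "
             + expected + " vs " + info.getTables());
         }
       }
     }
   
     public static void main(String[] args) {
       // Mirrors the scenario above: one backup of one-table, one of another-table.
       List<BackupInfo> mixed = List.of(
         new BackupInfo(Set.of("one-table")),
         new BackupInfo(Set.of("another-table")));
       try {
         assertIdenticalTableSets(mixed);
         System.out.println("ok");
       } catch (IllegalArgumentException e) {
         System.out.println("precondition violated");
       }
     }
   }
   ```
   
   With such a check in place, mixing the two independent backup histories would fail fast instead of silently producing boundaries that delete WALs one lineage still needs.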



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
