rmdmattingly commented on code in PR #6040:
URL: https://github.com/apache/hbase/pull/6040#discussion_r1763439938
##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/master/BackupLogCleaner.java:
##########
@@ -81,39 +81,55 @@ public void init(Map<String, Object> params) {
}
}
-  private Map<Address, Long> getServerToNewestBackupTs(List<BackupInfo> backups)
+  /**
+   * Calculates the timestamp boundary up to which all backup roots have already included the WAL.
+   * I.e. WALs with a lower (= older) or equal timestamp are no longer needed for future incremental
+   * backups.
+   */
+  private Map<Address, Long> serverToPreservationBoundaryTs(List<BackupInfo> backups)
Review Comment:
Should this method have a precondition asserting that all BackupInfos in the list cover an identical table set?
Otherwise I think this could cause data loss in an incremental backup. Say we have one-table and another-table on our cluster and back each up independently: first a backup of one-table, then a backup of another-table. If we passed both corresponding BackupInfos into this method, another-table's newer backup timestamp would push each server's preservation boundary past WALs that one-table's backup chain still depends on, so WALs necessary for the next incremental backup of one-table would be removed.
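To make the suggestion concrete, here's a minimal sketch of such a precondition. It's illustrative rather than a drop-in patch: the helper name is hypothetical, and I'm assuming BackupInfo exposes its backed-up tables via getTables() (the exact accessor may differ).

```java
// Hypothetical helper, intended to live in BackupLogCleaner. Requires
// java.util.{HashSet, List, Set} and org.apache.hadoop.hbase.TableName imports.
private static void assertIdenticalTableSets(List<BackupInfo> backups) {
  if (backups.isEmpty()) {
    return; // nothing to compare
  }
  // Assumption: BackupInfo#getTables() returns the tables covered by the backup.
  Set<TableName> expected = new HashSet<>(backups.get(0).getTables());
  for (BackupInfo backup : backups) {
    Set<TableName> actual = new HashSet<>(backup.getTables());
    if (!expected.equals(actual)) {
      throw new IllegalArgumentException("Expected all BackupInfos to cover table set "
        + expected + ", but found " + actual);
    }
  }
}
```

serverToPreservationBoundaryTs could then call assertIdenticalTableSets(backups) as its first statement, so a mixed set of backup roots fails fast instead of yielding boundaries that delete WALs a narrower root still needs.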