DieterDP-ng commented on code in PR #6040:
URL: https://github.com/apache/hbase/pull/6040#discussion_r1763467993
##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/master/BackupLogCleaner.java:
##########
@@ -81,39 +81,55 @@ public void init(Map<String, Object> params) {
}
}
-  private Map<Address, Long> getServerToNewestBackupTs(List<BackupInfo> backups)
+  /**
+   * Calculates the timestamp boundary up to which all backup roots have already included the WAL.
+   * I.e. WALs with a lower (= older) or equal timestamp are no longer needed for future incremental
+   * backups.
+   */
+  private Map<Address, Long> serverToPreservationBoundaryTs(List<BackupInfo> backups)
Review Comment:
I don't see an issue. Trying to follow your example:
- Imagine 3 servers S1, S2, S3
- table T1 has regions on S1 and S2, table T2 has regions on S2 and S3
- we have 2 backup roots: R1 backs up T1, and R2 backs up T2
At time:
- t=0, we back up T1 in R1 => backup B1
- t=0, we back up T2 in R2 => backup B2
- t=10, we back up T1 in R1 => backup B3
- t=20, we back up T2 in R2 => backup B4
Following the logic in this method:
- newestBackupPerRootDir will contain: (R1: B3, R2: B4)
- boundaries will contain: (S1: 10, S2: 10, S3: 20)
So for T1, all WALs up to t=10 can be deleted. For T2, WALs are preserved from t=10 (on S2, whose boundary is the minimum of B3 and B4) or from t=20 (on S3), depending on whether tables from other backup roots are also present on the server.
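
To make the walkthrough concrete, here is a minimal standalone sketch of the boundary computation as I understand it. The `Backup` record and `BoundaryExample` class are hypothetical stand-ins (the real code works with `BackupInfo` and `Address`); the point is only the aggregation rule: for each server, the preservation boundary is the minimum, across all backup roots touching that server, of the root's newest backup timestamp.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BoundaryExample {
  // Hypothetical stand-in for BackupInfo: the newest backup of one root,
  // with its timestamp and the servers its tables have regions on.
  record Backup(String root, long ts, List<String> servers) {}

  // For each server, take the minimum newest-backup timestamp over all
  // roots that involve it: WALs at or before this boundary are covered
  // by every root and can be deleted.
  static Map<String, Long> serverToPreservationBoundaryTs(List<Backup> newestPerRoot) {
    Map<String, Long> boundaries = new HashMap<>();
    for (Backup b : newestPerRoot) {
      for (String server : b.servers()) {
        boundaries.merge(server, b.ts(), Math::min);
      }
    }
    return boundaries;
  }

  public static void main(String[] args) {
    // Example from the discussion: the newest backups per root are
    // B3 (R1, t=10, T1 on S1+S2) and B4 (R2, t=20, T2 on S2+S3).
    List<Backup> newest = List.of(
      new Backup("R1", 10L, List.of("S1", "S2")),
      new Backup("R2", 20L, List.of("S2", "S3")));
    Map<String, Long> boundaries = serverToPreservationBoundaryTs(newest);
    System.out.println(boundaries); // {S1=10, S2=10, S3=20}
  }
}
```

Running this reproduces the boundaries from the example above: (S1: 10, S2: 10, S3: 20), with S2 taking the minimum of the two roots' timestamps.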
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]