rmdmattingly commented on code in PR #6040:
URL: https://github.com/apache/hbase/pull/6040#discussion_r1768663357
##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/master/BackupLogCleaner.java:
##########
@@ -81,39 +81,55 @@ public void init(Map<String, Object> params) {
}
}
-  private Map<Address, Long> getServerToNewestBackupTs(List<BackupInfo> backups)
+  /**
+   * Calculates the timestamp boundary up to which all backup roots have already included the WAL.
+   * I.e. WALs with a lower (= older) or equal timestamp are no longer needed for future incremental
+   * backups.
+   */
+  private Map<Address, Long> serverToPreservationBoundaryTs(List<BackupInfo> backups)
Review Comment:
Hmm, okay, so reframing your example in a single root. Imagine:
- 3 servers: S1, S2, S3
- table T1 has regions on S1 and S2; table T2 has regions on S2 and S3
- we have 1 backup root R1 that backs up T1 and T2

At time:
- t=0, we back up T1 in R1 => backup B1
- t=0, we back up T2 in R1 => backup B2
- t=10, we back up T1 in R1 => backup B3
- t=20, we back up T2 in R1 => backup B4

Following the logic in this method, `newestBackupPerRootDir` will contain (R1: B4).

We would calculate the preservation boundaries here:
```java
// This map tracks, for every address, the least recent (= oldest / lowest timestamp) inclusion
// in any backup. In other words, it is the timestamp boundary up to which all backup roots
// have included the WAL in their backup.
Map<Address, Long> boundaries = new HashMap<>();
for (BackupInfo backupInfo : newestBackupPerRootDir.values()) {
  for (TableName table : backupInfo.getTables()) {
    for (Map.Entry<String, Long> entry : backupInfo.getTableSetTimestampMap().get(table)
      .entrySet()) {
      Address address = Address.fromString(entry.getKey());
      Long storedTs = boundaries.get(address);
      if (storedTs == null || entry.getValue() < storedTs) {
        boundaries.put(address, entry.getValue());
      }
    }
  }
}
```
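To make that concrete, here is a minimal, standalone sketch of the same reduction applied to this example. Plain `String` keys and `HashMap`s stand in for `Address` and `BackupInfo`, and the class name and literal values are illustrative assumptions, not the real API:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the boundary calculation above, with plain strings/maps
// standing in for Address and BackupInfo (illustrative only, not the real API).
public class PreservationBoundarySketch {
  public static void main(String[] args) {
    // newestBackupPerRootDir retains only B4 for root R1, so only B4's
    // per-table, per-server log-roll timestamps (all at t=20) are visited.
    // B3's t=10 entries for T1 on S1/S2 never enter the loop.
    Map<String, Map<String, Long>> newestBackupTableSetTimestamps = new HashMap<>();
    Map<String, Long> b4Timestamps = new HashMap<>();
    b4Timestamps.put("S1", 20L);
    b4Timestamps.put("S2", 20L);
    b4Timestamps.put("S3", 20L);
    newestBackupTableSetTimestamps.put("T2", b4Timestamps);

    // The same "minimum timestamp per server address" reduction as above.
    Map<String, Long> boundaries = new HashMap<>();
    for (Map<String, Long> perServer : newestBackupTableSetTimestamps.values()) {
      for (Map.Entry<String, Long> entry : perServer.entrySet()) {
        Long storedTs = boundaries.get(entry.getKey());
        if (storedTs == null || entry.getValue() < storedTs) {
          boundaries.put(entry.getKey(), entry.getValue());
        }
      }
    }
    System.out.println(boundaries); // prints S1=20, S2=20, S3=20 (in some order)
  }
}
```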
Since the newest backup in R1 is B4, `serverToPreservationBoundaryTs` will contain (S1: 20, S2: 20, S3: 20).

In this situation, we must not delete WALs from S1 or S2 between times 10 and 20, because a subsequent incremental backup of T1 will require those WALs.

In the BackupLogCleaner, though, we will end up calling `canDeleteFile` and hitting this code with that `serverToPreservationBoundaryTs` (S1: 20, S2: 20, S3: 20) renamed as `addressToBoundaryTs`:
```java
if (!addressToBoundaryTs.containsKey(walServerAddress)) { // for S1 & S2, this will be false
  // irrelevant ...
}

Long backupBoundary = addressToBoundaryTs.get(walServerAddress); // for S1 & S2, this will be 20

if (backupBoundary >= walTimestamp) { // for WALs between times 10 and 20, this will be true
  if (LOG.isDebugEnabled()) {
    LOG.debug(
      "Backup found for server: {}. All backups from {} are newer than file, so deleting: {}",
      walServerAddress.getHostName(), backupBoundary, path);
  }
  // so we will erroneously delete WALs < 20, despite the next inc backup of T1 requiring them
  return true;
}
```
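Again as a standalone sketch (the WAL timestamp of 15 is a hypothetical value between B3 at t=10 and B4 at t=20, and plain strings stand in for `Address`), the decision plays out like this:

```java
import java.util.Map;

// Minimal sketch of the deletion decision above, using the example's boundaries.
public class WalDeletionSketch {
  public static void main(String[] args) {
    Map<String, Long> addressToBoundaryTs = Map.of("S1", 20L, "S2", 20L, "S3", 20L);

    String walServerAddress = "S1";
    long walTimestamp = 15L; // written after B3 (t=10), still needed by the next backup of T1

    Long backupBoundary = addressToBoundaryTs.get(walServerAddress); // 20
    boolean delete = backupBoundary != null && backupBoundary >= walTimestamp;
    System.out.println("delete = " + delete); // prints "delete = true"
    // i.e. the cleaner would remove a WAL that the next incremental backup of T1 still needs
  }
}
```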
I guess this depends on how ancestry is defined? If we consider all backups in a root to be ancestors regardless of their table set, then maybe it is okay to delete these WALs. But if not, then I don't see how it is okay.