DieterDP-ng commented on code in PR #6040:
URL: https://github.com/apache/hbase/pull/6040#discussion_r1763233238
##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/master/BackupLogCleaner.java:
##########
@@ -81,39 +81,55 @@ public void init(Map<String, Object> params) {
}
}
- private Map<Address, Long> getServerToNewestBackupTs(List<BackupInfo>
backups)
+ /**
+ * Calculates the timestamp boundary up to which all backup roots have
already included the WAL.
+ * I.e. WALs with a lower (= older) or equal timestamp are no longer needed
for future incremental
+ * backups.
+ */
+ private Map<Address, Long> serverToPreservationBoundaryTs(List<BackupInfo>
backups)
throws IOException {
if (LOG.isDebugEnabled()) {
LOG.debug(
- "Cleaning WALs if they are older than the newest backups. "
+ "Cleaning WALs if they are older than the newest backups (for all
roots). "
+ "Checking WALs against {} backups: {}",
backups.size(),
backups.stream().map(BackupInfo::getBackupId).sorted().collect(Collectors.joining(",
")));
}
- Map<Address, Long> serverAddressToNewestBackupMap = new HashMap<>();
- Map<TableName, Long> tableNameBackupInfoMap = new HashMap<>();
- for (BackupInfo backupInfo : backups) {
+ // This map tracks, for every backup root, the most recent created backup
(= highest timestamp)
+ Map<String, BackupInfo> newestBackupPerRootDir = new HashMap<>();
Review Comment:
A backup root corresponds to the path you specify when doing `hbase backup
create <type> <backup_path>`. Within a single root, you'll have a series of
backups. Having multiple roots allows you to manage backups independently (eg:
if you delete one root, backups in other roots are unaffected). Incremental
backups are always built on top of previous backups from the same root.
Eg, (F = full backup, I = incremental backup):
```
Time --------------------------------------------->
Root1: F I I I F I I I
Root2: F F I I I F
```
So to answer your questions:
- There are multiple `BackupInfo`s per root: 1 per backup created in that
root. A root is a `Path`.
- In this piece of logic, we only need the most recent (= newest) backup, so
`Map<String, BackupInfo>`, where the root is represented as a `String`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]