DieterDP-ng commented on code in PR #6040:
URL: https://github.com/apache/hbase/pull/6040#discussion_r1763233238


##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/master/BackupLogCleaner.java:
##########
@@ -81,39 +81,55 @@ public void init(Map<String, Object> params) {
     }
   }
 
-  private Map<Address, Long> getServerToNewestBackupTs(List<BackupInfo> 
backups)
+  /**
+   * Calculates the timestamp boundary up to which all backup roots have 
already included the WAL.
+   * I.e. WALs with a lower (= older) or equal timestamp are no longer needed 
for future incremental
+   * backups.
+   */
+  private Map<Address, Long> serverToPreservationBoundaryTs(List<BackupInfo> 
backups)
     throws IOException {
     if (LOG.isDebugEnabled()) {
       LOG.debug(
-        "Cleaning WALs if they are older than the newest backups. "
+        "Cleaning WALs if they are older than the newest backups (for all 
roots). "
           + "Checking WALs against {} backups: {}",
         backups.size(),
         
backups.stream().map(BackupInfo::getBackupId).sorted().collect(Collectors.joining(",
 ")));
     }
-    Map<Address, Long> serverAddressToNewestBackupMap = new HashMap<>();
 
-    Map<TableName, Long> tableNameBackupInfoMap = new HashMap<>();
-    for (BackupInfo backupInfo : backups) {
+    // This map tracks, for every backup root, the most recent created backup 
(= highest timestamp)
+    Map<String, BackupInfo> newestBackupPerRootDir = new HashMap<>();

Review Comment:
   A backup root corresponds to the path you specify when doing `hbase backup 
create <type> <backup_path>`. Within a single root, you'll have a series of 
backups. Having multiple roots allows you to manage backups independently (eg: 
if you delete one root, backups in other roots are unaffected). Incremental 
backups are always built on top of previous backups from the same root.
   
   Eg, (F = full backup, I = incremental backup):
   ```
   Time --------------------------------------------->
   Root1: F    I     I     I     F    I    I    I
   Root2: F          F  I    I   I              F
   ```
   
   So to answer your questions:
   - There are multiple `BackupInfo`s per root: 1 per backup created in that 
root. A root is a `Path`.
   - In this piece of logic, we only need the most recent (= newest) backup, so 
`Map<String, BackupInfo>`, where the root is represented as a `String`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to