vinayakphegde commented on code in PR #6847:
URL: https://github.com/apache/hbase/pull/6847#discussion_r2014206928
##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupCommands.java:
##########
@@ -853,6 +869,188 @@ protected void printUsage() {
}
}
+ /**
+ * The {@code CleanupCommand} class is responsible for removing Write-Ahead Log (WAL) and
+ * bulk-loaded files that are no longer needed for Point-in-Time Recovery (PITR).
+ * <p>
+ * The cleanup process follows these steps:
+ * <ol>
+ * <li>Identify the oldest full backup and its start timestamp.</li>
+ * <li>Delete WAL files older than this timestamp, as they are no longer usable for PITR with any
+ * backup.</li>
+ * </ol>
Review Comment:
Thanks, @rmdmattingly, for the review comments!
To clarify, this command cleans up only the WALs in the backup location
(e.g., S3, where they are continuously replicated), not the cluster’s own
WALs. If we were dealing with cluster WALs, your point would certainly
apply. Does that sound correct?
Regarding keeping this command manual:
- All other backup-related commands are currently manual.
- This command depends on the delete command. It identifies the oldest full
backup still in the system and deletes all WALs older than that backup's
start timestamp.
- WALs are needed only for point-in-time recovery (PITR), where we replay
WALs from a full backup up to the specified recovery point. If the full
backup that precedes a WAL no longer exists, keeping that WAL serves no
purpose.
- Since deleting full backups is already a manual operation, there is little
benefit in automating this cleanup.
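
To make the selection rule concrete, here is a minimal, self-contained sketch of the logic described above. It is not the code in this PR: `Backup`, `WalFile`, and `walsToDelete` are hypothetical stand-ins for illustration, not HBase types, and returning nothing when no full backup exists is a conservative assumption on my part.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

final class WalCleanupSketch {
  /** Hypothetical backup record (not an HBase class). */
  static final class Backup {
    final String id;
    final boolean full;
    final long startTs;

    Backup(String id, boolean full, long startTs) {
      this.id = id;
      this.full = full;
      this.startTs = startTs;
    }
  }

  /** Hypothetical WAL file in the backup location (not an HBase class). */
  static final class WalFile {
    final String path;
    final long ts;

    WalFile(String path, long ts) {
      this.path = path;
      this.ts = ts;
    }
  }

  /** Step 1: start timestamp of the oldest full backup, if any exists. */
  static OptionalLong oldestFullBackupStart(List<Backup> backups) {
    return backups.stream().filter(b -> b.full).mapToLong(b -> b.startTs).min();
  }

  /** Step 2: WALs strictly older than that timestamp are cleanup candidates. */
  static List<WalFile> walsToDelete(List<Backup> backups, List<WalFile> wals) {
    List<WalFile> result = new ArrayList<>();
    OptionalLong cutoff = oldestFullBackupStart(backups);
    if (!cutoff.isPresent()) {
      // Assumption: with no full backup, do nothing rather than delete everything.
      return result;
    }
    for (WalFile wal : wals) {
      if (wal.ts < cutoff.getAsLong()) {
        result.add(wal);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Backup> backups = Arrays.asList(
      new Backup("b1", true, 100L),
      new Backup("b2", false, 150L),
      new Backup("b3", true, 200L));
    List<WalFile> wals = Arrays.asList(
      new WalFile("wal-50", 50L),
      new WalFile("wal-100", 100L),
      new WalFile("wal-120", 120L));
    for (WalFile wal : walsToDelete(backups, wals)) {
      System.out.println("would delete " + wal.path);
    }
  }
}
```

Incremental backups are ignored when computing the cutoff, since PITR replay starts from a full backup.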
That said, we could explore the possibility of running this command
periodically and automatically in future iterations.
Let me know your thoughts!