DieterDP-ng commented on code in PR #7582: URL: https://github.com/apache/hbase/pull/7582#discussion_r2677182275
########## hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalBackupManager.java: ########## Review Comment: > I'm not sure I necessarily agree with this logic. B/c the RS not being rolled this go around doesn't mean we've backed up all the files from the RS we need to backup. It just means the host doesn't exist on the cluster at the moment. > > 1. Server A is backed up > > 2. Server A generate more WAL files > > 3. Server A is removed from cluster > > 4. New backup occurs, but we don't get a roll time for Server A so we ignore its files > > > We need to backup the files that were generated between the last backup, and this backup Not sure we're on the same page here. (Did you miss that I mentioned that `getLogFilesForNewBackup` needs some adjustments?) For the example you gave (quoted above), with the code block I suggested, the following would happen: - when the logs are rolled, `rollTimestamps` would not have been updated, since the RS A was offline - so the `rollTimestamps` value for A would be `< rollStartTs`, and A would not be present as a key in `newTimestamps` - in the updated `getLogFilesForNewBackup`, we'd fall in case `server found in only previous: a region server that has gone offline, all logs will be older than rollStartTs and should be included`, so all logs of A would be included. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
