Re: [PR] HBASE-29776: Log filtering in IncrementalBackupManager can lead to data loss [hbase]

via GitHub Tue, 20 Jan 2026 08:49:39 -0800


DieterDP-ng commented on code in PR #7582:
URL: https://github.com/apache/hbase/pull/7582#discussion_r2709210795



##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalBackupManager.java:
##########


Review Comment:
   So the scenario would be that RS host X starts a shutdown just before 
receiving the log roll command, so there's no updated entry for that host in 
`readRegionServerLastLogRollResult`. Then at least one RS Y completes its roll 
so quickly, while having some clock skew, that it appears X went offline after 
the log roll of Y, causing X to not be included.
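
   To make the skew argument concrete, here is a minimal sketch. It assumes a 
hypothetical timestamp-based rule (not the actual `IncrementalBackupManager` 
code): a WAL from a host without a new roll entry is only included if its 
timestamp is older than the earliest newly recorded roll. The class name, 
method name, and all timestamps are made up for illustration.

```java
import java.util.Map;

public class ClockSkewSketch {

    /**
     * Hypothetical rule, NOT the real HBase logic: a WAL from a host that has
     * no new roll entry is included only if its timestamp is strictly older
     * than the earliest roll timestamp recorded by the other hosts.
     */
    public static boolean includedByTimestampRule(
            long walTimestamp, Map<String, Long> newRollTimestamps) {
        long earliestRoll = newRollTimestamps.values().stream()
                .mapToLong(Long::longValue)
                .min()
                .orElse(Long.MAX_VALUE); // no rolls recorded: nothing excludes the WAL
        return walTimestamp < earliestRoll;
    }

    public static void main(String[] args) {
        long walOnX = 1000L;      // X's last WAL, stamped by X's (accurate) clock
        long trueRollOnY = 1005L; // Y really rolled after X's WAL was written
        long skewOnY = -10L;      // assumed: Y's clock runs 10 units behind

        // Without skew, X's WAL predates Y's roll and is included.
        boolean withoutSkew =
                includedByTimestampRule(walOnX, Map.of("Y", trueRollOnY));

        // With skew, Y records 995, so X's WAL looks newer than the roll
        // and the rule drops it.
        boolean withSkew =
                includedByTimestampRule(walOnX, Map.of("Y", trueRollOnY + skewOnY));

        System.out.println("included without skew: " + withoutSkew);
        System.out.println("included with skew:    " + withSkew);
    }
}
```

   Under these assumptions, the rule only breaks because timestamps from two 
different hosts are compared, which is the comparison that looking at the WAL 
files directly avoids.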
   
   It's a bit of a far-fetched scenario, but given the complexity of having 2 
possible logroll procedures, I can't easily say whether it's possible or not. 
But I admit that looking at the WAL files instead avoids this complexity of 
reasoning. So, I guess I'm convinced that the suggested WAL-file approach is 
fine. I would appreciate it if the code were condensed and structured a bit 
more, though.
   
   Just out of curiosity: did you actually have evidence of clock skew causing 
issues? I thought clock-sync methods were sufficient to avoid such issues.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
