[PR] HBASE-29716 Include sequence ID metadata for HFiles generated by incremental backups [hbase]

via GitHub Mon, 24 Nov 2025 15:56:55 -0800


krconv opened a new pull request, #7480:
URL: https://github.com/apache/hbase/pull/7480


   HFiles generated by incremental backups cannot be read correctly by 
tooling such as the ClientSideRegionScanner, because they do not include the 
MAX_SEQ_ID metadata. Without it, the scanner ignores cell-level sequence IDs 
and orders the HFiles arbitrarily, which produces incorrect results when a 
scan covers overwrites of a cell that share the same timestamp.
   
   This PR adds a new option to HFileOutputFormat2 that calculates and sets 
the required metadata. In practice this only affects the 
ClientSideRegionScanner, because the sequence ID is recalculated anyway when 
the files are bulk-loaded.
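
   To illustrate why the file-level metadata matters, here is a minimal, 
self-contained sketch in plain Java. It is a simplified model, not HBase 
code: the `SeqIdSketch`, `Cell`, `StoreFile`, `maxSeqId`, and `resolve` names 
are hypothetical and exist only for this example. It assumes that a reader 
breaks ties between cells with the same row and timestamp by preferring the 
file with the highest max sequence ID, and that files lacking the metadata 
are left in whatever order they were listed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified model; none of these types are HBase classes.
class SeqIdSketch {

    /** A cell reduced to the fields relevant to this example. */
    static final class Cell {
        final String row;
        final long timestamp;
        final long seqId;
        final String value;

        Cell(String row, long timestamp, long seqId, String value) {
            this.row = row;
            this.timestamp = timestamp;
            this.seqId = seqId;
            this.value = value;
        }
    }

    /** A file: its cells plus optional file-level metadata (null = missing). */
    static final class StoreFile {
        final List<Cell> cells;
        final Long maxSeqId;

        StoreFile(List<Cell> cells, Long maxSeqId) {
            this.cells = cells;
            this.maxSeqId = maxSeqId;
        }
    }

    /** Computes the value a writer could record as file-level metadata. */
    static long maxSeqId(List<Cell> cells) {
        long max = -1L;
        for (Cell c : cells) {
            max = Math.max(max, c.seqId);
        }
        return max;
    }

    /**
     * Returns the value visible for (row, timestamp). Files are ordered by
     * max sequence ID, newest first. Files without metadata all compare as
     * equal, and the sort is stable, so their relative order is just the
     * order in which they were passed in.
     */
    static String resolve(List<StoreFile> files, String row, long timestamp) {
        List<StoreFile> ordered = new ArrayList<>(files);
        ordered.sort(Comparator.comparingLong(
            (StoreFile f) -> f.maxSeqId == null ? -1L : f.maxSeqId).reversed());
        for (StoreFile f : ordered) {
            for (Cell c : f.cells) {
                if (c.row.equals(row) && c.timestamp == timestamp) {
                    return c.value;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Two writes to the same row with the same timestamp.
        List<Cell> older = Collections.singletonList(new Cell("r1", 100L, 5L, "old"));
        List<Cell> newer = Collections.singletonList(new Cell("r1", 100L, 9L, "new"));

        // With metadata, the result does not depend on listing order.
        StoreFile a = new StoreFile(older, maxSeqId(older));
        StoreFile b = new StoreFile(newer, maxSeqId(newer));
        System.out.println(resolve(Arrays.asList(a, b), "r1", 100L));
        System.out.println(resolve(Arrays.asList(b, a), "r1", 100L));

        // Without metadata, the result follows the listing order.
        StoreFile a0 = new StoreFile(older, null);
        StoreFile b0 = new StoreFile(newer, null);
        System.out.println(resolve(Arrays.asList(a0, b0), "r1", 100L));
        System.out.println(resolve(Arrays.asList(b0, a0), "r1", 100L));
    }
}
```

   In this model the overwrite is only resolved deterministically once each 
file carries its max sequence ID, which is the gap the new 
HFileOutputFormat2 option is meant to close for incremental backup output.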
   
   Part of https://issues.apache.org/jira/browse/HBASE-29716


