Kodey Converse created HBASE-29716:
--------------------------------------

             Summary: Incremental backup does not properly preserve sequence IDs
                 Key: HBASE-29716
                 URL: https://issues.apache.org/jira/browse/HBASE-29716
             Project: HBase
          Issue Type: Bug
          Components: backup&restore
    Affects Versions: 2.5.13, 3.0.0, 2.6.5
            Reporter: Kodey Converse


When an incremental backup is taken, WAL files are rewritten as HFiles using 
WALPlayer. These HFiles carry metadata that marks them as bulk loaded, so the 
sequence IDs for cells (which are required for correctness) are ignored by the 
RegionScanner.

This is a follow-up to HBASE-27649; that fix plumbed sequence IDs from the WAL 
to the HFiles generated by WALPlayer. However, the HFiles generated by 
WALPlayer are marked as bulk loaded [by metadata on the 
HFile|https://github.com/apache/hbase/blob/b8d803c0f1156219cc965e4c749e7ab7c9a65f31/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java#L461],
and RegionScanner [will reset cell-level sequence 
IDs|https://github.com/apache/hbase/blob/b8d803c0f1156219cc965e4c749e7ab7c9a65f31/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStoreFile.java#L427-L450]
for HFiles with this metadata, relying instead on the sequence ID generated at 
bulk load time. That bulk load never happens for these HFiles, because they are 
intended for incremental backups.
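
To make the mechanism concrete, here is a toy model of the reset, not HBase code. ToyCell, read, bulkLoadFlag and fileSeqId are hypothetical names invented for illustration, and the file-level sequence ID of 0 is an assumed placeholder for a file that was never actually bulk loaded.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes are hypothetical and are not HBase APIs.
class SeqIdResetSketch {

    // One version of a cell, with the sequence ID it carried in the WAL.
    static final class ToyCell {
        final String value;
        final long seqId;

        ToyCell(String value, long seqId) {
            this.value = value;
            this.seqId = seqId;
        }
    }

    // Models a reader: when the file is flagged as bulk loaded, every cell's
    // sequence ID is replaced by the single file-level sequence ID.
    static List<ToyCell> read(List<ToyCell> cells, boolean bulkLoadFlag, long fileSeqId) {
        List<ToyCell> out = new ArrayList<>();
        for (ToyCell c : cells) {
            out.add(new ToyCell(c.value, bulkLoadFlag ? fileSeqId : c.seqId));
        }
        return out;
    }

    public static void main(String[] args) {
        // Two writes to the same cell, ordered only by their WAL sequence IDs.
        List<ToyCell> written = List.of(new ToyCell("old", 10L), new ToyCell("new", 11L));

        // Assumption: 0 stands in for the file-level ID of a never-bulk-loaded file.
        for (ToyCell c : read(written, true, 0L)) {
            System.out.println(c.value + " seqId=" + c.seqId);
        }
    }
}
```

In this sketch both versions come back with the same sequence ID once the flag is set, so the ordering that the WAL recorded is no longer visible to the reader.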

The result is that reads of cells that have been overwritten (and therefore 
rely on sequence IDs for correctness) can return an incorrect value, whether 
the read goes through HBase or through tooling such as the 
ClientSideRegionScanner. Instead, I believe the cell value that is returned 
will be decided based on [sorting the HFiles by their 
size|https://github.com/apache/hbase/blob/b8d803c0f1156219cc965e4c749e7ab7c9a65f31/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileComparators.java#L36-L39].
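
The effect on reads can be sketched with another toy model, again not HBase code. Version, resolve and fileOrder are hypothetical names; fileOrder is an assumed stand-in for whatever file-level ordering applies once sequence IDs tie, and it does not reproduce the actual comparator linked above.

```java
import java.util.Comparator;
import java.util.List;

// Toy model only: these classes are hypothetical and are not HBase APIs.
class NewestVersionSketch {

    // One version of a cell as the reader sees it, plus the position of the
    // file it came from in some file-level ordering (an assumption).
    static final class Version {
        final String value;
        final long seqId;
        final int fileOrder;

        Version(String value, long seqId, int fileOrder) {
            this.value = value;
            this.seqId = seqId;
            this.fileOrder = fileOrder;
        }
    }

    // Highest sequence ID wins; on a tie, the version from the file that
    // sorts first wins.
    static String resolve(List<Version> versions) {
        Comparator<Version> newestFirst = Comparator
            .comparingLong((Version v) -> v.seqId)
            .reversed()
            .thenComparingInt(v -> v.fileOrder);
        return versions.stream()
            .min(newestFirst)
            .map(v -> v.value)
            .orElseThrow(IllegalArgumentException::new);
    }

    public static void main(String[] args) {
        // Sequence IDs intact: the later write is returned.
        System.out.println(resolve(List.of(
            new Version("old", 10L, 0), new Version("new", 11L, 1))));

        // Sequence IDs reset to the same value: file order alone decides.
        System.out.println(resolve(List.of(
            new Version("old", 0L, 0), new Version("new", 0L, 1))));
    }
}
```

With the sequence IDs intact the sketch returns the later write. Once they tie, the answer depends only on how the files happen to sort, which is the failure mode described above.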



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
