deepankar created HBASE-16624:
---------------------------------

             Summary: MVCC DeSerialization bug in the upstream HFileScannerImpl
                 Key: HBASE-16624
                 URL: https://issues.apache.org/jira/browse/HBASE-16624
             Project: HBase
          Issue Type: Bug
          Components: HFile
    Affects Versions: 2.0.0
            Reporter: deepankar
            Assignee: deepankar
            Priority: Blocker


My colleague [~naggarwal] found a bug in the deserialization of the MVCC value
from an HFile. As part of an optimization of VLong deserialization, we read an
int at once, but we forgot to convert it to an unsigned value when widening it
to a long.
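
To illustrate the sign-extension problem, here is a minimal standalone sketch.
It is not the HFileScannerImpl code: the class name, the decode method and the
maskInt flag are hypothetical, and the payload is assumed to be the big-endian
value bytes of a VLong with the length prefix already consumed.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not HBase code: decodes a big-endian payload of up to
// 8 bytes, reading the first 4 bytes as one int (the "read an int at once"
// fast path) and the rest byte by byte.
final class VLongPayloadSketch {

    static long decode(byte[] payload, boolean maskInt) {
        ByteBuffer buf = ByteBuffer.wrap(payload); // big-endian by default
        long value = 0;
        int remaining = payload.length;
        if (remaining >= Integer.BYTES) {
            int i = buf.getInt();
            // Without the mask, int -> long widening sign-extends, so any
            // int with its top bit set becomes a negative long.
            value = maskInt ? (i & 0xFFFFFFFFL) : i;
            remaining -= Integer.BYTES;
        }
        while (remaining-- > 0) {
            value = (value << 8) | (buf.get() & 0xFF);
        }
        return value;
    }

    public static void main(String[] args) {
        // 3_000_000_000 (0xB2D05E00) is larger than Integer.MAX_VALUE.
        byte[] payload = {(byte) 0xB2, (byte) 0xD0, (byte) 0x5E, 0x00};
        System.out.println(decode(payload, false)); // -1294967296
        System.out.println(decode(payload, true));  // 3000000000
    }
}
```

Values at or below Integer.MAX_VALUE decode identically on both paths, which
would explain why the problem only shows up once sequenceId grows past that
threshold.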

This causes problems once sequenceId crosses the integer threshold
(Integer.MAX_VALUE). When a compaction then happens, we write MAX_MEMSTORE_TS
in the trailer as 0, because we read negative values from the file that was
flushed with sequenceId > Integer.MAX_VALUE. Once MAX_MEMSTORE_TS is 0 while
sequenceId values are still present alongside the KeyValues, the regionserver
fails to read the compacted file, and the result is corruption.
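
The way negative reads can end up as a MAX_MEMSTORE_TS of 0 can be sketched as
follows. This is an assumption about the mechanism, not the actual compaction
code: the class and method names are hypothetical, and the sketch assumes the
maximum is tracked as a running max that starts at 0.

```java
// Hypothetical sketch, not HBase code: a running maximum that starts at 0
// never moves if every sequenceId it sees was misread as negative.
final class MaxMemstoreTsSketch {

    static long maxSequenceId(long[] sequenceIds) {
        long max = 0; // assumed initial value
        for (long id : sequenceIds) {
            max = Math.max(max, id);
        }
        return max;
    }

    public static void main(String[] args) {
        // SequenceIds as written at flush time (> Integer.MAX_VALUE).
        long[] written = {3_000_000_000L, 3_000_000_001L};
        // The same values after a sign-extending int read.
        long[] misread = {-1_294_967_296L, -1_294_967_295L};

        System.out.println(maxSequenceId(written)); // 3000000001
        System.out.println(maxSequenceId(misread)); // 0
    }
}
```

If the reader treats a maximum of 0 as "no sequenceId stored with the cells",
which is assumed here rather than confirmed above, it would then misparse a
file that does contain them.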

Interestingly, this happens only on tables that don't have
DataBlockEncoding enabled; unfortunately, in our case those turned out to be
META and another small table.

The fix is small (~20 chars) and is attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
