[jira] [Updated] (HBASE-29890) WAL tailing reader should resume partial cell reads instead of resetting compression

Sid Khillon (Jira) Wed, 11 Feb 2026 11:32:10 -0800


     [ 
https://issues.apache.org/jira/browse/HBASE-29890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sid Khillon updated HBASE-29890:
--------------------------------
    Description: 
When WAL compression is enabled and the WAL tailing reader hits EOF mid-cell, it 
currently returns EOF_AND_RESET_COMPRESSION, which forces the reader to re-read 
the entire WAL file from the beginning to rebuild dictionary state. This is an 
O(n) operation in the size of the WAL, so it becomes increasingly expensive as 
the WAL grows.

The root cause is that the CompressedKvDecoder eagerly adds entries to the 
compression dictionaries (ROW, FAMILY, QUALIFIER, and tag dictionaries) as it 
reads each field of a cell. If an IOException occurs partway through reading a 
cell, the dictionaries are left in a partially-updated state that no longer 
matches the actual stream position. The reader has no choice but to throw away 
the entire compression context and start over.
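
To make the drift concrete, here is a toy model, not HBase code: the class name, the append-only list standing in for a dictionary, and the row values are all made up for illustration. It assumes an eager add is simply repeated when the reader seeks back and retries the same cell.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: an append-only list stands in for a compression dictionary. */
class EagerDictionaryDrift {
  /** Dictionary as the writer built it: one entry per distinct row. */
  static List<String> writerRows() {
    return new ArrayList<>(Arrays.asList("r1", "r2", "r3"));
  }

  /** Dictionary on a reader that adds eagerly and retries cell 2 after EOF. */
  static List<String> readerRowsAfterRetry() {
    List<String> rows = new ArrayList<>();
    rows.add("r1"); // cell 1: fully read
    rows.add("r2"); // cell 2, first attempt: row added, then EOF before the cell ends
    rows.add("r2"); // cell 2, retry from the saved position: row added a second time
    rows.add("r3"); // cell 3: fully read
    return rows;
  }
}
```

In this model the writer's index 2 refers to "r3" while the reader's index 2 refers to "r2", so any later dictionary reference decodes to the wrong row.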

The proposed fix is to defer dictionary additions until a cell is fully parsed:
  - Buffer ROW/FAMILY/QUALIFIER dictionary additions in CompressedKvDecoder and 
only commit them after parseCellInner() succeeds. On IOException, discard the 
pending additions.
  - Add a similar deferred-addition mode to TagCompressionContext for tag 
dictionaries.
  - Reset the ValueCompressor if an IOException occurs during the value 
decompression phase.
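
A minimal sketch of the stage-then-commit idea for the ROW/FAMILY/QUALIFIER dictionaries follows. It is not the actual patch: StagedDictionary, DeferredCellDecoder, the NEW_ENTRY/DICT_REF status bytes, and the wire format (status byte, then a UTF string or a 2-byte index) are assumptions made up for this example. Tag dictionaries and the ValueCompressor reset are not modeled.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical append-only dictionary with a staging area; not HBase's Dictionary API. */
class StagedDictionary {
  private final List<String> committed = new ArrayList<>();
  private final List<String> pending = new ArrayList<>();

  void stage(String entry) { pending.add(entry); }

  void commit() { committed.addAll(pending); pending.clear(); }

  void discard() { pending.clear(); }

  int committedSize() { return committed.size(); }

  /** Looks up committed entries first, then entries staged earlier in the same cell. */
  String get(int index) {
    return index < committed.size()
        ? committed.get(index)
        : pending.get(index - committed.size());
  }
}

/** Toy decoder: each field is a status byte, then either a UTF string or a 2-byte index. */
class DeferredCellDecoder {
  static final byte NEW_ENTRY = 0; // assumed marker: literal follows and joins the dictionary
  static final byte DICT_REF = 1;  // assumed marker: dictionary index follows

  final StagedDictionary rows = new StagedDictionary();
  final StagedDictionary families = new StagedDictionary();
  final StagedDictionary qualifiers = new StagedDictionary();

  /** Returns {row, family, qualifier}; dictionaries change only if the whole cell parses. */
  String[] parseCell(DataInputStream in) throws IOException {
    try {
      String row = readField(in, rows);
      String family = readField(in, families);
      String qualifier = readField(in, qualifiers);
      rows.commit();
      families.commit();
      qualifiers.commit();
      return new String[] {row, family, qualifier};
    } catch (IOException e) {
      rows.discard();
      families.discard();
      qualifiers.discard();
      throw e; // caller can seek back to the saved position and retry later
    }
  }

  private static String readField(DataInputStream in, StagedDictionary dict)
      throws IOException {
    byte status = in.readByte();
    if (status == NEW_ENTRY) {
      String value = in.readUTF();
      dict.stage(value); // staged only; visible to later cells after commit()
      return value;
    }
    if (status == DICT_REF) {
      return dict.get(in.readUnsignedShort());
    }
    throw new IOException("unknown status byte: " + status);
  }
}
```

If parseCell() throws partway through a cell, the three dictionaries are unchanged, so a retry from the start of that cell decodes against the same state the writer had at that point.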

With deferred additions, hitting EOF mid-cell leaves the dictionaries in the 
state they were in after the last fully read cell. This means the reader can 
return EOF_AND_RESET (a cheap seek to the saved position) instead of 
EOF_AND_RESET_COMPRESSION, and resume reading from where it left off once the 
file grows.
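
The cost difference between the two outcomes can be sketched as below. This is a toy model only: the class and method names are hypothetical, the enum merely mirrors the two state names used above, and it simplifies "re-read from the beginning" to "replay every byte before the saved position".

```java
/** Illustrative only: a toy model of the two EOF outcomes, not HBase's reader. */
class TailResumeSketch {
  enum State { EOF_AND_RESET, EOF_AND_RESET_COMPRESSION }

  /** EOF mid-cell is cheap only if the dictionaries still match the last complete cell. */
  static State onEofMidCell(boolean dictionariesIntact) {
    return dictionariesIntact ? State.EOF_AND_RESET : State.EOF_AND_RESET_COMPRESSION;
  }

  /** Bytes that must be decoded again before the reader is back at the saved position. */
  static long bytesToReplay(State state, long savedPosition) {
    return state == State.EOF_AND_RESET_COMPRESSION ? savedPosition : 0L;
  }
}
```

Under these assumptions, a reader at a saved position of 1 GiB replays 1 GiB on EOF_AND_RESET_COMPRESSION and nothing on EOF_AND_RESET.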


> WAL tailing reader should resume partial cell reads instead of resetting 
> compression
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-29890
>                 URL: https://issues.apache.org/jira/browse/HBASE-29890
>             Project: HBase
>          Issue Type: Improvement
>          Components: Replication, wal
>            Reporter: Sid Khillon
>            Assignee: Sid Khillon
>            Priority: Minor



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
