[ https://issues.apache.org/jira/browse/HBASE-29890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sid Khillon updated HBASE-29890:
--------------------------------
Description:
When the WAL tailing reader hits EOF mid-cell while WAL compression is enabled,
it currently returns EOF_AND_RESET_COMPRESSION, which forces the reader to
re-read the entire WAL file from the beginning to rebuild dictionary state.
This is an O(n) operation that becomes increasingly expensive as the WAL grows.
The root cause is that the CompressedKvDecoder eagerly adds entries to the
compression dictionaries (ROW, FAMILY, QUALIFIER, and tag dictionaries) as it
reads each field of a cell. If an IOException occurs partway through reading a
cell, the dictionaries are left in a partially-updated state that no longer
matches the actual stream position. The reader has no choice but to throw away
the entire compression context and start over.
The proposed fix is to defer dictionary additions until a cell is fully parsed:
- Buffer ROW/FAMILY/QUALIFIER dictionary additions in CompressedKvDecoder and
only commit them after parseCellInner() succeeds. On IOException, discard the
pending additions.
- Add a similar deferred-addition mode to TagCompressionContext for tag
dictionaries.
- Reset the ValueCompressor if an IOException occurs during the value
decompression phase.
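
To make the first two bullets concrete, here is a minimal sketch of the
stage-then-commit idea. It is not the HBase code: the class
DeferredDictionarySketch, its nested Dictionary, and the toy wire format (a
boolean flag followed by either a UTF string or an int dictionary index) are
all assumptions made for illustration. The real dictionaries are LRU-based and
the real decoder reads byte arrays, so this only shows the shape of the change.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: hypothetical names, not the HBase classes. */
final class DeferredDictionarySketch {

  /** Append-only dictionary with a staging area for the cell being parsed. */
  static final class Dictionary {
    private final List<String> committed = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();

    /** Stage an entry; it becomes permanent only on commit(). */
    void addDeferred(String entry) {
      pending.add(entry);
    }

    /** Lookups see committed entries first, then entries staged by the current cell. */
    String get(int index) {
      return index < committed.size()
          ? committed.get(index)
          : pending.get(index - committed.size());
    }

    void commit() {
      committed.addAll(pending);
      pending.clear();
    }

    void rollback() {
      pending.clear();
    }

    int committedSize() {
      return committed.size();
    }
  }

  /**
   * Reads row, family and qualifier in the toy format. Dictionary additions are
   * committed only after all three fields are read; on IOException (including
   * EOFException from a truncated stream) they are discarded.
   */
  static String[] parseCell(DataInputStream in, Dictionary dict) throws IOException {
    try {
      String[] fields = new String[3]; // row, family, qualifier
      for (int i = 0; i < fields.length; i++) {
        if (in.readBoolean()) {          // literal value: stage it for the dictionary
          fields[i] = in.readUTF();
          dict.addDeferred(fields[i]);
        } else {                         // back-reference to an existing entry
          fields[i] = dict.get(in.readInt());
        }
      }
      dict.commit();
      return fields;
    } catch (IOException e) {
      dict.rollback();
      throw e;
    }
  }
}
```

In this sketch a truncated cell leaves committedSize() unchanged, so a later
retry from the start of that cell sees the same dictionary indexes the writer
used.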
With deferred additions, hitting EOF mid-cell leaves the dictionaries in the
state they were in after the last fully-read cell. This means the reader can
return EOF_AND_RESET (a cheap seek to the saved position) instead of
EOF_AND_RESET_COMPRESSION, and resume reading from where it left off once the
file grows.
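
The difference between the two outcomes can be sketched as follows. This is a
toy model under stated assumptions, not the WALTailingReader implementation:
the class TailingResumeSketch, the length-prefixed record format, the
dictionariesIntact flag, and the bytesToReprocess helper are hypothetical, and
the enum is a local stand-in that borrows the state names used above (NORMAL
is an assumed name for the success case).

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Illustrative only: hypothetical names, not the HBase tailing reader. */
final class TailingResumeSketch {

  /** Local stand-in enum; names borrowed from the description above. */
  enum State { NORMAL, EOF_AND_RESET, EOF_AND_RESET_COMPRESSION }

  /** Offset just past the last fully read record. */
  private long savedPosition = 0;

  long savedPosition() {
    return savedPosition;
  }

  /**
   * Tries to read one record (int length, then payload) from a snapshot of the
   * growing file, starting at the saved position. On a short read the saved
   * position is left untouched so a later call can retry from the same offset.
   */
  State next(byte[] fileSnapshot, boolean dictionariesIntact) {
    int start = (int) savedPosition;
    DataInputStream in = new DataInputStream(
        new ByteArrayInputStream(fileSnapshot, start, fileSnapshot.length - start));
    try {
      int length = in.readInt();
      byte[] payload = new byte[length];
      in.readFully(payload);
      savedPosition = start + 4L + length;
      return State.NORMAL;
    } catch (EOFException e) {
      // With intact dictionaries a seek back to savedPosition is enough.
      return dictionariesIntact ? State.EOF_AND_RESET : State.EOF_AND_RESET_COMPRESSION;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Bytes that must be read again before resuming, in this toy model. */
  static long bytesToReprocess(State state, long savedPosition) {
    // Rebuilding dictionaries means re-reading everything before savedPosition.
    return state == State.EOF_AND_RESET_COMPRESSION ? savedPosition : 0L;
  }
}
```

In this model the cost of EOF_AND_RESET stays constant, while the cost of
EOF_AND_RESET_COMPRESSION grows with the saved position, which is the O(n)
behavior described at the top.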
was:
When the WAL tailing reader hits EOF mid-cell during WAL compression, it
currently returns EOF_AND_RESET_COMPRESSION, which forces the reader to re-read
the entire WAL file from the beginning to rebuild dictionary state. This is an
O (n) operation that becomes increasingly expensive as the WAL grows.
The root cause is that the CompressedKvDecoder eagerly adds entries to the
compression dictionaries (ROW, FAMILY, QUALIFIER, and tag dictionaries) as it
reads each field of a cell. If an IOException occurs partway through reading a
cell, the dictionaries are left in a partially-updated state that no longer
matches the actual stream position. The reader has no choice but to throw away
the entire compression context and start over.
Proposed Fix is to defer dictionary additions until a cell is fully parsed:
- Buffer ROW/FAMILY/QUALIFIER dictionary additions in CompressedKvDecoder and
only commit them after parseCellInner() succeeds. On IOException, discard the
pending additions.
- Add a similar deferred-addition mode to TagCompressionContext for tag
dictionaries.
- Reset the ValueCompressor if an IOException occurs during the value
decompression phase.
With deferred additions, hitting EOF mid-cell leaves the dictionaries in the
state they were after the last fully-read cell. This means the reader can
return EOF_AND_RESET (a cheap seek to the saved position) instead of
EOF_AND_RESET_COMPRESSION, and resume reading from where it left off once the
file grows.
> WAL tailing reader should resume partial cell reads instead of resetting
> compression
> ------------------------------------------------------------------------------------
>
> Key: HBASE-29890
> URL: https://issues.apache.org/jira/browse/HBASE-29890
> Project: HBase
> Issue Type: Improvement
> Components: Replication, wal
> Reporter: Sid Khillon
> Assignee: Sid Khillon
> Priority: Minor
>
> When the WAL tailing reader hits EOF mid-cell while WAL compression is
> enabled, it currently returns EOF_AND_RESET_COMPRESSION, which forces the
> reader to re-read the entire WAL file from the beginning to rebuild
> dictionary state. This is an O(n) operation that becomes increasingly
> expensive as the WAL grows.
> The root cause is that the CompressedKvDecoder eagerly adds entries to the
> compression dictionaries (ROW, FAMILY, QUALIFIER, and tag dictionaries) as it
> reads each field of a cell. If an IOException occurs partway through reading
> a cell, the dictionaries are left in a partially-updated state that no longer
> matches the actual stream position. The reader has no choice but to throw
> away the entire compression context and start over.
> The proposed fix is to defer dictionary additions until a cell is fully parsed:
> - Buffer ROW/FAMILY/QUALIFIER dictionary additions in CompressedKvDecoder
> and only commit them after parseCellInner() succeeds. On IOException, discard
> the pending additions.
> - Add a similar deferred-addition mode to TagCompressionContext for tag
> dictionaries.
> - Reset the ValueCompressor if an IOException occurs during the value
> decompression phase.
> With deferred additions, hitting EOF mid-cell leaves the dictionaries in the
> state they were in after the last fully-read cell. This means the reader can
> return EOF_AND_RESET (a cheap seek to the saved position) instead of
> EOF_AND_RESET_COMPRESSION, and resume reading from where it left off once the
> file grows.