[ https://issues.apache.org/jira/browse/HBASE-10270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860939#comment-13860939 ]
Lars Hofhansl commented on HBASE-10270:
---------------------------------------
Interesting. We should probably do this in 0.94+ as well.
> Remove DataBlockEncoding from BlockCacheKey
> -------------------------------------------
>
> Key: HBASE-10270
> URL: https://issues.apache.org/jira/browse/HBASE-10270
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Affects Versions: 0.89-fb
> Reporter: Arjen Roodselaar
> Assignee: Arjen Roodselaar
> Priority: Minor
> Fix For: 0.89-fb
>
> Attachments: datablockencoding_blockcachekey.patch
>
>
> When a block is added to the BlockCache, its DataBlockEncoding is stored on
> the BlockCacheKey. This block encoding is used in the calculation of the
> hashCode and as such matters when cache lookups are done. Because the keys
> differ for encoded and unencoded (data) blocks, there is a potential for
> caching them twice or missing the cache. This happens, for example, when using
> Scan preloading, as AbstractScannerV2.readNextDataBlock() does a read without
> knowing the block type or the encoding.
>
> This patch removes the block encoding from the key and forces the caller of
> HFileReaderV2.readBlock() to specify the expected BlockType as well as the
> expected DataBlockEncoding when these matter. This allows for a decision on
> either of these at read time instead of cache time, puts responsibility where
> appropriate, fixes some cache misses when using the scan preloading (which
> does a read without knowing the type or encoding), allows for the
> BlockCacheKey to be re-used by the L2 BucketCache and sets us up for a future
> CompoundScannerV2 which can read both un-encoded and encoded data blocks.
>
> A gotcha here: ScannerV2 and EncodedScannerV2 expect BlockType.DATA and
> BlockType.ENCODED_DATA respectively and will throw when given a block of the
> wrong type. Previously, having the DataBlockEncoding on the cache key caused a
> cache miss if the block was cached with a different encoding, which implicitly
> fixed the BlockType and kept this from happening. It is now the scanner's
> responsibility to specify both the expected type and encoding (which is more
> appropriate).
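
To make the description above concrete, here is a minimal Java sketch of the two behaviours it contrasts. It does not use the HBase classes: the key classes, the two enums and the readBlock() helper are simplified, hypothetical stand-ins (they are not the real BlockCacheKey, DataBlockEncoding, BlockType or HFileReaderV2.readBlock()), and a plain HashMap stands in for the block cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Simplified stand-ins for illustration only; these are NOT the HBase classes.
final class BlockCacheKeySketch {
    enum Encoding { NONE, PREFIX }
    enum BlockType { DATA, ENCODED_DATA }

    /** Key whose equals() and hashCode() include the encoding. */
    static final class KeyWithEncoding {
        final String hfileName;
        final long offset;
        final Encoding encoding;

        KeyWithEncoding(String hfileName, long offset, Encoding encoding) {
            this.hfileName = hfileName;
            this.offset = offset;
            this.encoding = encoding;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof KeyWithEncoding)) return false;
            KeyWithEncoding k = (KeyWithEncoding) o;
            return offset == k.offset && hfileName.equals(k.hfileName)
                && encoding == k.encoding;
        }

        @Override public int hashCode() {
            return Objects.hash(hfileName, offset, encoding);
        }
    }

    /** Key built from file name and offset only. */
    static final class KeyWithoutEncoding {
        final String hfileName;
        final long offset;

        KeyWithoutEncoding(String hfileName, long offset) {
            this.hfileName = hfileName;
            this.offset = offset;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof KeyWithoutEncoding)) return false;
            KeyWithoutEncoding k = (KeyWithoutEncoding) o;
            return offset == k.offset && hfileName.equals(k.hfileName);
        }

        @Override public int hashCode() {
            return Objects.hash(hfileName, offset);
        }
    }

    static final class Block {
        final BlockType type;
        final Encoding encoding;

        Block(BlockType type, Encoding encoding) {
            this.type = type;
            this.encoding = encoding;
        }
    }

    /** Caches an encoded block, then looks it up without knowing its encoding. */
    static boolean lookupHitsWithEncodingInKey() {
        Map<KeyWithEncoding, Block> cache = new HashMap<>();
        cache.put(new KeyWithEncoding("hfile-1", 0L, Encoding.PREFIX),
                  new Block(BlockType.ENCODED_DATA, Encoding.PREFIX));
        return cache.containsKey(new KeyWithEncoding("hfile-1", 0L, Encoding.NONE));
    }

    /**
     * Hypothetical read path: expectations are checked at read time.
     * A null expectation means the caller does not care.
     */
    static Block readBlock(Map<KeyWithoutEncoding, Block> cache, String hfileName,
                           long offset, BlockType expectedType, Encoding expectedEncoding) {
        Block block = cache.get(new KeyWithoutEncoding(hfileName, offset));
        if (block == null) {
            return null; // cache miss; a real reader would load the block from disk
        }
        if (expectedType != null && block.type != expectedType) {
            throw new IllegalStateException("expected " + expectedType + ", got " + block.type);
        }
        if (expectedEncoding != null && block.encoding != expectedEncoding) {
            throw new IllegalStateException(
                "expected " + expectedEncoding + ", got " + block.encoding);
        }
        return block;
    }

    public static void main(String[] args) {
        System.out.println("hit with encoding in key: " + lookupHitsWithEncodingInKey());
        Map<KeyWithoutEncoding, Block> cache = new HashMap<>();
        cache.put(new KeyWithoutEncoding("hfile-1", 0L),
                  new Block(BlockType.ENCODED_DATA, Encoding.PREFIX));
        // Preloading-style read with no expectations finds the cached block.
        System.out.println("hit without encoding in key: "
            + (readBlock(cache, "hfile-1", 0L, null, null) != null));
    }
}
```

With the encoding in the key, the second lookup misses even though the block for that file and offset is cached. With the encoding removed, a caller with no expectations gets the cached block, and a caller that needs a specific type or encoding gets an explicit error on a mismatch, which is the gotcha described above.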