[ https://issues.apache.org/jira/browse/HBASE-10752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936397#comment-13936397 ]
chunhui shen commented on HBASE-10752:
--------------------------------------
As I understand the current trunk, a data block is only ever cached in one
fixed format (encoded or not encoded).
In HColumnDescriptor:
{code}
public static final String ENCODE_ON_DISK = // To be removed, it is not used anymore
    "ENCODE_ON_DISK";
{code}
Also, HFileDataBlockEncoderImpl has only one variable, ‘encoding’, rather
than the two variables ‘onDisk’ and ‘inCache’.
So in trunk, I think we could remove 'DataBlockEncoding' from BlockCacheKey
directly. There is no need to check 'DataBlockEncoding' after reading a block
from the cache.
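
To illustrate the point, here is a minimal sketch that uses hypothetical stand-in classes, not the real HBase types. It assumes each block is written in exactly one encoding, so the key only needs the file name and offset, and the encoding travels with the cached block. SketchEncoding, SketchCacheKey, SketchBlock and SketchBlockCache are names made up for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for DataBlockEncoding; not the HBase enum.
enum SketchEncoding { NONE, PREFIX }

// Hypothetical cache key: file name and offset only, no encoding.
final class SketchCacheKey {
    private final String hfileName;
    private final long offset;

    SketchCacheKey(String hfileName, long offset) {
        this.hfileName = Objects.requireNonNull(hfileName);
        this.offset = offset;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SketchCacheKey)) return false;
        SketchCacheKey other = (SketchCacheKey) o;
        return offset == other.offset && hfileName.equals(other.hfileName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(hfileName, offset);
    }
}

// Hypothetical cached block that records the one encoding it was stored in.
final class SketchBlock {
    final SketchEncoding encoding;
    final byte[] data;

    SketchBlock(SketchEncoding encoding, byte[] data) {
        this.encoding = encoding;
        this.data = data;
    }
}

// Hypothetical cache: one entry per (file, offset), whatever the encoding.
final class SketchBlockCache {
    private final Map<SketchCacheKey, SketchBlock> blocks = new HashMap<>();

    void cacheBlock(SketchCacheKey key, SketchBlock block) {
        blocks.put(key, block);
    }

    // Returns the cached block, or null on a miss.
    SketchBlock getBlock(SketchCacheKey key) {
        return blocks.get(key);
    }

    int size() {
        return blocks.size();
    }
}
```

With a key like this, a lookup needs no knowledge of the encoding, and the caller can read it from the returned block if it matters.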
In addition, it would be better to have a test that demonstrates the case
mentioned in HBASE-10270: 'there is a potential for caching them twice or
missing the cache'.
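
As a rough idea of what such a test could check, the sketch below reproduces both symptoms with a plain HashMap and a hypothetical key that still contains the encoding. KeyWithEncoding and DoubleCachingDemo are invented for this example and are not HBase classes. A real test would have to go through the actual block cache and reader.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key that still includes the encoding, to reproduce the problem.
final class KeyWithEncoding {
    private final String hfileName;
    private final long offset;
    private final String encoding;

    KeyWithEncoding(String hfileName, long offset, String encoding) {
        this.hfileName = hfileName;
        this.offset = offset;
        this.encoding = encoding;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof KeyWithEncoding)) return false;
        KeyWithEncoding other = (KeyWithEncoding) o;
        return offset == other.offset
            && hfileName.equals(other.hfileName)
            && encoding.equals(other.encoding);
    }

    @Override
    public int hashCode() {
        return Objects.hash(hfileName, offset, encoding);
    }
}

final class DoubleCachingDemo {
    // Caches the same block under two encodings and returns the entry count.
    static int entriesAfterCachingTwice() {
        Map<KeyWithEncoding, byte[]> cache = new HashMap<>();
        byte[] block = {1, 2, 3};
        cache.put(new KeyWithEncoding("f1", 0L, "NONE"), block);
        cache.put(new KeyWithEncoding("f1", 0L, "PREFIX"), block);
        return cache.size();
    }

    // Caches a block as PREFIX, then looks it up with a guessed encoding.
    static boolean missesWhenEncodingGuessedWrong() {
        Map<KeyWithEncoding, byte[]> cache = new HashMap<>();
        cache.put(new KeyWithEncoding("f1", 0L, "PREFIX"), new byte[] {1, 2, 3});
        // A reader that does not know the encoding guesses NONE.
        return cache.get(new KeyWithEncoding("f1", 0L, "NONE")) == null;
    }
}
```

The first method ends up with two entries for one block. The second reports a miss even though the block is in the cache.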
> Port HBASE-10270 'Remove DataBlockEncoding from BlockCacheKey' to trunk
> -----------------------------------------------------------------------
>
> Key: HBASE-10752
> URL: https://issues.apache.org/jira/browse/HBASE-10752
> Project: HBase
> Issue Type: Improvement
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Minor
> Fix For: 0.99.0
>
> Attachments: 10752-v1.txt, 10752-v2.txt, 10752-v3.txt
>
>
> The JIRA removes the block encoding from the key and forces the caller of
> HFileReaderV2.readBlock() to specify the expected BlockType as well as the
> expected DataBlockEncoding when these matter. This allows for a decision on
> either of these at read time instead of cache time, puts responsibility where
> appropriate, fixes some cache misses when using the scan preloading (which
> does a read without knowing the type or encoding), allows for the
> BlockCacheKey to be re-used by the L2 BucketCache and sets us up for a future
> CompoundScannerV2 which can read both un-encoded and encoded data blocks.