Hi,
We have a use case where some of the data is read mostly at random, so it
pollutes the block cache and causes heavy GC. It would be better to turn
off the block cache for that data, so we are going to call
setCacheBlocks(false) on those Gets. We know the index blocks will still be
cached, based on the code path below, so we are safe there. But it is not
clear whether Bloom filter blocks fall under the lookupLevel <
searchTreeLevel case and therefore get cached as well.

    // Call HFile's caching block reader API. We always cache index
    // blocks, otherwise we might get terrible performance.
    boolean shouldCache = cacheBlocks || (lookupLevel < searchTreeLevel);
    BlockType expectedBlockType;
    if (lookupLevel < searchTreeLevel - 1) {
      expectedBlockType = BlockType.INTERMEDIATE_INDEX;
    } else if (lookupLevel == searchTreeLevel - 1) {
      expectedBlockType = BlockType.LEAF_INDEX;
    } else {
      // this also accounts for ENCODED_DATA
      expectedBlockType = BlockType.DATA;
    }

Or is it that, because the Bloom filter is part of the file's meta data, it
is always cached on read even when per-family/per-query cacheBlocks is
turned off? Am I right?
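
To make the question concrete, here is a minimal standalone sketch of how I
read the decision in the snippet above. It is not HBase code: the class,
enum, and method names are hypothetical, and it only models the index and
data levels. Bloom filter blocks do not appear in it at all, which is exactly
the part I am unsure about.

```java
// Standalone sketch, NOT HBase code. CacheDecisionSketch, BlockKind,
// shouldCache and expectedKind are hypothetical names for illustration.
final class CacheDecisionSketch {

    // Simplified stand-in for the three block types used in the snippet.
    enum BlockKind { INTERMEDIATE_INDEX, LEAF_INDEX, DATA }

    // Mirrors: cacheBlocks || (lookupLevel < searchTreeLevel).
    // Levels below searchTreeLevel are index levels and are always cached.
    static boolean shouldCache(boolean cacheBlocks, int lookupLevel,
                               int searchTreeLevel) {
        return cacheBlocks || lookupLevel < searchTreeLevel;
    }

    // Mirrors the if/else chain that picks the expected block type.
    static BlockKind expectedKind(int lookupLevel, int searchTreeLevel) {
        if (lookupLevel < searchTreeLevel - 1) {
            return BlockKind.INTERMEDIATE_INDEX;
        } else if (lookupLevel == searchTreeLevel - 1) {
            return BlockKind.LEAF_INDEX;
        } else {
            return BlockKind.DATA;
        }
    }

    public static void main(String[] args) {
        int searchTreeLevel = 3;      // assumed index depth for the example
        boolean cacheBlocks = false;  // what setCacheBlocks(false) would give
        for (int level = 0; level <= searchTreeLevel; level++) {
            System.out.println("level " + level + ": "
                + expectedKind(level, searchTreeLevel) + ", shouldCache="
                + shouldCache(cacheBlocks, level, searchTreeLevel));
        }
    }
}
```

With cacheBlocks set to false, every index level still comes out as cacheable
and only the data level does not, which matches my understanding of the
index case.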
Thanks
Tian-Ying