Tianying: Please take a look at CacheConfig#shouldCacheBlockOnRead() which is called by HFileReaderV2#readBlock()
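
For illustration only, the decision in the snippet you pasted (quoted below) can be modeled in isolation. This is a minimal sketch, not the HBase source: the class name CacheDecisionSketch, the local BlockType enum, and the searchTreeLevel value of 3 are all assumptions made up for the example. It only shows how the per-query cacheBlocks flag interacts with the index level.

```java
// Standalone model of the quoted snippet; NOT the HBase source itself.
// BlockType is a local stand-in for the HBase enum of the same name.
final class CacheDecisionSketch {

    enum BlockType { INTERMEDIATE_INDEX, LEAF_INDEX, DATA }

    // Mirrors: boolean shouldCache = cacheBlocks || (lookupLevel < searchTreeLevel);
    static boolean shouldCache(boolean cacheBlocks, int lookupLevel, int searchTreeLevel) {
        return cacheBlocks || (lookupLevel < searchTreeLevel);
    }

    // Mirrors the if/else chain that picks the expected block type per level.
    static BlockType expectedBlockType(int lookupLevel, int searchTreeLevel) {
        if (lookupLevel < searchTreeLevel - 1) {
            return BlockType.INTERMEDIATE_INDEX;
        } else if (lookupLevel == searchTreeLevel - 1) {
            return BlockType.LEAF_INDEX;
        } else {
            // in the original snippet this branch also accounts for ENCODED_DATA
            return BlockType.DATA;
        }
    }

    public static void main(String[] args) {
        int searchTreeLevel = 3;      // hypothetical index depth for the demo
        boolean cacheBlocks = false;  // models setCacheBlocks(false) on the query
        for (int level = 1; level <= searchTreeLevel; level++) {
            System.out.println("level " + level + ": "
                + expectedBlockType(level, searchTreeLevel)
                + ", shouldCache=" + shouldCache(cacheBlocks, level, searchTreeLevel));
        }
    }
}
```

With cacheBlocks set to false, the index levels still evaluate to shouldCache=true and only the DATA level follows the flag. Note that this expression only distinguishes index levels from the data level; it does not mention Bloom blocks at all, which is why the Bloom question has to be answered from the CacheConfig method above rather than from this code path.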

Cheers

On Wed, Apr 16, 2014 at 5:39 PM, Tianying Chang <[email protected]> wrote:

> Cool. Thanks!
>
> Just to dig deeper, is this because BloomFilter is part of Meta, and Meta
> block always cached no matter what?
>
> Or it is because the BloomFilter is in the upper level of the searchTree in
> the code path I pasted? I guess that code path is actually for data block,
> not meta block?
>
>       // Call HFile's caching block reader API. We always cache index
>       // blocks, otherwise we might get terrible performance.
>       boolean shouldCache = cacheBlocks || (lookupLevel < searchTreeLevel);
>       BlockType expectedBlockType;
>       if (lookupLevel < searchTreeLevel - 1) {
>         expectedBlockType = BlockType.INTERMEDIATE_INDEX;
>       } else if (lookupLevel == searchTreeLevel - 1) {
>         expectedBlockType = BlockType.LEAF_INDEX;
>       } else {
>         // this also accounts for ENCODED_DATA
>         expectedBlockType = BlockType.DATA;
>       }
>
>
> On Wed, Apr 16, 2014 at 4:59 PM, Ted Yu <[email protected]> wrote:
>
> > bq. it is always cached on read even when per-family/per-query cacheBlocks
> > is turned off.
> >
> > True.
> >
> >
> > On Wed, Apr 16, 2014 at 4:41 PM, Tianying Chang <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > We have a use case where some data are mostly random read, so it polluted
> > > cache and caused big GC. It is better to turn off the block cache for those
> > > data. So we are going to call setCacheBlocks(false) for those get(). We
> > > know that the index will be still cached based on below code path, so we
> > > are safe there. But it is not clear if BloomFilter belong to the
> > > level < searchTreeLevel, and also get cached also.
> > >
> > >       // Call HFile's caching block reader API. We always cache index
> > >       // blocks, otherwise we might get terrible performance.
> > >       boolean shouldCache = cacheBlocks || (lookupLevel < searchTreeLevel);
> > >       BlockType expectedBlockType;
> > >       if (lookupLevel < searchTreeLevel - 1) {
> > >         expectedBlockType = BlockType.INTERMEDIATE_INDEX;
> > >       } else if (lookupLevel == searchTreeLevel - 1) {
> > >         expectedBlockType = BlockType.LEAF_INDEX;
> > >       } else {
> > >         // this also accounts for ENCODED_DATA
> > >         expectedBlockType = BlockType.DATA;
> > >       }
> > >
> > > Or I think because BloomFilter is part of Meta data, so it is always cached
> > > on read even when per-family/per-query cacheBlocks is turned off. Am I
> > > right?
> > >
> > > Thanks
> > > Tian-Ying
> > >
> >
>
