[ https://issues.apache.org/jira/browse/HBASE-19511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290365#comment-16290365 ]
ramkrishna.s.vasudevan commented on HBASE-19511:
------------------------------------------------

Whenever testBlockRefCountAfterSplits() in TestBlockEvictionFromClient runs with this issue present, the test cases that run after it also fail because our ref counting is messed up.

> Splits causes blocks to be cached again and so such blocks cannot be evicted from bucket cache
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-19511
>                 URL: https://issues.apache.org/jira/browse/HBASE-19511
>             Project: HBase
>          Issue Type: Bug
>          Components: BucketCache
>    Affects Versions: 2.0.0-alpha-4
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 3.0.0, 2.0.0-beta-1
>
>
> This is because of a pattern similar to the one in
> https://issues.apache.org/jira/browse/HBASE-8547.
> It took a lot of time to debug this, and it is the reason
> TestBlockEvictionFromClient was flaky.
> When we create split files, the index of the firstKey that we create could
> possibly be the same key, as in the case of #testBlockRefCountAfterSplits().
> In such cases we were getting the same block cached again in the bucket
> cache. Such cases were being handled as part of HBASE-8547.
> {code}
> String msg = "Caching an already cached block: " + cacheKey;
> msg += ". This is harmless and can happen in rare cases (see HBASE-8547)";
> LOG.warn(msg);
> {code}
> But this is a tricky case: this log message appears only when the block
> with the same cache key was completely cached in the bucket cache. If the
> block with the same cache key has not yet been completely written to the
> bucket cache (by the cache writer threads), this log message won't appear,
> but the ramCache key will prevent the block from getting cached again.
> {code}
> if (ramCache.putIfAbsent(cacheKey, re) != null) {
>   return;
> }
> {code}
> So whenever the block was getting cached once again and was already in
> backingMap, we were doing a getBlock() to verify whether the block is the
> same block. This was internally adding to the ref count, so those blocks
> would never get removed from the bucket cache queue (there is no one to
> decrement the ref count in such cases).
> So I think for these rare cases it is better to make a copy of the block
> and then check whether it is the same as the existing one. This should be
> harmless and should help us do proper ref counting.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
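
The ref-count leak described in the issue can be illustrated with a toy model. This is only a minimal sketch, not HBase code: SketchCache, copyBlock(), release(), refCount() and the two sameAsCached* helpers are hypothetical names invented for illustration, and the real BucketCache ref counting is more involved. It assumes single-threaded use and shows only why a getBlock()-style lookup used for a duplicate check pins the block, while comparing against a copy leaves the ref count untouched.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only: every name here is hypothetical, not the BucketCache API.
class SketchCache {
  static final class Entry {
    final byte[] data;
    final AtomicInteger refCount = new AtomicInteger();
    Entry(byte[] data) { this.data = data; }
  }

  private final Map<String, Entry> backingMap = new ConcurrentHashMap<>();

  void put(String key, byte[] data) {
    backingMap.putIfAbsent(key, new Entry(data.clone()));
  }

  // Reader path: hands out the shared block and pins it until release().
  byte[] getBlock(String key) {
    Entry e = backingMap.get(key);
    if (e == null) return null;
    e.refCount.incrementAndGet();
    return e.data;
  }

  void release(String key) {
    Entry e = backingMap.get(key);
    if (e != null) e.refCount.decrementAndGet();
  }

  // Copy path: returns a private copy and never touches the ref count.
  byte[] copyBlock(String key) {
    Entry e = backingMap.get(key);
    return e == null ? null : e.data.clone();
  }

  // Eviction is refused while any reference is outstanding.
  boolean evict(String key) {
    Entry e = backingMap.get(key);
    if (e == null || e.refCount.get() > 0) return false;
    return backingMap.remove(key, e);
  }

  // Leaky duplicate check: pins the block and nobody unpins it.
  boolean sameAsCachedLeaky(String key, byte[] incoming) {
    return Arrays.equals(getBlock(key), incoming);
  }

  // Duplicate check on a copy: the ref count stays as it was.
  boolean sameAsCachedViaCopy(String key, byte[] incoming) {
    return Arrays.equals(copyBlock(key), incoming);
  }

  int refCount(String key) {
    Entry e = backingMap.get(key);
    return e == null ? 0 : e.refCount.get();
  }

  public static void main(String[] args) {
    SketchCache cache = new SketchCache();
    byte[] block = {1, 2, 3};
    cache.put("k", block);

    cache.sameAsCachedLeaky("k", block);   // duplicate check via getBlock()
    System.out.println(cache.evict("k"));  // false: block is still pinned

    cache.release("k");                    // undo the leaked reference
    cache.sameAsCachedViaCopy("k", block); // duplicate check via a copy
    System.out.println(cache.evict("k"));  // true: nothing holds the block
  }
}
```

In this model the leaky check leaves the ref count at 1 with no caller responsible for releasing it, which mirrors the "no one to decrement the ref count" situation in the description. The copy-based check costs an extra array copy, which seems acceptable if the duplicate-caching case is as rare as the issue suggests.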