[
https://issues.apache.org/jira/browse/HBASE-25698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17337423#comment-17337423
]
Viraj Jasani commented on HBASE-25698:
--------------------------------------
[~anoop.hbase] Thanks for this nice find; it sounds right to me. Unlike LruBlockCache, TinyLfu does not retain() already cached blocks. LruBlockCache does it this way:
{code:java}
LruCachedBlock cb = map.computeIfPresent(cacheKey, (key, val) -> {
  // It will be referenced by RPC path, so increase here. NOTICE: Must do the retain inside
  // this block. because if retain outside the map#computeIfPresent, the evictBlock may remove
  // the block and release, then we're retaining a block with refCnt=0 which is disallowed.
  // see HBASE-22422.
  val.getBuffer().retain();
  return val;
});
{code}
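The race that comment describes can be sketched with a toy ref-counted cache (illustrative only, not the HBase or netty classes): retaining inside computeIfPresent makes the lookup-and-retain atomic with respect to a concurrent evict, so the evict can never release the block between the lookup and the retain.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Toy ref-counted block, not HBase code. A real block starts with refCnt=1
// held by the cache; each RPC-path reader takes one more via retain().
final class ToyBlock {
  final AtomicInteger refCnt = new AtomicInteger(1);

  void retain() {
    if (refCnt.get() == 0) {
      // Mirrors netty's IllegalReferenceCountException: retaining a
      // released block is disallowed.
      throw new IllegalStateException("refCnt: 0, increment: 1");
    }
    refCnt.incrementAndGet();
  }

  void release() {
    refCnt.decrementAndGet();
  }
}

final class ToyCache {
  final ConcurrentHashMap<String, ToyBlock> map = new ConcurrentHashMap<>();

  // Safe: retain happens atomically with the lookup. A concurrent evictBlock
  // cannot interleave its remove() + release() between the two steps.
  ToyBlock getBlock(String key) {
    return map.computeIfPresent(key, (k, v) -> {
      v.retain();
      return v;
    });
  }

  void evictBlock(String key) {
    ToyBlock b = map.remove(key);
    if (b != null) {
      b.release(); // drop the cache's own reference
    }
  }
}
```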
Also, I cannot speak to this particular test scenario, but AFAIK [~apurtell] has used TinyLfu many times in his testing. He can confirm further anyway.
{quote}hbase.blockcache.use.external is not set true right. Then only we create
CombinedBC with L2 as VictimCache for L1
{quote}
I was also not aware of this; I just looked at the relevant code.
{code:java}
public InclusiveCombinedBlockCache(FirstLevelBlockCache l1, BlockCache l2) {
super(l1,l2);
l1.setVictimCache(l2);
}
{code}
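So with that wiring, an L1 miss falls through to L2, and on an L2 hit the block gets promoted back into L1. A hypothetical minimal sketch of that fall-through (class and field names are illustrative, not the HBase API):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the victim-cache fall-through, not HBase code:
// an L1 miss consults L2 (the victim cache), and on a hit the value is
// promoted back into L1, which is what TinyLfuBlockCache#getBlock does
// via cacheBlock() on a victim-cache hit.
final class VictimChain {
  final Map<String, String> l1 = new HashMap<>();
  final Map<String, String> l2 = new HashMap<>();

  String getBlock(String key) {
    String v = l1.get(key);
    if (v != null) {
      return v; // L1 hit
    }
    v = l2.get(key); // victim cache lookup
    if (v != null) {
      l1.put(key, v); // promote on hit
    }
    return v;
  }
}
```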
I hope the fix you are suggesting for TinyLfu is somewhat similar to this:
{code:java}
diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java
index a0dc30c524..1cb53dc6b6 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java
@@ -170,6 +170,7 @@ public final class TinyLfuBlockCache implements FirstLevelBlockCache {
       value = victimCache.getBlock(cacheKey, caching, repeat, updateCacheMetrics);
       if ((value != null) && caching) {
         if ((value instanceof HFileBlock) && ((HFileBlock) value).isSharedMem()) {
+          value.retain();
           value = HFileBlock.deepCloneOnHeap((HFileBlock) value);
         }
         cacheBlock(cacheKey, value);
@@ -203,6 +204,9 @@ public final class TinyLfuBlockCache implements FirstLevelBlockCache {
   @Override
   public boolean evictBlock(BlockCacheKey cacheKey) {
     Cacheable value = cache.asMap().remove(cacheKey);
+    if (value != null) {
+      value.release();
+    }
     return (value != null);
   }
{code}
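Without the release in evictBlock (and the matching retain on the caching path), the reference counts go unbalanced, and some later release drops an already-released block, which is exactly the failure mode in the stack trace below. A toy counter (illustrative only, not the netty/HBase implementation) reproduces that end state:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy reference counter, not the netty implementation. A block is created
// with refCnt=1; release() below 0 fails the same way netty's
// IllegalReferenceCountException ("refCnt: 0, decrement: 1") does.
final class ToyRefCnt {
  private final AtomicInteger refCnt = new AtomicInteger(1);

  int refCnt() {
    return refCnt.get();
  }

  void retain() {
    refCnt.incrementAndGet();
  }

  void release() {
    if (refCnt.decrementAndGet() < 0) {
      refCnt.incrementAndGet(); // undo the bad decrement, then fail
      throw new IllegalStateException("refCnt: 0, decrement: 1");
    }
  }
}
```

Every retain() must be paired with exactly one release(); an extra release anywhere on the read path surfaces later as the exception above, far from the code that actually unbalanced the count.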
> Persistent IllegalReferenceCountException at scanner open
> ---------------------------------------------------------
>
> Key: HBASE-25698
> URL: https://issues.apache.org/jira/browse/HBASE-25698
> Project: HBase
> Issue Type: Bug
> Components: HFile, Scanners
> Affects Versions: 2.4.2
> Reporter: Andrew Kyle Purtell
> Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.4.3
>
>
> Persistent scanner open failure with offheap read path enabled.
> Not sure how it happened. Test scenario was HBase 1 cluster replicating to
> HBase 2 cluster. ITBLL as data generator at source, calm policy only. Scanner
> open errors on sink HBase 2 cluster later during ITBLL verify phase. Sink
> schema settings bloom=ROW encoding=FAST_DIFF compression=NONE.
> {noformat}
> Caused by: org.apache.hbase.thirdparty.io.netty.util.IllegalReferenceCountException: refCnt: 0, decrement: 1
> at org.apache.hbase.thirdparty.io.netty.util.internal.ReferenceCountUpdater.toLiveRealRefCnt(ReferenceCountUpdater.java:74)
> at org.apache.hbase.thirdparty.io.netty.util.internal.ReferenceCountUpdater.release(ReferenceCountUpdater.java:138)
> at org.apache.hbase.thirdparty.io.netty.util.AbstractReferenceCounted.release(AbstractReferenceCounted.java:76)
> at org.apache.hadoop.hbase.nio.ByteBuff.release(ByteBuff.java:79)
> at org.apache.hadoop.hbase.io.hfile.HFileBlock.release(HFileBlock.java:429)
> at org.apache.hadoop.hbase.io.hfile.CompoundBloomFilter.contains(CompoundBloomFilter.java:109)
> at org.apache.hadoop.hbase.regionserver.StoreFileReader.checkGeneralBloomFilter(StoreFileReader.java:433)
> at org.apache.hadoop.hbase.regionserver.StoreFileReader.passesGeneralRowBloomFilter(StoreFileReader.java:322)
> at org.apache.hadoop.hbase.regionserver.StoreFileReader.passesBloomFilter(StoreFileReader.java:251)
> at org.apache.hadoop.hbase.regionserver.StoreFileScanner.shouldUseScanner(StoreFileScanner.java:491)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.selectScannersFrom(StoreScanner.java:471)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:249)
> at org.apache.hadoop.hbase.regionserver.HStore.createScanner(HStore.java:2177)
> at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2168)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:7172)
> {noformat}
> Bloom filter type on all files here is ROW, block encoding is FAST_DIFF:
> {noformat}
> hbase:017:0> describe "IntegrationTestBigLinkedList"
> Table IntegrationTestBigLinkedList is ENABLED
> IntegrationTestBigLinkedList
> COLUMN FAMILIES DESCRIPTION
> {NAME => 'big', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '1'}
> {NAME => 'meta', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '1'}
> {NAME => 'tiny', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '1'}
> {noformat}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)