[
https://issues.apache.org/jira/browse/HBASE-25698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17337423#comment-17337423
]
Viraj Jasani commented on HBASE-25698:
--------------------------------------
[~anoop.hbase] Thanks for this nice find; it sounds right to me. Unlike LruBlockCache, TinyLfu does not retain() already cached blocks. LruBlockCache does it this way:
{code:java}
LruCachedBlock cb = map.computeIfPresent(cacheKey, (key, val) -> {
  // It will be referenced by RPC path, so increase here. NOTICE: Must do the retain inside
  // this block. because if retain outside the map#computeIfPresent, the evictBlock may remove
  // the block and release, then we're retaining a block with refCnt=0 which is disallowed.
  // see HBASE-22422.
  val.getBuffer().retain();
  return val;
});
{code}
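The race that comment describes can be sketched with a toy ref-counted cache (illustrative only, not the HBase or netty classes): retaining inside computeIfPresent makes the lookup-and-retain atomic with respect to a concurrent evict, so the evict can never release the block between the lookup and the retain.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Toy ref-counted block, not HBase code. A real block starts with refCnt=1
// held by the cache; each RPC-path reader takes one more via retain().
final class ToyBlock {
  final AtomicInteger refCnt = new AtomicInteger(1);

  void retain() {
    if (refCnt.get() == 0) {
      // Mirrors netty's IllegalReferenceCountException: retaining a
      // released block is disallowed.
      throw new IllegalStateException("refCnt: 0, increment: 1");
    }
    refCnt.incrementAndGet();
  }

  void release() {
    refCnt.decrementAndGet();
  }
}

final class ToyCache {
  final ConcurrentHashMap<String, ToyBlock> map = new ConcurrentHashMap<>();

  // Safe: retain happens atomically with the lookup. A concurrent evictBlock
  // cannot interleave its remove() + release() between the two steps.
  ToyBlock getBlock(String key) {
    return map.computeIfPresent(key, (k, v) -> {
      v.retain();
      return v;
    });
  }

  void evictBlock(String key) {
    ToyBlock b = map.remove(key);
    if (b != null) {
      b.release(); // drop the cache's own reference
    }
  }
}
```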
Also, I cannot speak to this particular test scenario, but AFAIK [~apurtell] has used TinyLfu many times in his testing. He can confirm further anyway.
{quote}hbase.blockcache.use.external is not set true right. Then only we create
CombinedBC with L2 as VictimCache for L1
{quote}
I was also not aware of this; I just looked at the relevant code.
{code:java}
public InclusiveCombinedBlockCache(FirstLevelBlockCache l1, BlockCache l2) {
super(l1,l2);
l1.setVictimCache(l2);
}
{code}
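So with that wiring, an L1 miss falls through to L2, and on an L2 hit the block gets promoted back into L1. A hypothetical minimal sketch of that fall-through (class and field names are illustrative, not the HBase API):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the victim-cache fall-through, not HBase code:
// an L1 miss consults L2 (the victim cache), and on a hit the value is
// promoted back into L1, which is what TinyLfuBlockCache#getBlock does
// via cacheBlock() on a victim-cache hit.
final class VictimChain {
  final Map<String, String> l1 = new HashMap<>();
  final Map<String, String> l2 = new HashMap<>();

  String getBlock(String key) {
    String v = l1.get(key);
    if (v != null) {
      return v; // L1 hit
    }
    v = l2.get(key); // victim cache lookup
    if (v != null) {
      l1.put(key, v); // promote on hit
    }
    return v;
  }
}
```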
I hope the fix you are suggesting for TinyLfu is somewhat similar to this:
{code:java}
diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java
index a0dc30c524..1cb53dc6b6 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/TinyLfuBlockCache.java
@@ -170,6 +170,7 @@ public final class TinyLfuBlockCache implements FirstLevelBlockCache {
       value = victimCache.getBlock(cacheKey, caching, repeat, updateCacheMetrics);
       if ((value != null) && caching) {
         if ((value instanceof HFileBlock) && ((HFileBlock) value).isSharedMem()) {
+          value.retain();
           value = HFileBlock.deepCloneOnHeap((HFileBlock) value);
         }
         cacheBlock(cacheKey, value);
@@ -203,6 +204,9 @@ public final class TinyLfuBlockCache implements FirstLevelBlockCache {
   @Override
   public boolean evictBlock(BlockCacheKey cacheKey) {
     Cacheable value = cache.asMap().remove(cacheKey);
+    if (value != null) {
+      value.release();
+    }
     return (value != null);
   }
{code}
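Without the release in evictBlock (and the matching retain on the caching path), the reference counts go unbalanced, and some later release drops an already-released block, which is exactly the failure mode in the stack trace below. A toy counter (illustrative only, not the netty/HBase implementation) reproduces that end state:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy reference counter, not the netty implementation. A block is created
// with refCnt=1; release() below 0 fails the same way netty's
// IllegalReferenceCountException ("refCnt: 0, decrement: 1") does.
final class ToyRefCnt {
  private final AtomicInteger refCnt = new AtomicInteger(1);

  int refCnt() {
    return refCnt.get();
  }

  void retain() {
    refCnt.incrementAndGet();
  }

  void release() {
    if (refCnt.decrementAndGet() < 0) {
      refCnt.incrementAndGet(); // undo the bad decrement, then fail
      throw new IllegalStateException("refCnt: 0, decrement: 1");
    }
  }
}
```

Every retain() must be paired with exactly one release(); an extra release anywhere on the read path surfaces later as the exception above, far from the code that actually unbalanced the count.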
> Persistent IllegalReferenceCountException at scanner open
> ---------------------------------------------------------
>
> Key: HBASE-25698
> URL: https://issues.apache.org/jira/browse/HBASE-25698
> Project: HBase
> Issue Type: Bug
> Components: HFile, Scanners
> Affects Versions: 2.4.2
> Reporter: Andrew Kyle Purtell
> Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.4.3
>
>
> Persistent scanner open failure with offheap read path enabled.
> Not sure how it happened. Test scenario was HBase 1 cluster replicating to
> HBase 2 cluster. ITBLL as data generator at source, calm policy only. Scanner
> open errors on sink HBase 2 cluster later during ITBLL verify phase. Sink
> schema settings bloom=ROW encoding=FAST_DIFF compression=NONE.
> {noformat}
> Caused by: org.apache.hbase.thirdparty.io.netty.util.IllegalReferenceCountException: refCnt: 0, decrement: 1
> at org.apache.hbase.thirdparty.io.netty.util.internal.ReferenceCountUpdater.toLiveRealRefCnt(ReferenceCountUpdater.java:74)
> at org.apache.hbase.thirdparty.io.netty.util.internal.ReferenceCountUpdater.release(ReferenceCountUpdater.java:138)
> at org.apache.hbase.thirdparty.io.netty.util.AbstractReferenceCounted.release(AbstractReferenceCounted.java:76)
> at org.apache.hadoop.hbase.nio.ByteBuff.release(ByteBuff.java:79)
> at org.apache.hadoop.hbase.io.hfile.HFileBlock.release(HFileBlock.java:429)
> at org.apache.hadoop.hbase.io.hfile.CompoundBloomFilter.contains(CompoundBloomFilter.java:109)
> at org.apache.hadoop.hbase.regionserver.StoreFileReader.checkGeneralBloomFilter(StoreFileReader.java:433)
> at org.apache.hadoop.hbase.regionserver.StoreFileReader.passesGeneralRowBloomFilter(StoreFileReader.java:322)
> at org.apache.hadoop.hbase.regionserver.StoreFileReader.passesBloomFilter(StoreFileReader.java:251)
> at org.apache.hadoop.hbase.regionserver.StoreFileScanner.shouldUseScanner(StoreFileScanner.java:491)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.selectScannersFrom(StoreScanner.java:471)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:249)
> at org.apache.hadoop.hbase.regionserver.HStore.createScanner(HStore.java:2177)
> at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2168)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:7172)
> {noformat}
> Bloom filter type on all files here is ROW, block encoding is FAST_DIFF:
> {noformat}
> hbase:017:0> describe "IntegrationTestBigLinkedList"
> Table IntegrationTestBigLinkedList is ENABLED
> IntegrationTestBigLinkedList
> COLUMN FAMILIES DESCRIPTION
> {NAME => 'big', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '1'}
> {NAME => 'meta', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '1'}
> {NAME => 'tiny', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '1'}
> {noformat}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)