[ 
https://issues.apache.org/jira/browse/HBASE-25698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17309737#comment-17309737
 ] 

Andrew Kyle Purtell edited comment on HBASE-25698 at 3/26/21, 9:48 PM:
-----------------------------------------------------------------------

bq. Any other exception before this stack trace?

I did not capture full logs, and I can no longer reproduce the problem because 
the testbed has been torn down.
 
My hope in filing this issue is that code examination will be sufficient to 
find the reference counting problem. Thankfully, the entire code path of 
interest is captured in the stack trace:

HStore.getScanner(HStore.java:2168) -> HStore.createScanner(HStore.java:2177) 
-> StoreScanner.<init>(StoreScanner.java:249) -> 
StoreScanner.selectScannersFrom(StoreScanner.java:471) -> 
StoreFileScanner.shouldUseScanner(StoreFileScanner.java:491) -> 
StoreFileReader.passesBloomFilter(StoreFileReader.java:251) -> 
StoreFileReader.passesGeneralRowBloomFilter(StoreFileReader.java:322) -> 
StoreFileReader.checkGeneralBloomFilter(StoreFileReader.java:433) -> 
*CompoundBloomFilter.contains(CompoundBloomFilter.java:109) -> 
HFileBlock.release(HFileBlock.java:429)*
-> ByteBuff.release(ByteBuff.java:79) -> 
AbstractReferenceCounted.release(AbstractReferenceCounted.java:76) -> 
ReferenceCountUpdater.release(ReferenceCountUpdater.java:138) -> 
ReferenceCountUpdater.toLiveRealRefCnt(ReferenceCountUpdater.java:74) -> 

org.apache.hbase.thirdparty.io.netty.util.IllegalReferenceCountException: 
refCnt: 0, decrement: 1

It looks like we have a release without first taking a reference along this 
code path. I have not looked at the code yet.
Edit: Or, as [~vjasani] suggested, there could be a race condition caused by 
missing synchronization or non-atomic updates to the reference count.
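
To illustrate the first hypothesis, here is a minimal, self-contained sketch of 
how an unbalanced release produces this kind of failure. It is not HBase or 
Netty code: RefCountSketch, Counted, balancedRead and unbalancedRead are 
hypothetical names, and a plain IllegalStateException stands in for the shaded 
IllegalReferenceCountException. The assumption is only that the counter starts 
at 1 for the owner and that releasing at 0 is an error.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class RefCountSketch {
    /** Minimal reference-counted resource; a stand-in, not HBase's or Netty's class. */
    static final class Counted {
        // The creator (owner) implicitly holds the first reference.
        private final AtomicInteger refCnt = new AtomicInteger(1);

        int refCnt() {
            return refCnt.get();
        }

        Counted retain() {
            // CAS loop: never resurrect an object whose count already reached 0.
            for (;;) {
                int cur = refCnt.get();
                if (cur == 0) {
                    throw new IllegalStateException("refCnt: 0, increment: 1");
                }
                if (refCnt.compareAndSet(cur, cur + 1)) {
                    return this;
                }
            }
        }

        /** Returns true if this call dropped the last reference. */
        boolean release() {
            for (;;) {
                int cur = refCnt.get();
                if (cur == 0) {
                    throw new IllegalStateException("refCnt: 0, decrement: 1");
                }
                if (refCnt.compareAndSet(cur, cur - 1)) {
                    return cur == 1;
                }
            }
        }
    }

    /** Balanced use: retain before reading, release afterwards. */
    static int balancedRead(Counted block) {
        block.retain();
        try {
            return block.refCnt(); // count observed while the reference is held
        } finally {
            block.release();
        }
    }

    /** Unbalanced use: releases a reference this caller never took. */
    static void unbalancedRead(Counted block) {
        block.release();
    }

    public static void main(String[] args) {
        Counted block = new Counted();
        balancedRead(block);   // count returns to 1
        unbalancedRead(block); // count drops to 0 behind the owner's back
        try {
            block.release();   // the owner's legitimate release now fails
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints "refCnt: 0, decrement: 1"
        }
    }
}
```

Note that in this sketch the exception is thrown at the later, legitimate 
release rather than at the unbalanced one, so the frame that reports the error 
is not necessarily the frame that contains the bug. On the second hypothesis: 
the compare-and-set loop is what keeps the sketch's counter consistent under 
concurrent callers. A plain check-then-decrement on a non-atomic field could 
let two threads both pass the check and both decrement.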


> Persistent IllegalReferenceCountException at scanner open
> ---------------------------------------------------------
>
>                 Key: HBASE-25698
>                 URL: https://issues.apache.org/jira/browse/HBASE-25698
>             Project: HBase
>          Issue Type: Bug
>          Components: HFile, Scanners
>    Affects Versions: 2.4.2
>            Reporter: Andrew Kyle Purtell
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.5.0, 2.4.3
>
>
> Persistent scanner open failure with offheap read path enabled.
> Not sure how it happened. Test scenario was HBase 1 cluster replicating to 
> HBase 2 cluster. ITBLL as data generator at source, calm policy only. Scanner 
> open errors on sink HBase 2 cluster later during ITBLL verify phase. Sink 
> schema settings bloom=ROW encoding=FAST_DIFF compression=NONE.
> {noformat}
> Caused by: org.apache.hbase.thirdparty.io.netty.util.IllegalReferenceCountException: refCnt: 0, decrement: 1
>         at org.apache.hbase.thirdparty.io.netty.util.internal.ReferenceCountUpdater.toLiveRealRefCnt(ReferenceCountUpdater.java:74)
>         at org.apache.hbase.thirdparty.io.netty.util.internal.ReferenceCountUpdater.release(ReferenceCountUpdater.java:138)
>         at org.apache.hbase.thirdparty.io.netty.util.AbstractReferenceCounted.release(AbstractReferenceCounted.java:76)
>         at org.apache.hadoop.hbase.nio.ByteBuff.release(ByteBuff.java:79)
>         at org.apache.hadoop.hbase.io.hfile.HFileBlock.release(HFileBlock.java:429)
>         at org.apache.hadoop.hbase.io.hfile.CompoundBloomFilter.contains(CompoundBloomFilter.java:109)
>         at org.apache.hadoop.hbase.regionserver.StoreFileReader.checkGeneralBloomFilter(StoreFileReader.java:433)
>         at org.apache.hadoop.hbase.regionserver.StoreFileReader.passesGeneralRowBloomFilter(StoreFileReader.java:322)
>         at org.apache.hadoop.hbase.regionserver.StoreFileReader.passesBloomFilter(StoreFileReader.java:251)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.shouldUseScanner(StoreFileScanner.java:491)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.selectScannersFrom(StoreScanner.java:471)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:249)
>         at org.apache.hadoop.hbase.regionserver.HStore.createScanner(HStore.java:2177)
>         at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2168)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:7172)
> {noformat}
> Bloom filter type on all files here is ROW, block encoding is FAST_DIFF:
> {noformat}
> hbase:017:0> describe "IntegrationTestBigLinkedList"
> Table IntegrationTestBigLinkedList is ENABLED
> IntegrationTestBigLinkedList
> COLUMN FAMILIES DESCRIPTION
> {NAME => 'big', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '1'}
> {NAME => 'meta', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '1'}
> {NAME => 'tiny', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '1'}
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
