[ 
https://issues.apache.org/jira/browse/HBASE-22422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16847172#comment-16847172
 ] 

Zheng Hu commented on HBASE-22422:
----------------------------------

After running for some hours, the bug reproduced in my stress-test cluster, with the 
following log: 
{code}
2019-05-24,03:43:10,796 INFO org.apache.hadoop.hbase.nio.RefCnt: ===> Start to 
dump callerSet for #641783987
2019-05-24,03:43:10,796 INFO org.apache.hadoop.hbase.nio.RefCnt:   --> 
#641783987 -> caller: HFileScannerImpl#returnBlocks: return curBlock, refCnt 
before release is: 2
2019-05-24,03:43:10,796 INFO org.apache.hadoop.hbase.nio.RefCnt:   --> 
#641783987 -> caller: RAMCache#remove, refCnt before release is: 1
2019-05-24,03:43:10,796 INFO org.apache.hadoop.hbase.nio.RefCnt: ===> End to 
dump callerSet #641783987
2019-05-24,03:43:10,801 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Encountered an unknown exception in RegionScannerImpl: 
org.apache.hbase.thirdparty.io.netty.util.IllegalReferenceCountException: 
refCnt: 0, increment: 1
        at 
org.apache.hbase.thirdparty.io.netty.util.AbstractReferenceCounted.retain0(AbstractReferenceCounted.java:87)
        at 
org.apache.hbase.thirdparty.io.netty.util.AbstractReferenceCounted.retain(AbstractReferenceCounted.java:74)
        at org.apache.hadoop.hbase.nio.RefCnt.retain(RefCnt.java:73)
        at 
org.apache.hadoop.hbase.nio.SingleByteBuff.retain(SingleByteBuff.java:398)
        at 
org.apache.hadoop.hbase.nio.SingleByteBuff.retain(SingleByteBuff.java:39)
        at 
org.apache.hadoop.hbase.io.hfile.HFileBlock.retain(HFileBlock.java:457)
        at 
org.apache.hadoop.hbase.io.hfile.HFileBlock.retain(HFileBlock.java:115)
        at 
org.apache.hadoop.hbase.io.hfile.bucket.BucketCache$RAMCache.get(BucketCache.java:1539)
        at 
org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getBlock(BucketCache.java:483)
        at 
org.apache.hadoop.hbase.io.hfile.CombinedBlockCache.getBlock(CombinedBlockCache.java:85)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.getCachedBlock(HFileReaderImpl.java:1306)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1472)
        at 
org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:339)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:843)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:794)
        at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:315)
        at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:216)
        at 
org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:394)
        at 
org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:249)
        at 
org.apache.hadoop.hbase.regionserver.HStore.createScanner(HStore.java:2063)
        at 
org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2054)
        at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:6493)
        at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:6473)
        at 
org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2999)
        at 
org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2979)
        at 
org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2961)
        at 
org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2955)
        at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2621)
        at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2548)
        at 
org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41998)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:374)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:132)
        at 
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
        at 
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
2019-05-24,03:43:10,813 INFO org.apache.hadoop.hbase.nio.RefCnt: ===> Start to 
dump callerSet for #312566113
2019-05-24,03:43:10,813 INFO org.apache.hadoop.hbase.nio.RefCnt:   --> 
#312566113 -> caller: CellBasedKeyBlockIndexReader#loadDataBlockWithScanInfo, 
refCnt before release is: 1
2019-05-24,03:43:10,813 INFO org.apache.hadoop.hbase.nio.RefCnt:   --> 
#312566113 -> caller: CellBasedKeyBlockIndexReader#loadDataBlockWithScanInfo, 
refCnt before release is: 2
2019-05-24,03:43:10,813 INFO org.apache.hadoop.hbase.nio.RefCnt:   --> 
#312566113 -> caller: CellBasedKeyBlockIndexReader#loadDataBlockWithScanInfo, 
refCnt before release is: 3
2019-05-24,03:43:10,813 INFO org.apache.hadoop.hbase.nio.RefCnt: ===> End to 
dump callerSet #312566113
2019-05-24,03:43:10,813 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Encountered an unknown exception in RegionScannerImpl: 
org.apache.hbase.thirdparty.io.netty.util.IllegalReferenceCountException: 
refCnt: 0, increment: 1
        at 
org.apache.hbase.thirdparty.io.netty.util.AbstractReferenceCounted.retain0(AbstractReferenceCounted.java:87)
        at 
org.apache.hbase.thirdparty.io.netty.util.AbstractReferenceCounted.retain(AbstractReferenceCounted.java:74)
        at org.apache.hadoop.hbase.nio.RefCnt.retain(RefCnt.java:73)
        at 
org.apache.hadoop.hbase.nio.SingleByteBuff.retain(SingleByteBuff.java:398)
        at 
org.apache.hadoop.hbase.nio.SingleByteBuff.retain(SingleByteBuff.java:39)
        at 
org.apache.hadoop.hbase.io.hfile.HFileBlock.retain(HFileBlock.java:457)
        at 
org.apache.hadoop.hbase.io.hfile.HFileBlock.retain(HFileBlock.java:115)
        at 
org.apache.hadoop.hbase.io.hfile.LruBlockCache.lambda$getBlock$0(LruBlockCache.java:512)
        at 
java.util.concurrent.ConcurrentHashMap.computeIfPresent(ConcurrentHashMap.java:1769)
        at 
org.apache.hadoop.hbase.io.hfile.LruBlockCache.getBlock(LruBlockCache.java:507)
        at 
org.apache.hadoop.hbase.io.hfile.CombinedBlockCache.getBlock(CombinedBlockCache.java:84)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.getCachedBlock(HFileReaderImpl.java:1306)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1472)
        at 
org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:339)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:843)
        at 
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:794)
        at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:315)
        at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:216)
        at 
org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:394)
        at 
org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:249)
        at 
org.apache.hadoop.hbase.regionserver.HStore.createScanner(HStore.java:2063)
        at 
org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2054)
        at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.initializeScanners(HRegion.java:6493)
        at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:6473)
        at 
org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:2999)
        at 
org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2979)
        at 
org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2961)
        at 
org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2955)
        at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2621)
        at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2548)
        at 
org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41998)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:374)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:132)
        at 
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
        at 
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
{code}
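
Reading the first dump (#641783987): two releases are recorded, by HFileScannerImpl#returnBlocks (refCnt 2 before release) and by RAMCache#remove (refCnt 1 before release), so the count reached 0 while RAMCache#get could still reach the block and call retain on it. The sketch below only replays that sequence on a toy counter. It is not HBase code: CountedBlock, replayFirstDump and the use of IllegalStateException are hypothetical stand-ins, and the starting count of 2 (one cache reference plus one reader reference) is an assumption taken from the "refCnt before release" values in the log.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefCntSketch {

  /** Hypothetical stand-in for a reference-counted block; NOT the HBase RefCnt class. */
  static final class CountedBlock {
    // A new block starts with one reference, assumed here to be held by the cache.
    private final AtomicInteger refCnt = new AtomicInteger(1);

    /** Adds one reference; fails if the block has already been fully released. */
    CountedBlock retain() {
      while (true) {
        int cur = refCnt.get();
        if (cur <= 0) {
          // Same message shape as the exception in the log above.
          throw new IllegalStateException("refCnt: " + cur + ", increment: 1");
        }
        if (refCnt.compareAndSet(cur, cur + 1)) {
          return this;
        }
      }
    }

    /** Drops one reference; returns true when this call brought the count to 0. */
    boolean release() {
      while (true) {
        int cur = refCnt.get();
        if (cur <= 0) {
          throw new IllegalStateException("refCnt: " + cur + ", decrement: 1");
        }
        if (refCnt.compareAndSet(cur, cur - 1)) {
          return cur == 1;
        }
      }
    }
  }

  /** Replays the release order from the first dump, then a late retain. */
  static String replayFirstDump() {
    CountedBlock block = new CountedBlock(); // assumed cache reference, refCnt = 1
    block.retain();                          // assumed reader reference, refCnt = 2
    block.release();                         // like returnBlocks: 2 -> 1
    block.release();                         // like RAMCache#remove: 1 -> 0
    try {
      block.retain();                        // like RAMCache#get on the stale entry
      return "retained";
    } catch (IllegalStateException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(replayFirstDump());
  }
}
```

Running main prints "refCnt: 0, increment: 1", matching the message in the log. The sketch is single-threaded, so it shows only the resulting state; in the cluster the retain and the final release presumably come from different threads.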

> Retain an ByteBuff with refCnt=0 when getBlock from LRUCache
> ------------------------------------------------------------
>
>                 Key: HBASE-22422
>                 URL: https://issues.apache.org/jira/browse/HBASE-22422
>             Project: HBase
>          Issue Type: Sub-task
>          Components: BlockCache
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>         Attachments: 0001-debug2.patch, 0001-debug2.patch, 0001-debug2.patch, 
> 0001-debug3.patch, 0001-debug4.patch, HBASE-22422.HBASE-21879.v01.patch, 
> LRUBlockCache-getBlock.png, debug.patch, 
> failed-to-check-positive-on-web-ui.png, image-2019-05-15-12-00-03-641.png
>
>
> After running the YCSB scan/get benchmark in our XiaoMi cluster, we found the get 
> QPS dropped from 25000/s to hundreds per second in a cluster with five 
> nodes.
> After enabling the debug log on the YCSB client side, I found the following 
> stack trace, see 
> https://issues.apache.org/jira/secure/attachment/12968745/image-2019-05-15-12-00-03-641.png.
>
> After looking into the stack trace, I can confirm that the zero-refCnt block is 
> an intermediate index block, see [2] http://hbase.apache.org/images/hfilev2.png
> A patch is needed to fix this. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
