[ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653391#comment-14653391 ]
Heng Chen commented on HBASE-14178:
-----------------------------------

{quote}
Ideally, when the BC is enabled and at the CF level there is no setting like NOT to cache data into BC, we should try to read it from the BC. Also, even if the CF level setting is there and we are not reading back Data blocks, then also we have to consult BC. Still, it will be much cleaner to do your suggestion of adding the new method to CacheConfig.
{quote}

I agree with both of you. I will add a method named shouldReadBlockFromCache to CacheConfig that checks all the situations in which we should read from the BC.

But there is one problem. We acquire the lock to ensure that the next request can read the block from the BC. If cacheDataOnRead is false but cacheDataOnWrite is true then, as we discussed, we still read from the BC and acquire the lock. But after reading the block from HDFS, we use another condition to decide whether to cache the block, and that condition does not cache the block when cacheDataOnRead is false and cacheDataOnWrite is true. In this situation the lock is useless.

So I think we should use a separate 'if' to check whether we need to acquire the lock. What do you think?

> regionserver blocks because of waiting for offsetLock
> -----------------------------------------------------
>
>                 Key: HBASE-14178
>                 URL: https://issues.apache.org/jira/browse/HBASE-14178
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.98.6
>            Reporter: Heng Chen
>            Priority: Critical
>             Fix For: 0.98.6
>
>         Attachments: HBASE-14178-0.98.patch, HBASE-14178.patch,
> HBASE-14178_v1.patch, HBASE-14178_v2.patch, HBASE-14178_v3.patch,
> HBASE-14178_v4.patch, jstack
>
>
> My regionserver blocks, and all client RPCs time out.
> I printed the regionserver's jstack; it seems a lot of threads were blocked
> waiting for offsetLock. Detailed information below:
> PS: my table's block cache is off
> {code}
> "B.DefaultRpcServer.handler=2,queue=2,port=60020" #82 daemon prio=5 os_prio=0 tid=0x0000000001827000 nid=0x2cdc in Object.wait() [0x00007f3831b72000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:502)
>         at org.apache.hadoop.hbase.util.IdLock.getLockEntry(IdLock.java:79)
>         - locked <0x0000000773af7c18> (a org.apache.hadoop.hbase.util.IdLock$Entry)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:352)
>         at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:253)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:524)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:572)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:257)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:173)
>         at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:55)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:313)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:269)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:695)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:683)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:533)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:140)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3889)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3969)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3847)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3820)
>         - locked <0x00000005e5c55ad0> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3807)
>         at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4779)
>         at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4753)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2916)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29583)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>         at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
>         - <0x00000005e5c55c08> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
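
For illustration, the two checks discussed in the comment (whether to consult the BC, and whether taking the offset lock is worthwhile) might be kept separate roughly as below. This is only a sketch under assumptions, not the actual HBase CacheConfig code or any attached patch: the class BlockReadPolicySketch and the method shouldLockOnCacheMiss are hypothetical names, only data blocks are considered, and the three flags are simplified stand-ins for the real configuration.

```java
// Hypothetical sketch; NOT the real org.apache.hadoop.hbase.io.hfile.CacheConfig.
// It only models the two decisions discussed in the comment, for data blocks.
final class BlockReadPolicySketch {
    private final boolean blockCacheEnabled; // assumption: one flag for "BC is on"
    private final boolean cacheDataOnRead;
    private final boolean cacheDataOnWrite;

    BlockReadPolicySketch(boolean blockCacheEnabled,
                          boolean cacheDataOnRead,
                          boolean cacheDataOnWrite) {
        this.blockCacheEnabled = blockCacheEnabled;
        this.cacheDataOnRead = cacheDataOnRead;
        this.cacheDataOnWrite = cacheDataOnWrite;
    }

    /** Consult the BC whenever a block could have been put there by reads or writes. */
    boolean shouldReadBlockFromCache() {
        return blockCacheEnabled && (cacheDataOnRead || cacheDataOnWrite);
    }

    /**
     * Take the offset lock only if the block read from HDFS will be cached
     * afterwards; otherwise waiting threads gain nothing from the lock.
     */
    boolean shouldLockOnCacheMiss() {
        return blockCacheEnabled && cacheDataOnRead;
    }

    public static void main(String[] args) {
        // The case from the comment: cacheDataOnRead=false, cacheDataOnWrite=true.
        BlockReadPolicySketch writeOnly = new BlockReadPolicySketch(true, false, true);
        System.out.println("read from BC: " + writeOnly.shouldReadBlockFromCache());
        System.out.println("take lock:    " + writeOnly.shouldLockOnCacheMiss());
    }
}
```

With the block cache off, as in the reported table, both methods of this sketch return false, so a reader would go straight to HDFS without waiting on the lock.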