[
https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659524#comment-14659524
]
Duo Zhang commented on HBASE-14178:
-----------------------------------
Yes, we do a second round of checking under the lock because another thread may
have already cached the block for us. What happens here is that BC is disabled
for the given family, so it is impossible for another thread to do the work for
us; we just read from HDFS and bypass the second round of checking BC.
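For illustration, here is a minimal sketch of the read path described above. It is not the HBase code: the class {{BlockReadSketch}}, its maps, and its counters are hypothetical stand-ins for the block cache, the per-offset lock, and the HDFS read.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a "check cache, lock, check cache again" read path.
// None of these names come from HBase; they only illustrate the idea.
public class BlockReadSketch {
    private final Map<Long, byte[]> blockCache = new ConcurrentHashMap<>();
    private final Map<Long, Object> offsetLocks = new ConcurrentHashMap<>();
    private final boolean cacheOnRead;

    // Counters so the behaviour can be observed from outside.
    public final AtomicInteger lockAcquisitions = new AtomicInteger();
    public final AtomicInteger diskReads = new AtomicInteger();

    public BlockReadSketch(boolean cacheOnRead) {
        this.cacheOnRead = cacheOnRead;
    }

    // Stand-in for the real read from HDFS.
    private byte[] readFromDisk(long offset) {
        diskReads.incrementAndGet();
        return new byte[] {(byte) offset};
    }

    public byte[] readBlock(long offset) {
        // First round: look in the cache without taking any lock.
        byte[] cached = blockCache.get(offset);
        if (cached != null) {
            return cached;
        }
        // If the block will never be cached, no other thread can have done
        // the work for us, so skip the lock and the second cache lookup.
        if (!cacheOnRead) {
            return readFromDisk(offset);
        }
        Object lock = offsetLocks.computeIfAbsent(offset, k -> new Object());
        synchronized (lock) {
            lockAcquisitions.incrementAndGet();
            // Second round: another thread may have cached the block
            // while we were waiting for the lock.
            cached = blockCache.get(offset);
            if (cached != null) {
                return cached;
            }
            byte[] block = readFromDisk(offset);
            blockCache.put(offset, block);
            return block;
        }
    }
}
```

With {{cacheOnRead}} set to false, every call goes straight to the disk read and never waits on the per-offset lock, which is the behaviour the patch is after for families with BC disabled.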
And as I mentioned above, there are lots of configurations for BC, and
{{family.isBlockCacheEnabled()}} is treated as {{cacheDataOnRead}} (you can see
the code pasted by [~chenheng]; maybe it is a mistake, but I think it is not
important for this issue, and we could open another issue for it). So the safe
way to determine whether we need the second 'read BC with lock' round is to
check whether we will put the block back into BC after we read it from HDFS.
This is why we introduce a {{shouldLockOnCacheMiss}} method here. Maybe we could
change the name to {{shouldReadAgainWithLockOnCacheMiss}}?
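As a rough sketch of that decision (not the patch itself): the class {{CacheConfigSketch}}, the {{BlockCategory}} enum, and the rule that non-data blocks are always cached on read are assumptions made for this example only.

```java
// Hypothetical sketch: decide whether a cache miss needs the locked re-check
// by asking whether the block would be cached after it is read.
public class CacheConfigSketch {
    // Illustrative categories; an assumption, not the HBase enum.
    public enum BlockCategory { DATA, INDEX, BLOOM, META }

    private final boolean cacheDataOnRead;

    public CacheConfigSketch(boolean cacheDataOnRead) {
        this.cacheDataOnRead = cacheDataOnRead;
    }

    // Assumption for this sketch: non-data blocks are always cached on read,
    // data blocks only when cacheDataOnRead is set.
    public boolean shouldCacheBlockOnRead(BlockCategory category) {
        return category != BlockCategory.DATA || cacheDataOnRead;
    }

    // The locked second lookup only makes sense if some thread could have
    // put the block into the cache, that is, if it is cached on read.
    public boolean shouldLockOnCacheMiss(BlockCategory category) {
        if (category == null) {
            // Unknown block type: stay conservative and keep the lock.
            return true;
        }
        return shouldCacheBlockOnRead(category);
    }
}
```

Tying the lock decision to the same predicate that governs caching keeps the two from drifting apart, whichever of the many BC settings ends up controlling {{cacheDataOnRead}}.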
Thanks.
> regionserver blocks because of waiting for offsetLock
> -----------------------------------------------------
>
> Key: HBASE-14178
> URL: https://issues.apache.org/jira/browse/HBASE-14178
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.98.6
> Reporter: Heng Chen
> Priority: Critical
> Fix For: 0.98.6
>
> Attachments: HBASE-14178-0.98.patch, HBASE-14178.patch,
> HBASE-14178_v1.patch, HBASE-14178_v2.patch, HBASE-14178_v3.patch,
> HBASE-14178_v4.patch, HBASE-14178_v5.patch, HBASE-14178_v6.patch, jstack
>
>
> My regionserver blocks, and all client RPCs time out.
> I printed the regionserver's jstack; it seems a lot of threads were blocked
> waiting for offsetLock. Detailed information is below.
> PS: my table's block cache is off
> {code}
> "B.DefaultRpcServer.handler=2,queue=2,port=60020" #82 daemon prio=5 os_prio=0 tid=0x0000000001827000 nid=0x2cdc in Object.wait() [0x00007f3831b72000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:502)
>         at org.apache.hadoop.hbase.util.IdLock.getLockEntry(IdLock.java:79)
>         - locked <0x0000000773af7c18> (a org.apache.hadoop.hbase.util.IdLock$Entry)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:352)
>         at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:253)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:524)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:572)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:257)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:173)
>         at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:55)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:313)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:269)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:695)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:683)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:533)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:140)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3889)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3969)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3847)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3820)
>         - locked <0x00000005e5c55ad0> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3807)
>         at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4779)
>         at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4753)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2916)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29583)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>         at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
>         - <0x00000005e5c55c08> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> {code}