[jira] [Commented] (HBASE-8547) Fix java.lang.RuntimeException: Cached an already cached block

Enis Soztutar (JIRA) Tue, 14 May 2013 19:35:18 -0700

    [ https://issues.apache.org/jira/browse/HBASE-8547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13657818#comment-13657818 ]


Enis Soztutar commented on HBASE-8547:
--------------------------------------

bq. Is it true to say that this can only happen when bottom half of a split is 
looking for block at offset 0 – first block – or top-half of a split is looking 
for last block in the file?
This is not limited to block offset 0. I think the trigger is a half file 
whose midkey does not fall on a block boundary: both the top and bottom 
regions will then try to read the whole block that contains the midkey. If 
this happens concurrently, we get the exception. HBASE-6479 reports a similar 
case, but there it was only encountered for meta blocks, which are read by 
both half store files. 
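
To make the race concrete, here is a minimal standalone sketch. It is not 
HBase code: StrictCache, HalfReader and the key string are hypothetical 
stand-ins, and the barrier only exists to force the unlucky interleaving 
deterministically. It assumes both halves compute the same cache key for the 
shared block, and that each reader guards its load with a private lock. 

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

class HalfReaderRaceSketch {
    /** Hypothetical stand-in for a block cache that rejects double inserts. */
    static final class StrictCache {
        private final ConcurrentHashMap<String, byte[]> map = new ConcurrentHashMap<>();

        byte[] get(String key) {
            return map.get(key);
        }

        void cacheBlock(String key, byte[] block) {
            if (map.putIfAbsent(key, block) != null) {
                throw new RuntimeException("Cached an already cached block");
            }
        }
    }

    /** Hypothetical half-file reader; note the lock is private to each reader. */
    static final class HalfReader {
        private final Object perReaderLock = new Object();
        private final StrictCache cache;
        private final CyclicBarrier afterMiss;

        HalfReader(StrictCache cache, CyclicBarrier afterMiss) {
            this.cache = cache;
            this.afterMiss = afterMiss;
        }

        void readBlock(String key) throws Exception {
            synchronized (perReaderLock) {      // does not exclude the other reader
                if (cache.get(key) == null) {   // both readers see a miss
                    afterMiss.await();          // force both past the check
                    cache.cacheBlock(key, new byte[0]);
                }
            }
        }
    }

    /** Runs a top and a bottom reader against one block; returns how many inserts failed. */
    static int countFailures() throws InterruptedException {
        StrictCache cache = new StrictCache();
        CyclicBarrier afterMiss = new CyclicBarrier(2);
        AtomicInteger failures = new AtomicInteger();
        String key = "somehfile_65536";         // hypothetical file name + offset key

        Runnable task = () -> {
            try {
                new HalfReader(cache, afterMiss).readBlock(key);
            } catch (RuntimeException e) {
                failures.incrementAndGet();     // the duplicate insert
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        };
        Thread top = new Thread(task);
        Thread bottom = new Thread(task);
        top.start();
        bottom.start();
        top.join();
        bottom.join();
        return failures.get();
    }
}
```

Because each reader synchronizes on its own object, both pass the cache-miss 
check before either inserts, and exactly one of the two inserts fails. 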

bq. Your unit test verifies half file has this issue?
Yes, I can reproduce this in the test environment, but the test does not 
mimic the requests coming from upper layers. 

After some offline discussion, I think it makes sense to split this issue into 
two, and make only the changes in the v2-reduced patch here. The reasoning is 
that the changes are proving to be quite large with the refactoring, and we 
need to spend some time thinking about how we are going to handle the 
singleton LruBlockCache, CacheConfig, etc. 


                
> Fix java.lang.RuntimeException: Cached an already cached block
> --------------------------------------------------------------
>
>                 Key: HBASE-8547
>                 URL: https://issues.apache.org/jira/browse/HBASE-8547
>             Project: HBase
>          Issue Type: Bug
>          Components: io, regionserver
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 0.98.0, 0.94.8, 0.95.1
>
>         Attachments: hbase-8547_v1-0.94.patch, hbase-8547_v1-0.94.patch, 
> hbase-8547_v1.patch, hbase-8547_v2-0.94-reduced.patch
>
>
> In one test, one of the region servers received the following exception on 
> 0.94. Note HalfStoreFileReader in the stack trace. I think the root cause is 
> that after the region is split, the mid point can fall in the middle of a 
> block (for store files that the mid point was not chosen from). Each half 
> store file tries to load that block and put it in the block cache. Since 
> IdLock is instantiated per store file reader, the two readers do not share 
> the same IdLock instance, and thus do not lock against each other effectively. 
> {code}
> 2013-05-12 01:30:37,733 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> java.lang.RuntimeException: Cached an already cached block
>   at org.apache.hadoop.hbase.io.hfile.LruBlockCache.cacheBlock(LruBlockCache.java:279)
>   at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:353)
>   at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:254)
>   at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:480)
>   at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:501)
>   at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekTo(HalfStoreFileReader.java:237)
>   at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:226)
>   at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:145)
>   at org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:351)
>   at org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:354)
>   at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:312)
>   at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:277)
>   at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:543)
>   at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:411)
>   at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:143)
>   at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3829)
>   at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3896)
>   at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3778)
>   at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3770)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2643)
>   at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Server.call(SecureRpcEngine.java:308)
>   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> {code}
> I can see two possible fixes: 
>  # Allow this kind of rare case in LruBlockCache by not throwing an 
> exception. 
>  # Move the lock instances to an upper layer (possibly in CacheConfig), and 
> let the half hfile readers share the same IdLock instance. 
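
The two fixes listed in the description can be sketched roughly as follows. 
This is a standalone illustration under assumptions, not the attached patch: 
TolerantCache, SharedLocks and HalfReader are hypothetical names, plain 
monitors stand in for IdLock, and the lock map is never cleaned up. 

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class DuplicateCacheFixSketch {
    /** Option 1: tolerate a duplicate insert instead of throwing. */
    static final class TolerantCache {
        private final ConcurrentHashMap<Long, byte[]> map = new ConcurrentHashMap<>();

        byte[] get(long offset) {
            return map.get(offset);
        }

        /** Returns true if this call inserted the block, false if one was already cached. */
        boolean cacheBlock(long offset, byte[] block) {
            return map.putIfAbsent(offset, block) == null;
        }
    }

    /** Option 2: one lock per block offset, shared by all readers of the same file. */
    static final class SharedLocks {
        private final ConcurrentHashMap<Long, Object> locks = new ConcurrentHashMap<>();

        Object lockFor(long offset) {
            // Entries are never removed here; a real implementation would release them.
            return locks.computeIfAbsent(offset, k -> new Object());
        }
    }

    /** Hypothetical half-file reader that takes the shared lock before checking the cache. */
    static final class HalfReader {
        private final SharedLocks locks;
        private final TolerantCache cache;
        private final AtomicInteger diskReads;

        HalfReader(SharedLocks locks, TolerantCache cache, AtomicInteger diskReads) {
            this.locks = locks;
            this.cache = cache;
            this.diskReads = diskReads;
        }

        byte[] readBlock(long offset) {
            synchronized (locks.lockFor(offset)) {
                byte[] cached = cache.get(offset);
                if (cached != null) {
                    return cached;              // another reader already loaded it
                }
                diskReads.incrementAndGet();    // stands in for the real block read
                byte[] block = new byte[] {1};
                cache.cacheBlock(offset, block);
                return block;
            }
        }
    }

    /** Starts the given number of readers on one block; returns how many simulated disk reads ran. */
    static int diskReadsWithSharedLock(int readers) throws InterruptedException {
        SharedLocks locks = new SharedLocks();
        TolerantCache cache = new TolerantCache();
        AtomicInteger diskReads = new AtomicInteger();
        Thread[] threads = new Thread[readers];
        for (int i = 0; i < readers; i++) {
            threads[i] = new Thread(() -> new HalfReader(locks, cache, diskReads).readBlock(0L));
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return diskReads.get();
    }
}
```

With the shared lock, only the first reader to arrive loads the block and the 
others find it in the cache. The tolerant cache alone also avoids the 
exception, at the cost of an occasional redundant read. 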

