[ 
https://issues.apache.org/jira/browse/HBASE-27053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566488#comment-17566488
 ] 

Bryan Beaudreault commented on HBASE-27053:
-------------------------------------------

My PR in [https://github.com/apache/hbase/pull/4610] is ready for review.

The solution ended up being a little more complicated than I indicated above:
 * It wasn't enough to simply modify getBufferWithoutHeader() with 
!isUnpacked(). isUnpacked() itself analyzes the buffer, so can't be used in a 
method called by unpack().
 * I tried abstracting things a bit so that an HFileBlock is stamped at 
construction time with whether it includes a checksum or not. That was 
complicated and broke various assumptions throughout the tests.
 * In the end, the simplest solution was to discard the checksum right after 
verifying it, in HFileBlock#readBlockDataInternal. This way we can be sure that 
all HFileBlocks (packed or unpacked) do not have a checksum. This just overall 
makes it easier to reason about, but required a bit more code to change and fix 
some tests.

Another simple solution would be to modify 
HFileBlockDefaultDecodingContext.prepareDecoding to copy over the checksum 
after decompressing. This would be a more targeted change, but in my opinion 
would be wasteful – we never use the checksum after reading from disk, so why 
do the work to copy it around everywhere?

Open to other opinions though, happy to go with what seems most reasonable to 
everyone.

> IOException during caching of uncompressed block to the block cache.
> --------------------------------------------------------------------
>
>                 Key: HBASE-27053
>                 URL: https://issues.apache.org/jira/browse/HBASE-27053
>             Project: HBase
>          Issue Type: Bug
>          Components: BlockCache
>    Affects Versions: 2.4.12
>            Reporter: Sergey Soldatov
>            Assignee: Bryan Beaudreault
>            Priority: Major
>
> When prefetch to block cache is enabled and blocks are compressed sometimes 
> caching fails with the exception:
> {noformat}
> 2022-05-18 21:37:29,597 ERROR [RS_OPEN_REGION-regionserver/x1:16020-2] 
> regionserver.HRegion: Could not initialize all stores for the 
> region=cluster_test,66666666,1652935047946.a57ca5f9e7bebb4855a44523063f79c7.
> 2022-05-18 21:37:29,598 WARN  [RS_OPEN_REGION-regionserver/x1:16020-2] 
> regionserver.HRegion: Failed initialize of region= 
> cluster_test,66666666,1652935047946.a57ca5f9e7bebb4855a44523063f79c7., 
> starting to roll back memstore
> java.io.IOException: java.io.IOException: java.lang.RuntimeException: Cached 
> block contents differ, which should not have 
> happened.cacheKey:19307adf1c2248ebb5675116ea640712.c3a21f2005abf308e4a8c9759d4e05fe_0
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1149)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1092)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:996)
>   at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:946)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7240)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7199)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7175)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7134)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7090)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.AssignRegionHandler.process(AssignRegionHandler.java:147)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:100)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: java.lang.RuntimeException: Cached block 
> contents differ, which should not have 
> happened.cacheKey:19307adf1c2248ebb5675116ea640712.c3a21f2005abf308e4a8c9759d4e05fe_0
>   at 
> org.apache.hadoop.hbase.regionserver.StoreEngine.openStoreFiles(StoreEngine.java:294)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreEngine.initialize(StoreEngine.java:344)
>   at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:294)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:6375)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1115)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1112)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   ... 3 more
> Caused by: java.lang.RuntimeException: Cached block contents differ, which 
> should not have 
> happened.cacheKey:19307adf1c2248ebb5675116ea640712.c3a21f2005abf308e4a8c9759d4e05fe_0
>   at 
> org.apache.hadoop.hbase.io.hfile.BlockCacheUtil.validateBlockAddition(BlockCacheUtil.java:199)
>   at 
> org.apache.hadoop.hbase.io.hfile.BlockCacheUtil.shouldReplaceExistingCacheBlock(BlockCacheUtil.java:231)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.shouldReplaceExistingCacheBlock(BucketCache.java:447)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.cacheBlockWithWait(BucketCache.java:432)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.cacheBlock(BucketCache.java:418)
>   at 
> org.apache.hadoop.hbase.io.hfile.CombinedBlockCache.cacheBlock(CombinedBlockCache.java:60)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.lambda$readBlock$2(HFileReaderImpl.java:1319)
>   at java.util.Optional.ifPresent(Optional.java:159)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1317)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.readAndUpdateNewBlock(HFileReaderImpl.java:942)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:931)
>   at 
> org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekTo(HalfStoreFileReader.java:171)
>   at 
> org.apache.hadoop.hbase.io.HalfStoreFileReader.getFirstKey(HalfStoreFileReader.java:321)
>   at org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:477)
>   at 
> org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:490)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreEngine.createStoreFileAndReader(StoreEngine.java:231)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreEngine.lambda$openStoreFiles$0(StoreEngine.java:272)
>   ... 6 more
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to