[jira] [Commented] (HBASE-27053) IOException during caching of uncompressed block to the block cache.

2022-07-17 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17567703#comment-17567703
 ] 

Hudson commented on HBASE-27053:


Results for branch master
[build #635 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/635/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/635/General_20Nightly_20Build_20Report/]






(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/635/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/635/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> IOException during caching of uncompressed block to the block cache.
> 
>
> Key: HBASE-27053
> URL: https://issues.apache.org/jira/browse/HBASE-27053
> Project: HBase
> Issue Type: Bug
> Components: BlockCache
> Affects Versions: 2.4.12
> Reporter: Sergey Soldatov
> Assignee: Bryan Beaudreault
> Priority: Major
> Fix For: 2.5.0, 3.0.0-alpha-4, 2.4.14
>
>
> When prefetch to the block cache is enabled and blocks are compressed,
> caching sometimes fails with the following exception:
> {noformat}
> 2022-05-18 21:37:29,597 ERROR [RS_OPEN_REGION-regionserver/x1:16020-2] 
> regionserver.HRegion: Could not initialize all stores for the 
> region=cluster_test,,1652935047946.a57ca5f9e7bebb4855a44523063f79c7.
> 2022-05-18 21:37:29,598 WARN  [RS_OPEN_REGION-regionserver/x1:16020-2] 
> regionserver.HRegion: Failed initialize of region= 
> cluster_test,,1652935047946.a57ca5f9e7bebb4855a44523063f79c7., 
> starting to roll back memstore
> java.io.IOException: java.io.IOException: java.lang.RuntimeException: Cached 
> block contents differ, which should not have 
> happened.cacheKey:19307adf1c2248ebb5675116ea640712.c3a21f2005abf308e4a8c9759d4e05fe_0
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1149)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1092)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:996)
>   at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:946)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7240)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7199)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7175)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7134)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7090)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.AssignRegionHandler.process(AssignRegionHandler.java:147)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:100)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: java.lang.RuntimeException: Cached block 
> contents differ, which should not have 
> happened.cacheKey:19307adf1c2248ebb5675116ea640712.c3a21f2005abf308e4a8c9759d4e05fe_0
>   at 
> org.apache.hadoop.hbase.regionserver.StoreEngine.openStoreFiles(StoreEngine.java:294)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreEngine.initialize(StoreEngine.java:344)
>   at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:294)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:6375)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1115)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1112)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   ... 3 more
> Caused by: java.lang.RuntimeException: Cached block contents differ, which 
> should not have 
> happened.cacheKey:19307adf1c2248ebb5675116ea640712.c3a21f2005abf308e4a8c9759d4e05fe_0
>   at 
> org.apache.hadoop.hbase.io.hfile.BlockCacheUtil.validateBlockAddition(BlockCacheUtil.java:199)
>   at 
> 

[jira] [Commented] (HBASE-27053) IOException during caching of uncompressed block to the block cache.

2022-07-17 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17567698#comment-17567698
 ] 

Hudson commented on HBASE-27053:


Results for branch branch-2
[build #594 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/594/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/594/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/594/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/594/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/594/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}



[jira] [Commented] (HBASE-27053) IOException during caching of uncompressed block to the block cache.

2022-07-17 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17567696#comment-17567696
 ] 

Hudson commented on HBASE-27053:


Results for branch branch-2.5
[build #166 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/166/]:
 (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/166/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/166/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/166/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/166/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}



[jira] [Commented] (HBASE-27053) IOException during caching of uncompressed block to the block cache.

2022-07-17 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17567682#comment-17567682
 ] 

Hudson commented on HBASE-27053:


Results for branch branch-2.4
[build #391 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/391/]:
 (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/391/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/391/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/391/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/391/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}



[jira] [Commented] (HBASE-27053) IOException during caching of uncompressed block to the block cache.

2022-07-17 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17567636#comment-17567636
 ] 

Duo Zhang commented on HBASE-27053:
---

Thanks [~bbeaudreault] for fixing this nasty issue!

> IOException during caching of uncompressed block to the block cache.
> 
>
> Key: HBASE-27053
> URL: https://issues.apache.org/jira/browse/HBASE-27053
> Project: HBase
> Issue Type: Bug
> Components: BlockCache
> Affects Versions: 2.4.12
> Reporter: Sergey Soldatov
> Assignee: Bryan Beaudreault
> Priority: Major
> Fix For: 2.5.0, 3.0.0-alpha-4, 2.4.14
>
>
> When prefetch to the block cache is enabled and blocks are compressed,
> caching sometimes fails with the following exception:
> {noformat}
> 2022-05-18 21:37:29,597 ERROR [RS_OPEN_REGION-regionserver/x1:16020-2] 
> regionserver.HRegion: Could not initialize all stores for the 
> region=cluster_test,,1652935047946.a57ca5f9e7bebb4855a44523063f79c7.
> 2022-05-18 21:37:29,598 WARN  [RS_OPEN_REGION-regionserver/x1:16020-2] 
> regionserver.HRegion: Failed initialize of region= 
> cluster_test,,1652935047946.a57ca5f9e7bebb4855a44523063f79c7., 
> starting to roll back memstore
> java.io.IOException: java.io.IOException: java.lang.RuntimeException: Cached 
> block contents differ, which should not have 
> happened.cacheKey:19307adf1c2248ebb5675116ea640712.c3a21f2005abf308e4a8c9759d4e05fe_0
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1149)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1092)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:996)
>   at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:946)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7240)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7199)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7175)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7134)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7090)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.AssignRegionHandler.process(AssignRegionHandler.java:147)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:100)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: java.lang.RuntimeException: Cached block 
> contents differ, which should not have 
> happened.cacheKey:19307adf1c2248ebb5675116ea640712.c3a21f2005abf308e4a8c9759d4e05fe_0
>   at 
> org.apache.hadoop.hbase.regionserver.StoreEngine.openStoreFiles(StoreEngine.java:294)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreEngine.initialize(StoreEngine.java:344)
>   at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:294)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:6375)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1115)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1112)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   ... 3 more
> Caused by: java.lang.RuntimeException: Cached block contents differ, which 
> should not have 
> happened.cacheKey:19307adf1c2248ebb5675116ea640712.c3a21f2005abf308e4a8c9759d4e05fe_0
>   at 
> org.apache.hadoop.hbase.io.hfile.BlockCacheUtil.validateBlockAddition(BlockCacheUtil.java:199)
>   at 
> org.apache.hadoop.hbase.io.hfile.BlockCacheUtil.shouldReplaceExistingCacheBlock(BlockCacheUtil.java:231)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.shouldReplaceExistingCacheBlock(BucketCache.java:447)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.cacheBlockWithWait(BucketCache.java:432)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.cacheBlock(BucketCache.java:418)
>   at 
> org.apache.hadoop.hbase.io.hfile.CombinedBlockCache.cacheBlock(CombinedBlockCache.java:60)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.lambda$readBlock$2(HFileReaderImpl.java:1319)
>   at java.util.Optional.ifPresent(Optional.java:159)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1317)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.readAndUpdateNewBlock(HFileReaderImpl.java:942)
>   at 
> 

[jira] [Commented] (HBASE-27053) IOException during caching of uncompressed block to the block cache.

2022-07-13 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566488#comment-17566488
 ] 

Bryan Beaudreault commented on HBASE-27053:
---

My PR in [https://github.com/apache/hbase/pull/4610] is ready for review.

The solution ended up being a little more complicated than I indicated above:
 * It wasn't enough to simply modify getBufferWithoutHeader() to check
!isUnpacked(). isUnpacked() itself analyzes the buffer, so it can't be used in
a method called by unpack().
 * I tried abstracting things a bit so that an HFileBlock is stamped at 
construction time with whether it includes a checksum or not. That was 
complicated and broke various assumptions throughout the tests.
 * In the end, the simplest solution was to discard the checksum right after
verifying it, in HFileBlock#readBlockDataInternal. This way we can be sure that
no HFileBlock (packed or unpacked) carries a checksum. This makes the code
easier to reason about overall, but it required changing a bit more code and
fixing some tests.
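
To illustrate the idea behind the last option, here is a minimal, self-contained sketch. It is not HBase code: the class ChecksumStripSketch, the methods withChecksum and readAndStrip, and the 4-byte CRC32 trailer are all hypothetical stand-ins, and the real HFileBlock layout and HFileBlock#readBlockDataInternal differ. It assumes the mismatch arises when one copy of a block still carries trailing checksum bytes and another copy does not, and shows how verifying and then dropping the trailer at read time leaves every later copy with the same bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.CRC32;

/** Hypothetical sketch only; names and layout are assumptions, not HBase's. */
final class ChecksumStripSketch {
  // Assumed trailer size for this sketch: one 4-byte CRC32 value.
  static final int CHECKSUM_BYTES = 4;

  /** Builds a pretend on-disk block: payload followed by a CRC32 trailer. */
  static byte[] withChecksum(byte[] payload) {
    CRC32 crc = new CRC32();
    crc.update(payload, 0, payload.length);
    return ByteBuffer.allocate(payload.length + CHECKSUM_BYTES)
        .put(payload)
        .putInt((int) crc.getValue())
        .array();
  }

  /** Verifies the trailer, then returns only the payload (checksum discarded). */
  static byte[] readAndStrip(byte[] onDisk) {
    int payloadLen = onDisk.length - CHECKSUM_BYTES;
    if (payloadLen < 0) {
      throw new IllegalArgumentException("block shorter than its checksum trailer");
    }
    CRC32 crc = new CRC32();
    crc.update(onDisk, 0, payloadLen);
    int stored = ByteBuffer.wrap(onDisk, payloadLen, CHECKSUM_BYTES).getInt();
    if ((int) crc.getValue() != stored) {
      throw new IllegalStateException("checksum mismatch");
    }
    // After verification the trailer is no longer needed, so drop it here.
    return Arrays.copyOf(onDisk, payloadLen);
  }

  public static void main(String[] args) {
    byte[] payload = "row1:value1".getBytes(StandardCharsets.UTF_8);
    byte[] onDisk = withChecksum(payload);
    // A copy that keeps the trailer does not match a copy without it.
    System.out.println(Arrays.equals(onDisk, payload)); // false
    // Stripping at read time makes every copy byte-identical.
    System.out.println(Arrays.equals(readAndStrip(onDisk), payload)); // true
  }
}
```

The point of stripping in one place, right after verification, is that no later code path needs to know whether a given buffer still has a trailer.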

Another simple solution would be to modify
HFileBlockDefaultDecodingContext.prepareDecoding to copy over the checksum
after decompressing. This would be a more targeted change, but in my opinion it
would be wasteful: we never use the checksum after reading from disk, so why
do the work to copy it around everywhere?

Open to other opinions though, happy to go with what seems most reasonable to 
everyone.

> IOException during caching of uncompressed block to the block cache.
> 
>
> Key: HBASE-27053
> URL: https://issues.apache.org/jira/browse/HBASE-27053
> Project: HBase
> Issue Type: Bug
> Components: BlockCache
> Affects Versions: 2.4.12
> Reporter: Sergey Soldatov
> Assignee: Bryan Beaudreault
> Priority: Major
>
> When prefetch to the block cache is enabled and blocks are compressed,
> caching sometimes fails with the following exception:
> {noformat}
> 2022-05-18 21:37:29,597 ERROR [RS_OPEN_REGION-regionserver/x1:16020-2] 
> regionserver.HRegion: Could not initialize all stores for the 
> region=cluster_test,,1652935047946.a57ca5f9e7bebb4855a44523063f79c7.
> 2022-05-18 21:37:29,598 WARN  [RS_OPEN_REGION-regionserver/x1:16020-2] 
> regionserver.HRegion: Failed initialize of region= 
> cluster_test,,1652935047946.a57ca5f9e7bebb4855a44523063f79c7., 
> starting to roll back memstore
> java.io.IOException: java.io.IOException: java.lang.RuntimeException: Cached 
> block contents differ, which should not have 
> happened.cacheKey:19307adf1c2248ebb5675116ea640712.c3a21f2005abf308e4a8c9759d4e05fe_0
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1149)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1092)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:996)
>   at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:946)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7240)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7199)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7175)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7134)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7090)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.AssignRegionHandler.process(AssignRegionHandler.java:147)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:100)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: java.lang.RuntimeException: Cached block 
> contents differ, which should not have 
> happened.cacheKey:19307adf1c2248ebb5675116ea640712.c3a21f2005abf308e4a8c9759d4e05fe_0
>   at 
> org.apache.hadoop.hbase.regionserver.StoreEngine.openStoreFiles(StoreEngine.java:294)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreEngine.initialize(StoreEngine.java:344)
>   at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:294)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:6375)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1115)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1112)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   ... 3 more
> Caused by: 

[jira] [Commented] (HBASE-27053) IOException during caching of uncompressed block to the block cache.

2022-07-12 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566090#comment-17566090
 ] 

Andrew Kyle Purtell commented on HBASE-27053:
-

[~bbeaudreault] This JIRA can be taken over.

> IOException during caching of uncompressed block to the block cache.
> 
>
> Key: HBASE-27053
> URL: https://issues.apache.org/jira/browse/HBASE-27053
> Project: HBase
> Issue Type: Bug
> Components: BlockCache
> Affects Versions: 2.4.12
> Reporter: Sergey Soldatov
> Assignee: Sergey Soldatov
> Priority: Major
>
> When prefetch to the block cache is enabled and blocks are compressed,
> caching sometimes fails with the following exception:
> {noformat}
> 2022-05-18 21:37:29,597 ERROR [RS_OPEN_REGION-regionserver/x1:16020-2] 
> regionserver.HRegion: Could not initialize all stores for the 
> region=cluster_test,,1652935047946.a57ca5f9e7bebb4855a44523063f79c7.
> 2022-05-18 21:37:29,598 WARN  [RS_OPEN_REGION-regionserver/x1:16020-2] 
> regionserver.HRegion: Failed initialize of region= 
> cluster_test,,1652935047946.a57ca5f9e7bebb4855a44523063f79c7., 
> starting to roll back memstore
> java.io.IOException: java.io.IOException: java.lang.RuntimeException: Cached 
> block contents differ, which should not have 
> happened.cacheKey:19307adf1c2248ebb5675116ea640712.c3a21f2005abf308e4a8c9759d4e05fe_0
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1149)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1092)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:996)
>   at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:946)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7240)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7199)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7175)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7134)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7090)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.AssignRegionHandler.process(AssignRegionHandler.java:147)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:100)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: java.lang.RuntimeException: Cached block 
> contents differ, which should not have 
> happened.cacheKey:19307adf1c2248ebb5675116ea640712.c3a21f2005abf308e4a8c9759d4e05fe_0
>   at 
> org.apache.hadoop.hbase.regionserver.StoreEngine.openStoreFiles(StoreEngine.java:294)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreEngine.initialize(StoreEngine.java:344)
>   at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:294)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:6375)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1115)
>   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1112)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   ... 3 more
> Caused by: java.lang.RuntimeException: Cached block contents differ, which 
> should not have 
> happened.cacheKey:19307adf1c2248ebb5675116ea640712.c3a21f2005abf308e4a8c9759d4e05fe_0
>   at 
> org.apache.hadoop.hbase.io.hfile.BlockCacheUtil.validateBlockAddition(BlockCacheUtil.java:199)
>   at 
> org.apache.hadoop.hbase.io.hfile.BlockCacheUtil.shouldReplaceExistingCacheBlock(BlockCacheUtil.java:231)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.shouldReplaceExistingCacheBlock(BucketCache.java:447)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.cacheBlockWithWait(BucketCache.java:432)
>   at 
> org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.cacheBlock(BucketCache.java:418)
>   at 
> org.apache.hadoop.hbase.io.hfile.CombinedBlockCache.cacheBlock(CombinedBlockCache.java:60)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.lambda$readBlock$2(HFileReaderImpl.java:1319)
>   at java.util.Optional.ifPresent(Optional.java:159)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1317)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.readAndUpdateNewBlock(HFileReaderImpl.java:942)
>   at 
> 

[jira] [Commented] (HBASE-27053) IOException during caching of uncompressed block to the block cache.

2022-07-07 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564027#comment-17564027
 ] 

Bryan Beaudreault commented on HBASE-27053:
---

[~sergey.soldatov] please let me know if you agree with the above and would be 
willing to give it a try. I'd also be happy to take over this Jira if you don't 
have time. Up to you. I'd just like to get it fixed soon.

> IOException during caching of uncompressed block to the block cache.
> 
>
> Key: HBASE-27053
> URL: https://issues.apache.org/jira/browse/HBASE-27053
> Project: HBase
>  Issue Type: Bug
>  Components: BlockCache
>Affects Versions: 2.4.12
>Reporter: Sergey Soldatov
>Assignee: Sergey Soldatov
>Priority: Major
>

[jira] [Commented] (HBASE-27053) IOException during caching of uncompressed block to the block cache.

2022-07-07 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564026#comment-17564026
 ] 

Bryan Beaudreault commented on HBASE-27053:
---

I've been digging into this. Relaying some of my findings from comments on the 
PR:

The problem here is that in HFileBlock#allocateBuffer, we are including space 
for the checksums in the unpacked buffer. But then, when we call 
HFileBlockDecodingContext#prepareDecoding, we pass in 
getUncompressedSizeWithoutHeader() as the uncompressedSize, which does not 
include checksum length. This mismatch causes all unpacked HFileBlock buffers 
to have a section of un-set bytes at the end of each buffer.

When using BucketCache, all HFileBlock buffers are allocated through the 
ByteBuffAllocator, which pools them for re-use. When a buffer is pulled from 
the pool, clear() is called on the buffer, but that doesn't zero out the bytes. 
It just resets the position and limit. As a result, if you pull a buffer from 
the pool which previously had bytes in the section of the buffer that happens 
to hold the un-set checksum, you end up with junk.
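
The effect described above can be reproduced with plain java.nio, without any HBase classes. The following is only a minimal sketch: the class name, the constants, and the idea of modelling the pool with a pre-dirtied ByteBuffer are assumptions made for illustration, not code from HBase's ByteBuffAllocator. It shows that clear() leaves old bytes in place, so two "unpacks" of the same data differ only in the trailing region that is never written.

```java
import java.nio.ByteBuffer;

/** Illustrative model only; it does not use HBase's ByteBuffAllocator or HFileBlock. */
final class StaleTailDemo {
  static final int DATA_LEN = 8;      // hypothetical uncompressed data size
  static final int CHECKSUM_LEN = 4;  // hypothetical checksum size

  /** A newly allocated buffer; Java zero-fills it. */
  static ByteBuffer freshBuffer() {
    return ByteBuffer.allocate(DATA_LEN + CHECKSUM_LEN);
  }

  /** A buffer that was used before, so every byte holds leftover content. */
  static ByteBuffer dirtyBuffer() {
    ByteBuffer b = freshBuffer();
    while (b.hasRemaining()) {
      b.put((byte) 0x7f);
    }
    return b;
  }

  static byte[] blockData() {
    return new byte[] {1, 2, 3, 4, 5, 6, 7, 8};
  }

  /** Writes only the data bytes, leaving the checksum region untouched. */
  static ByteBuffer unpackInto(ByteBuffer pooled, byte[] data) {
    pooled.clear();                          // resets position and limit, keeps contents
    pooled.put(data);                        // fills the data region only
    pooled.position(0);
    pooled.limit(DATA_LEN + CHECKSUM_LEN);   // expose data plus the unwritten tail
    return pooled;
  }

  /** Compares data plus the unwritten checksum region. */
  static boolean fullBuffersMatch() {
    ByteBuffer one = unpackInto(freshBuffer(), blockData());
    ByteBuffer two = unpackInto(dirtyBuffer(), blockData());
    return one.equals(two);
  }

  /** Compares the data region only. */
  static boolean dataRegionsMatch() {
    ByteBuffer one = unpackInto(freshBuffer(), blockData());
    ByteBuffer two = unpackInto(dirtyBuffer(), blockData());
    one.limit(DATA_LEN);
    two.limit(DATA_LEN);
    return one.equals(two);
  }

  public static void main(String[] args) {
    System.out.println("full buffers match: " + fullBuffersMatch());
    System.out.println("data regions match: " + dataRegionsMatch());
  }
}
```

In this model the data regions compare equal while the full buffers do not, which mirrors why callers that exclude the checksum section never notice the junk.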

This is happening for all HFileBlock buffers, when BucketCache is enabled. But 
this typically doesn't matter, because the checksum is not used after reading 
from disk. In fact, most usages of the HFileBlock buffer get it through 
HFileBlock#getBufferWithoutHeader(), which excludes the checksum section. 

I think the best fix here would be to make HFileBlock#allocateBuffer not 
include the checksum bytes in the capacity. The only challenge with that is we 
need to then handle the case in getBufferWithoutHeader() so we properly set the 
limit. Currently getBufferWithoutHeader() always passes {{false}} into 
{{getBufferWithoutHeader(boolean withChecksum)}}. I think this would be a 
matter of changing {{false}} to {{!isUnpacked()}}, since packed blocks would 
always have the checksum available and unpacked blocks never would.

So my suggestion ends up being a two-line change:
 * In HFileBlock#allocateBuffer(), remove {{cksumBytes}} from capacity.
 * In HFileBlock#getBufferWithoutHeader(), pass {{!isUnpacked()}} instead of 
{{false}}.

We can validate this functionality with a new test in TestHFileBlockUnpack:
{code:java}
  @Test
  public void itUnpacksIdenticallyEachTime() throws IOException {
    Path path = new Path(TEST_UTIL.getDataTestDir(), name.getMethodName());
    int totalSize = createTestBlock(path);

    // Allocate a bunch of random buffers, so we can be sure that unpack will only
    // have "dirty" buffers to choose from when allocating itself.
    Random random = new Random();
    byte[] temp = new byte[HConstants.DEFAULT_BLOCKSIZE];
    List<ByteBuff> buffs = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
      ByteBuff buff = allocator.allocate(HConstants.DEFAULT_BLOCKSIZE);
      random.nextBytes(temp);
      buff.put(temp);
      buffs.add(buff);
    }

    // Release the dirtied buffers so they go back to the pool.
    buffs.forEach(ByteBuff::release);

    // Read the same block twice. We should expect the underlying buffer below to
    // be identical each time.
    HFileBlock blockOne = readBlock(path, totalSize);
    HFileBlock blockTwo = readBlock(path, totalSize);

    ByteBuff bufferOne = blockOne.getBufferWithoutHeader(true);
    ByteBuff bufferTwo = blockTwo.getBufferWithoutHeader(true);

    // This assertion should succeed, but it fails on master. It will succeed once
    // cksumBytes is removed from HFileBlock#allocateBuffer.
    assertEquals(0,
      ByteBuff.compareTo(bufferOne, 0, bufferOne.limit(), bufferTwo, 0, bufferTwo.limit()));
  }
{code}
This test ensures that reading the same block twice yields byte-identical 
buffer contents. I think this is a good invariant to enforce.

> IOException during caching of uncompressed block to the block cache.
> 
>
> Key: HBASE-27053
> URL: https://issues.apache.org/jira/browse/HBASE-27053
> Project: HBase
>  Issue Type: Bug
>  Components: BlockCache
>Affects Versions: 2.4.12
>Reporter: Sergey Soldatov
>Assignee: Sergey Soldatov
>Priority: Major
>

[jira] [Commented] (HBASE-27053) IOException during caching of uncompressed block to the block cache.

2022-07-02 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17561795#comment-17561795
 ] 

Bryan Beaudreault commented on HBASE-27053:
---

We are seeing this issue, but we don't use prefetch. It similarly happens 
around region splits.

> IOException during caching of uncompressed block to the block cache.
> 
>
> Key: HBASE-27053
> URL: https://issues.apache.org/jira/browse/HBASE-27053
> Project: HBase
>  Issue Type: Bug
>  Components: BlockCache
>Affects Versions: 2.4.12
>Reporter: Sergey Soldatov
>Assignee: Sergey Soldatov
>Priority: Major
>

[jira] [Commented] (HBASE-27053) IOException during caching of uncompressed block to the block cache.

2022-05-19 Thread Sergey Soldatov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539859#comment-17539859
 ] 

Sergey Soldatov commented on HBASE-27053:
-

So, why and when does this happen? The usual scenario is that a region split 
has just happened and the RS is trying to open both daughters and load them 
into the bucket cache. There is a single store that has the split point, so two 
threads are reading the same block and trying to store it in the cache. When we 
decompress the block, the first thing we do is allocate the space:

[https://github.com/apache/hbase/blob/c7eb30d91015de67fb8207ac1818ce2a29dd60a4/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java#L649]

Here we allocate the space for uncompressed data *AND* checksum. But we fill 
the data only, without filling the checksum space, which might still hold 
garbage from the previous usage. So the block is cached with this garbage, and 
that part may not match when we try to store the same block from another 
thread. 
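
To make the failure mode concrete, here is a toy model of a cache that rejects a second insert under the same key when the bytes differ. It is a sketch under stated assumptions: ToyBlockCache, its method names, and the exception type are hypothetical and are not the logic of BlockCacheUtil or BucketCache. It only illustrates why garbage in the unwritten checksum region is enough to trigger a "contents differ" error.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical, simplified cache; not HBase's BucketCache. */
final class ToyBlockCache {
  private final ConcurrentMap<String, byte[]> blocks = new ConcurrentHashMap<>();

  /** Caches a block; a repeated insert under the same key must carry identical bytes. */
  void cacheBlock(String key, byte[] contents) {
    byte[] existing = blocks.putIfAbsent(key, contents.clone());
    if (existing != null && !Arrays.equals(existing, contents)) {
      throw new IllegalStateException("Cached block contents differ for key " + key);
    }
  }

  /** Returns true if caching 'second' after 'first' under one key is rejected. */
  static boolean secondInsertRejected(byte[] first, byte[] second) {
    ToyBlockCache cache = new ToyBlockCache();
    cache.cacheBlock("file_0", first);
    try {
      cache.cacheBlock("file_0", second);
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // Same data bytes (1, 2, 3); only the trailing "checksum" bytes differ.
    byte[] cleanTail = {1, 2, 3, 0, 0};
    byte[] dirtyTail = {1, 2, 3, 9, 9};
    System.out.println("rejected: " + secondInsertRejected(cleanTail, dirtyTail));
  }
}
```

Two threads that decompress the same block into differently dirtied buffers play the roles of the two arrays here: the payload is identical, but the comparison covers the trailing bytes as well.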

Unfortunately, I was unable to create a reasonable unit test for this scenario, 
but there are manual steps to reproduce it:

A single-node cluster with the following configuration tweaks:

hbase.hregion.memstore.flush.size=100 

hbase.hregion.max.filesize=1000 

hbase.bucketcache.ioengine=file:/tmp/hbase_cache

hbase.bucketcache.size=20

hbase.hfile.thread.prefetch=4

 

And the load is generated by:
{noformat}
hbase org.apache.hadoop.hbase.util.LoadTestTool -compression SNAPPY -write 1:10:100 -num_keys 1
{noformat}
Usually, after 700k-1.2m records, the exception mentioned earlier appears in 
the RS log.

So, to solve the problem I would suggest adding code that clears the checksum 
space once decompression is complete. It doesn't look like an optimal solution, 
but right after decompression we don't know whether the checksum space will be 
used or not, so we cannot simply trim the ByteBuff. 
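
The clean-up suggested above could look roughly like the following. This is a sketch with plain java.nio only: ZeroChecksumTailSketch, zeroTail, and decompressInto are hypothetical names, and the "decompression" is just a copy of the data bytes. It is not the change that was made in HFileBlock.

```java
import java.nio.ByteBuffer;

/** Illustrative only; models a reused buffer with java.nio instead of HBase's ByteBuff. */
final class ZeroChecksumTailSketch {

  /** Overwrites every byte from dataEnd up to the buffer's limit with zero. */
  static void zeroTail(ByteBuffer buf, int dataEnd) {
    for (int i = dataEnd; i < buf.limit(); i++) {
      buf.put(i, (byte) 0);  // absolute put; does not move the position
    }
  }

  /** Stand-in for decompression: writes the data, then clears the unwritten tail. */
  static ByteBuffer decompressInto(ByteBuffer pooled, byte[] data) {
    pooled.clear();                 // limit becomes capacity; old bytes remain
    pooled.put(data);
    zeroTail(pooled, data.length);  // make the checksum region deterministic
    pooled.rewind();
    return pooled;
  }

  /** Decompresses the same data into a fresh and a previously used buffer. */
  static boolean reusedBufferMatchesFresh() {
    byte[] data = {1, 2, 3, 4, 5, 6, 7, 8};
    ByteBuffer fresh = ByteBuffer.allocate(12);
    ByteBuffer dirty = ByteBuffer.allocate(12);
    while (dirty.hasRemaining()) {
      dirty.put((byte) 0x7f);       // simulate leftovers from an earlier use
    }
    return decompressInto(fresh, data).equals(decompressInto(dirty, data));
  }

  public static void main(String[] args) {
    System.out.println("buffers match: " + reusedBufferMatchesFresh());
  }
}
```

The extra work is one pass over the checksum-sized tail per block, which is the cost this approach trades for leaving the buffer capacity unchanged.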

> IOException during caching of uncompressed block to the block cache.
> 
>
> Key: HBASE-27053
> URL: https://issues.apache.org/jira/browse/HBASE-27053
> Project: HBase
>  Issue Type: Bug
>  Components: BlockCache
>Affects Versions: 2.4.12
>Reporter: Sergey Soldatov
>Assignee: Sergey Soldatov
>Priority: Major
>
> When prefetch to block cache is enabled and blocks are compressed sometimes 
> caching fails with the exception:
> {noformat}
> 2022-05-18 21:37:29,597 ERROR [RS_OPEN_REGION-regionserver/x1:16020-2] 
> regionserver.HRegion: Could not initialize all stores for the 
> region=cluster_test,,1652935047946.a57ca5f9e7bebb4855a44523063f79c7.
> 2022-05-18 21:37:29,598 WARN  [RS_OPEN_REGION-regionserver/x1:16020-2] 
> regionserver.HRegion: Failed initialize of region= 
> cluster_test,,1652935047946.a57ca5f9e7bebb4855a44523063f79c7., 
> starting to roll back memstore
> java.io.IOException: java.io.IOException: java.lang.RuntimeException: Cached 
> block contents differ, which should not have 
> happened.cacheKey:19307adf1c2248ebb5675116ea640712.c3a21f2005abf308e4a8c9759d4e05fe_0
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1149)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1092)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:996)
>   at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:946)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7240)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7199)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7175)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7134)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7090)
>   at 
> org.apache.hadoop.hbase.regionserver.handler.AssignRegionHandler.process(AssignRegionHandler.java:147)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:100)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: java.lang.RuntimeException: Cached block 
> contents differ, which should not have 
> happened.cacheKey:19307adf1c2248ebb5675116ea640712.c3a21f2005abf308e4a8c9759d4e05fe_0
>   at 
> org.apache.hadoop.hbase.regionserver.StoreEngine.openStoreFiles(StoreEngine.java:294)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreEngine.initialize(StoreEngine.java:344)
>   at