[jira] [Commented] (HBASE-26644) Spurious compaction failures with file tracker
[ https://issues.apache.org/jira/browse/HBASE-26644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17489556#comment-17489556 ]

Duo Zhang commented on HBASE-26644:
-----------------------------------

Ping [~elserj].

> Spurious compaction failures with file tracker
> ----------------------------------------------
>
>                 Key: HBASE-26644
>                 URL: https://issues.apache.org/jira/browse/HBASE-26644
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Compaction
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Major
>
> Noticed when running a basic {{hbase pe randomWrite}}, we'll see compactions failing at various points.
> One example:
> {noformat}
> 2022-01-03 17:41:18,319 ERROR [regionserver/localhost:16020-shortCompactions-0] regionserver.CompactSplit(670): Compaction failed region=TestTable,0004054490,1641249249856.2dc7251c6eceb660b9c7bb0b587db913., storeName=2dc7251c6eceb660b9c7bb0b587db913/info0, priority=6, startTime=1641249666161
> java.io.IOException: Root-level entries already added in single-level mode
>   at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeSingleLevelIndex(HFileBlockIndex.java:1136)
>   at org.apache.hadoop.hbase.io.hfile.CompoundBloomFilterWriter$MetaWriter.write(CompoundBloomFilterWriter.java:279)
>   at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl$1.writeToBlock(HFileWriterImpl.java:713)
>   at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeBlock(HFileBlock.java:1205)
>   at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.close(HFileWriterImpl.java:660)
>   at org.apache.hadoop.hbase.regionserver.StoreFileWriter.close(StoreFileWriter.java:377)
>   at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.commitWriter(DefaultCompactor.java:70)
>   at org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:386)
>   at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:62)
>   at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:125)
>   at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1141)
>   at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2388)
>   at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:654)
>   at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:697)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> This isn't a super-critical issue, because compactions are retried automatically and appear to eventually succeed. However, once the max storefiles limit is reached, this does cause ingest to hang (as it did with my modest configuration).
> We had seen a similar kind of problem in our testing when backporting to HBase 2.4 (not upstream, since the decision was not to do that backport), which we eventually tracked down to a bad merge-conflict resolution against the new HFile cleaner. However, initial investigation does not show the exact same problem here.
> It seems we have some kind of generic race condition. It would be good to add more logging to catch this in the future, since we already have two separate instances of this category of bug.
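For context on the error itself: the message comes from the single-level path of HFileBlockIndex.BlockIndexWriter, which refuses to serialize a one-level index once any entries have been promoted to the root chunk. Below is a minimal, self-contained sketch of that guard; the class, field, and method names are paraphrased for illustration and are not the actual HBase source, but the failure mode is the same.

{code:java}
import java.io.IOException;

/**
 * Simplified sketch (NOT the actual HBase source) of the state check in
 * HFileBlockIndex.BlockIndexWriter that produces the error in the stack
 * trace above. Names are paraphrased for illustration.
 */
class SingleLevelIndexWriterSketch {
  private int rootChunkEntries = 0; // entries already promoted to the root chunk

  // Called when a leaf chunk is flushed and its first key is promoted to the
  // root level; in single-level mode this must never happen before the index
  // is serialized.
  void addRootEntry() {
    rootChunkEntries++;
  }

  // In single-level mode all entries must still be in the leaf-level chunk
  // when the index is written out. A non-empty root chunk means this writer
  // was already finished once (or its state is shared with another writer),
  // which matches the double-close / racing-compaction symptom in this issue.
  void writeSingleLevelIndex() throws IOException {
    if (rootChunkEntries > 0) {
      throw new IOException("Root-level entries already added in single-level mode");
    }
    // ... serialize the single leaf chunk as the root of a one-level index ...
  }
}
{code}

Read against the stack trace, the notable part is that the check fires inside StoreFileWriter.close(), which suggests the writer (or its index/bloom state) is being finalized more than once, rather than entries being genuinely mis-promoted.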
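On the "add more logging" suggestion, below is a hypothetical sketch of the kind of diagnostics that would help; CompactionWriter here is a stand-in interface, not an HBase class, and the wiring into DefaultCompactor.commitWriter is omitted. The idea is simply to record writer identity, output path, store, and thread on a failed close, so that two racing closers of the same writer show up clearly in the logs.

{code:java}
import java.io.IOException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Hypothetical sketch of extra diagnostics around the compaction commit
 * path. CompactionWriter is a stand-in interface, not an HBase class.
 */
class CommitDiagnostics {
  private static final Logger LOG = LoggerFactory.getLogger(CommitDiagnostics.class);

  interface CompactionWriter {
    String outputPath();
    void close() throws IOException;
  }

  static void closeWithDiagnostics(CompactionWriter writer, String storeName)
      throws IOException {
    try {
      writer.close();
    } catch (IOException e) {
      // Log enough identity (writer instance, output path, store, thread) to
      // correlate a failed close with a concurrent compaction or cleaner run.
      LOG.error("Failed to close compaction writer id={} path={} store={} thread={}",
          System.identityHashCode(writer), writer.outputPath(), storeName,
          Thread.currentThread().getName(), e);
      throw e;
    }
  }
}
{code}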
[jira] [Commented] (HBASE-26644) Spurious compaction failures with file tracker
[ https://issues.apache.org/jira/browse/HBASE-26644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17478442#comment-17478442 ]

Duo Zhang commented on HBASE-26644:
-----------------------------------

So should we close this one, or keep an eye on it for a while longer? [~elserj]
[jira] [Commented] (HBASE-26644) Spurious compaction failures with file tracker
[ https://issues.apache.org/jira/browse/HBASE-26644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472375#comment-17472375 ]

Josh Elser commented on HBASE-26644:
------------------------------------

No, sorry. I've been pulled into other things. I'll try to come back to this.
[jira] [Commented] (HBASE-26644) Spurious compaction failures with file tracker
[ https://issues.apache.org/jira/browse/HBASE-26644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17471163#comment-17471163 ]

Duo Zhang commented on HBASE-26644:
-----------------------------------

So any new updates here?