[jira] [Updated] (HBASE-12949) Scanner can be stuck in infinite loop if the HFile is corrupted
[ https://issues.apache.org/jira/browse/HBASE-12949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jerry He updated HBASE-12949:
-----------------------------
    Resolution: Fixed
    Assignee: Jerry He
    Hadoop Flags: Reviewed
    Fix Version/s: 1.4.0
                   2.0.0
    Status: Resolved (was: Patch Available)

> Scanner can be stuck in infinite loop if the HFile is corrupted
> ---------------------------------------------------------------
>
> Key: HBASE-12949
> URL: https://issues.apache.org/jira/browse/HBASE-12949
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.94.3, 0.98.10
> Reporter: Jerry He
> Assignee: Jerry He
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-12949-branch-1-v3.patch, HBASE-12949-master-v2 (1).patch, HBASE-12949-master-v2.patch, HBASE-12949-master-v2.patch, HBASE-12949-master-v2.patch, HBASE-12949-master-v3.patch, HBASE-12949-master.patch
>
>
> We've encountered a problem where compaction hangs and never completes.
> After looking into it further, we found that the compaction scanner was stuck in an infinite loop. See the stack below.
> {noformat}
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:296)
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:257)
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:697)
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekToNextRow(StoreScanner.java:672)
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:529)
> org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:223)
> {noformat}
> We identified the hfile that seems to be corrupted. Using the HFile tool shows the following:
> {noformat}
> [biadmin@hdtest009 bin]$ hbase org.apache.hadoop.hbase.io.hfile.HFile -v -k -m -f /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
> 15/01/23 11:53:17 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
> 15/01/23 11:53:18 INFO util.ChecksumType: Checksum using org.apache.hadoop.util.PureJavaCrc32
> 15/01/23 11:53:18 INFO util.ChecksumType: Checksum can use org.apache.hadoop.util.PureJavaCrc32C
> 15/01/23 11:53:18 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
> Scanning -> /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
> WARNING, previous row is greater then current row
> filename -> /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
> previous -> \x00/20110203-094231205-79442793-1410161293068203000\x0Aattributes16794406\x00\x00\x01\x00\x00\x00\x00\x00\x00
> current ->
> Exception in thread "main" java.nio.BufferUnderflowException
> at java.nio.Buffer.nextGetIndex(Buffer.java:489)
> at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:347)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.readKeyValueLen(HFileReaderV2.java:856)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:768)
> at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.scanKeysValues(HFilePrettyPrinter.java:362)
> at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:262)
> at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:220)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.main(HFilePrettyPrinter.java:539)
> at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:802)
> {noformat}
> Turning on Java assertions shows the following:
> {noformat}
> Exception in thread "main" java.lang.AssertionError: Key 20110203-094231205-79442793-1410161293068203000/attributes:16794406/1099511627776/Minimum/vlen=15/mvcc=0 followed by a smaller key //0/Minimum/vlen=0/mvcc=0 in cf attributes
> at org.apache.hadoop.hbase.regionserver.StoreScanner.checkScanOrder(StoreScanner.java:672)
> {noformat}
> This indicates that the hfile is corrupted: the keys are out of order.
> However, the scanner does not report a meaningful error. Instead it gets stuck in an infinite loop here:
> {code}
> KeyValueHeap.generalizedSeek()
> while ((scanner = heap.poll()) != null) {
> }
> {code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
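The report above notes that the out-of-order key only surfaces as an AssertionError when Java assertions are enabled. As a rough illustration of the alternative, here is a minimal sketch of an ordering check that runs unconditionally and raises an IOException. It is not HBase code and not the committed fix: the class OrderCheckingScanner, its next() method, and the use of plain byte[] keys with unsigned lexicographic ordering are all assumptions made for the example.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper, not part of HBase: fails fast when keys go backwards. */
class OrderCheckingScanner {
    private final Iterator<byte[]> keys;
    private byte[] previous; // last key returned, null before the first call

    OrderCheckingScanner(Iterator<byte[]> keys) {
        this.keys = keys;
    }

    /** Returns the next key, or null at end of input. */
    byte[] next() throws IOException {
        if (!keys.hasNext()) {
            return null;
        }
        byte[] current = keys.next();
        // Unsigned lexicographic comparison (an assumption about key ordering).
        if (previous != null && Arrays.compareUnsigned(previous, current) > 0) {
            throw new IOException("Key " + Arrays.toString(previous)
                + " followed by a smaller key " + Arrays.toString(current));
        }
        previous = current;
        return current;
    }

    public static void main(String[] args) {
        // The trailing empty key imitates the empty key seen in the corrupted file.
        List<byte[]> corrupted = List.of(new byte[] {2}, new byte[] {3}, new byte[0]);
        OrderCheckingScanner scanner = new OrderCheckingScanner(corrupted.iterator());
        try {
            while (scanner.next() != null) {
                // consume keys until the end or the first ordering violation
            }
            System.out.println("scan completed");
        } catch (IOException e) {
            System.out.println("scan aborted: " + e.getMessage());
        }
    }
}
```

A checked exception is used here so that a caller such as a compaction would have to handle the failure explicitly rather than keep seeking.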
[jira] [Updated] (HBASE-12949) Scanner can be stuck in infinite loop if the HFile is corrupted
Jerry He updated HBASE-12949:
    Attachment: HBASE-12949-branch-1-v3.patch
[jira] [Updated] (HBASE-12949) Scanner can be stuck in infinite loop if the HFile is corrupted
Jerry He updated HBASE-12949:
    Attachment: HBASE-12949-master-v3.patch

v3 re-based with current master.
[jira] [Updated] (HBASE-12949) Scanner can be stuck in infinite loop if the HFile is corrupted
stack updated HBASE-12949:
    Attachment: HBASE-12949-master-v2.patch

Retry. We never committed this [~jerryhe]?
[jira] [Updated] (HBASE-12949) Scanner can be stuck in infinite loop if the HFile is corrupted
stack updated HBASE-12949:
    Attachment: HBASE-12949-master-v2 (1).patch

Ok [~jerryhe] Sounds like it'll be pretty obvious that there is a 'bad section' in a region... but the bad section won't bring down the cluster (smile). Let me try hadoopqa again. Will commit unless objection.
[jira] [Updated] (HBASE-12949) Scanner can be stuck in infinite loop if the HFile is corrupted
stack updated HBASE-12949:
    Attachment: HBASE-12949-master-v2.patch

Retry. LGTM +1 Just a few length checks. Not too bad.
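The review comment above describes the patch as "just a few length checks". The sketch below shows what such a check might look like when reading a cell header from a block buffer. It is illustrative only and is not the HBASE-12949 patch: the class KeyValueLengthCheck, the layout of two 4-byte lengths followed by the key and value bytes, and the choice of IllegalStateException are assumptions.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch, not the actual HBASE-12949 patch. */
class KeyValueLengthCheck {
    /**
     * Reads a 4-byte key length and a 4-byte value length at the buffer's
     * current position and validates them against the bytes that remain.
     * The position is restored before returning.
     */
    static int[] readKeyValueLen(ByteBuffer block) {
        if (block.remaining() < 2 * Integer.BYTES) {
            throw new IllegalStateException("Truncated cell header: only "
                + block.remaining() + " bytes left at position " + block.position());
        }
        int start = block.position();
        int keyLen = block.getInt();
        int valueLen = block.getInt();
        int available = block.remaining();
        block.position(start);
        // Long arithmetic so that two large ints cannot overflow when added.
        if (keyLen < 0 || valueLen < 0 || (long) keyLen + valueLen > available) {
            throw new IllegalStateException("Invalid keyLen=" + keyLen
                + " valueLen=" + valueLen + " with " + available
                + " bytes available at position " + start);
        }
        return new int[] {keyLen, valueLen};
    }

    public static void main(String[] args) {
        ByteBuffer good = ByteBuffer.allocate(13);
        good.putInt(3).putInt(2).put(new byte[] {'r', 'o', 'w', 'v', '1'});
        good.flip();
        int[] lens = readKeyValueLen(good);
        System.out.println("keyLen=" + lens[0] + " valueLen=" + lens[1]);

        ByteBuffer bad = ByteBuffer.allocate(8);
        bad.putInt(0).putInt(1 << 20); // claims 1 MiB of value with no data behind it
        bad.flip();
        try {
            readKeyValueLen(bad);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Checking the remaining byte count before calling getInt() turns the BufferUnderflowException seen in the HFile tool output into an error that names the offending lengths and position.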
[jira] [Updated] (HBASE-12949) Scanner can be stuck in infinite loop if the HFile is corrupted
Jerry He updated HBASE-12949:
    Attachment: HBASE-12949-master-v2.patch
[jira] [Updated] (HBASE-12949) Scanner can be stuck in infinite loop if the HFile is corrupted
[ https://issues.apache.org/jira/browse/HBASE-12949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jerry He updated HBASE-12949:
- Status: Patch Available (was: Open)

Scanner can be stuck in infinite loop if the HFile is corrupted
---

Key: HBASE-12949
URL: https://issues.apache.org/jira/browse/HBASE-12949
Project: HBase
Issue Type: Bug
Affects Versions: 0.98.10, 0.94.3
Reporter: Jerry He
Attachments: HBASE-12949-master.patch

We've encountered a problem where compaction hangs and never completes.
After looking into it further, we found that the compaction scanner was stuck in an infinite loop. See the stack below.
{noformat}
org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:296)
org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:257)
org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:697)
org.apache.hadoop.hbase.regionserver.StoreScanner.seekToNextRow(StoreScanner.java:672)
org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:529)
org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:223)
{noformat}
We identified the hfile that seems to be corrupted. Running the HFile tool on it shows the following:
{noformat}
[biadmin@hdtest009 bin]$ hbase org.apache.hadoop.hbase.io.hfile.HFile -v -k -m -f /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
15/01/23 11:53:17 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
15/01/23 11:53:18 INFO util.ChecksumType: Checksum using org.apache.hadoop.util.PureJavaCrc32
15/01/23 11:53:18 INFO util.ChecksumType: Checksum can use org.apache.hadoop.util.PureJavaCrc32C
15/01/23 11:53:18 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
Scanning - /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
WARNING, previous row is greater then current row
filename - /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
previous - \x00/20110203-094231205-79442793-1410161293068203000\x0Aattributes16794406\x00\x00\x01\x00\x00\x00\x00\x00\x00
current -
Exception in thread main java.nio.BufferUnderflowException
    at java.nio.Buffer.nextGetIndex(Buffer.java:489)
    at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:347)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.readKeyValueLen(HFileReaderV2.java:856)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:768)
    at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.scanKeysValues(HFilePrettyPrinter.java:362)
    at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:262)
    at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:220)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.main(HFilePrettyPrinter.java:539)
    at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:802)
{noformat}
Turning on Java assertions shows the following:
{noformat}
Exception in thread main java.lang.AssertionError: Key 20110203-094231205-79442793-1410161293068203000/attributes:16794406/1099511627776/Minimum/vlen=15/mvcc=0 followed by a smaller key //0/Minimum/vlen=0/mvcc=0 in cf attributes
    at org.apache.hadoop.hbase.regionserver.StoreScanner.checkScanOrder(StoreScanner.java:672)
{noformat}
This suggests that the hfile is corrupted -- the keys don't seem to be right.
However, the scanner does not report a meaningful error; instead it gets stuck in an infinite loop here:
{code}
KeyValueHeap.generalizedSeek()
while ((scanner = heap.poll()) != null) {
}
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
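One way such a seek loop can spin is sketched below. This is a simplified model, not the HBase code and not the attached patch: SimpleScanner, SortedScanner, CorruptScanner, newHeap and seekAll are hypothetical names, and keys are plain strings. It assumes the stall comes from a scanner that reports a successful seek while its current key is still smaller than the seek key, so the scanner is put back on the heap and polled again indefinitely. The guard turns that situation into an IOException.

```java
import java.io.IOException;
import java.util.Comparator;
import java.util.PriorityQueue;

final class SeekLoopSketch {
    /** Hypothetical stand-in for a store file scanner; not the HBase interface. */
    interface SimpleScanner {
        String peek();             // current key, or null when exhausted
        boolean seek(String key);  // should position at the first key >= key
    }

    /** Well-behaved scanner over keys that really are sorted. */
    static final class SortedScanner implements SimpleScanner {
        private final String[] keys;
        private int pos;
        SortedScanner(String... keys) { this.keys = keys; }
        public String peek() { return pos < keys.length ? keys[pos] : null; }
        public boolean seek(String key) {
            while (pos < keys.length && keys[pos].compareTo(key) < 0) pos++;
            return pos < keys.length;
        }
    }

    /** Models a corrupt file: seek claims success but the current key stays empty. */
    static final class CorruptScanner implements SimpleScanner {
        public String peek() { return ""; }
        public boolean seek(String key) { return true; }
    }

    /** Heap ordered by each scanner's current key; only add scanners with a non-null key. */
    static PriorityQueue<SimpleScanner> newHeap() {
        Comparator<SimpleScanner> byCurrentKey = Comparator.comparing(SimpleScanner::peek);
        return new PriorityQueue<>(byCurrentKey);
    }

    /** Advances scanners until the heap top is at or past seekKey; false if all are exhausted. */
    static boolean seekAll(PriorityQueue<SimpleScanner> heap, String seekKey) throws IOException {
        SimpleScanner scanner;
        while ((scanner = heap.poll()) != null) {
            String top = scanner.peek();
            if (top == null) continue;              // exhausted, drop it
            if (top.compareTo(seekKey) >= 0) {      // smallest key is already far enough
                heap.add(scanner);
                return true;
            }
            if (!scanner.seek(seekKey)) continue;   // nothing at or after seekKey
            String after = scanner.peek();
            // Guard: without this check, a scanner that never reaches seekKey
            // would be re-added and polled again on every iteration.
            if (after == null || after.compareTo(seekKey) < 0) {
                throw new IOException("Scanner did not reach '" + seekKey
                        + "', current key is '" + after + "'; the file may be corrupt");
            }
            heap.add(scanner);
        }
        return false;
    }

    public static void main(String[] args) {
        PriorityQueue<SimpleScanner> heap = newHeap();
        heap.add(new CorruptScanner());
        try {
            seekAll(heap, "b");
        } catch (IOException e) {
            System.out.println("detected: " + e.getMessage());
        }
    }
}
```

Removing the guard and running main makes the loop poll and re-add the same scanner without end, which is the shape of the hang in the stack above under this assumption.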
[jira] [Updated] (HBASE-12949) Scanner can be stuck in infinite loop if the HFile is corrupted
[ https://issues.apache.org/jira/browse/HBASE-12949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jerry He updated HBASE-12949:
- Attachment: HBASE-12949-master.patch

Attached a patch to see if you folks are OK with the approach.
Here is what shows up with the bad hfile after the patch.
In the region server log, with an aborted compaction:
{noformat}
2015-02-04 13:57:39,077 ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction failed Request = regionName=CUMMINS_INSITE_V1,20110207-105743558-21939316-1406327439200524000,1422756983729.bc8e4b3996d2424f21dc0cfdcd422a6b., storeName=attributes, fileCount=1, fileSize=6.4 G (6.4 G), priority=9, time=1423087053717124000
java.io.IOException: Could not iterate StoreFileScanner[org.apache.hadoop.hbase.io.HalfStoreFileReader$1@714e714e, cur=20110208-080219433-21950204-1397112048924811000/attributes:1015319_1010319/1397120694918/Put/vlen=15/mvcc=0]
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:142)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:108)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:507)
    at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:223)
    at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:77)
    at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:110)
    at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1099)
    at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1483)
    at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:506)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:906)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:929)
    at java.lang.Thread.run(Thread.java:738)
Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Invalid type 0
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.getKeyValue(HFileReaderV2.java:695)
    at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.getKeyValue(HalfStoreFileReader.java:149)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:137)
    ... 11 more
{noformat}
Doing a get from the shell:
{noformat}
hbase(main):002:0 get 'CUMMINS_INSITE_V1', '20110208-080219433-21950204-1397112048924811000'
COLUMN CELL
ERROR: java.io.IOException: Could not iterate StoreFileScanner[org.apache.hadoop.hbase.io.HalfStoreFileReader$1@70837083, cur=20110208-080219433-21950204-1397112048924811000/attributes:1015319_1010319/1397120694918/Put/vlen=15/mvcc=0]
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:142)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:108)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:507)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:140)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3992)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:4072)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3950)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3919)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3906)
    at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4882)
    at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4856)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2951)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29937)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:110)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:90)
    at java.lang.Thread.run(Thread.java:738)
Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Invalid type 0
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.getKeyValue(HFileReaderV2.java:695)
    at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.getKeyValue(HalfStoreFileReader.java:149)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:137)
    ...
{noformat}
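The "Invalid type 0" error in the traces above points at a check made while a cell is decoded. A minimal sketch of that kind of check follows. It is not the attached patch: CellCheckSketch, CorruptCellException, checkCell and encode are hypothetical names, the byte layout ([int keyLen][int valueLen][key][value], with the type code as the last key byte) is a simplification, and the two accepted type codes are illustrative only.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

final class CellCheckSketch {
    /** Hypothetical exception type; the trace above uses HBase's CorruptHFileException. */
    static final class CorruptCellException extends IOException {
        CorruptCellException(String message) { super(message); }
    }

    // Illustrative type codes for this sketch; any other value is treated as corruption.
    static final byte TYPE_PUT = 4;
    static final byte TYPE_DELETE = 8;

    /**
     * Validates the cell that starts at buf.position() without moving the position.
     * Assumed layout: [int keyLen][int valueLen][key bytes][value bytes],
     * where the last key byte is the type code.
     */
    static void checkCell(ByteBuffer buf) throws CorruptCellException {
        int start = buf.position();
        if (buf.remaining() < 8) {
            throw new CorruptCellException("Truncated length header at offset " + start);
        }
        int keyLen = buf.getInt(start);
        int valueLen = buf.getInt(start + 4);
        // Reject lengths that are negative or that run past the end of the block.
        if (keyLen <= 0 || valueLen < 0
                || (long) start + 8 + keyLen + valueLen > buf.limit()) {
            throw new CorruptCellException("Invalid lengths key=" + keyLen
                    + " value=" + valueLen + " at offset " + start);
        }
        byte type = buf.get(start + 8 + keyLen - 1);
        if (type != TYPE_PUT && type != TYPE_DELETE) {
            throw new CorruptCellException("Invalid type " + type);
        }
    }

    /** Builds one cell in the assumed layout; keyPrefix is the key without its type byte. */
    static ByteBuffer encode(byte[] keyPrefix, byte type, byte[] value) {
        ByteBuffer buf = ByteBuffer.allocate(8 + keyPrefix.length + 1 + value.length);
        buf.putInt(keyPrefix.length + 1).putInt(value.length);
        buf.put(keyPrefix).put(type).put(value);
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        try {
            checkCell(encode(new byte[] {'r'}, (byte) 0, new byte[] {1}));
        } catch (CorruptCellException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing at decode time means a compaction or a get sees an exception that names the problem, rather than a scanner that keeps returning keys the heap cannot order.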
[jira] [Updated] (HBASE-12949) Scanner can be stuck in infinite loop if the HFile is corrupted
[ https://issues.apache.org/jira/browse/HBASE-12949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jerry He updated HBASE-12949:
- Description:
[jira] [Updated] (HBASE-12949) Scanner can be stuck in infinite loop if the HFile is corrupted
[ https://issues.apache.org/jira/browse/HBASE-12949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jerry He updated HBASE-12949:
- Affects Version/s: 0.94.3