[ 
https://issues.apache.org/jira/browse/HBASE-12949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-12949:
-----------------------------
    Description: 
We've encountered a problem where compaction hangs and never completes.
After looking into it further, we found that the compaction scanner was stuck 
in an infinite loop. See the stack below.
{noformat}
org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:296)
org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:257)
org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:697)
org.apache.hadoop.hbase.regionserver.StoreScanner.seekToNextRow(StoreScanner.java:672)
org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:529)
org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:223)
{noformat}

We identified the hfile that seems to be corrupted. Running the HFile tool on 
it shows the following:
{noformat}
[biadmin@hdtest009 bin]$ hbase org.apache.hadoop.hbase.io.hfile.HFile -v -k -m -f /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
15/01/23 11:53:17 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
15/01/23 11:53:18 INFO util.ChecksumType: Checksum using org.apache.hadoop.util.PureJavaCrc32
15/01/23 11:53:18 INFO util.ChecksumType: Checksum can use org.apache.hadoop.util.PureJavaCrc32C
15/01/23 11:53:18 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
Scanning -> /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
WARNING, previous row is greater then current row
        filename -> /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
        previous -> \x00/20110203-094231205-79442793-1410161293068203000\x0Aattributes16794406\x00\x00\x01\x00\x00\x00\x00\x00\x00
        current  ->
Exception in thread "main" java.nio.BufferUnderflowException
        at java.nio.Buffer.nextGetIndex(Buffer.java:489)
        at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:347)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.readKeyValueLen(HFileReaderV2.java:856)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:768)
        at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.scanKeysValues(HFilePrettyPrinter.java:362)
        at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:262)
        at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:220)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.main(HFilePrettyPrinter.java:539)
        at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:802)
{noformat}

Turning on Java assertions shows the following:
{noformat}
Exception in thread "main" java.lang.AssertionError: Key 20110203-094231205-79442793-1410161293068203000/attributes:16794406/1099511627776/Minimum/vlen=15/mvcc=0 followed by a smaller key //0/Minimum/vlen=0/mvcc=0 in cf attributes
        at org.apache.hadoop.hbase.regionserver.StoreScanner.checkScanOrder(StoreScanner.java:672)
{noformat}
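
For context on why this only surfaced with assertions turned on: a Java 
`assert` statement is skipped unless the JVM is started with -ea 
(-enableassertions). A minimal sketch of an ordering check that always runs is 
shown below. The class and method names are hypothetical, keys are plain 
strings rather than KeyValues, and this is not the HBase implementation.

```java
import java.io.IOException;

final class ScanOrderCheck {
    /**
     * Hypothetical always-on ordering check (not HBase code).
     * Throws when the current key sorts before the previous one.
     */
    static void checkOrder(String previous, String current) throws IOException {
        // An `assert previous.compareTo(current) <= 0;` in this position would
        // be skipped unless the JVM runs with -ea, so the check is explicit.
        if (previous != null && previous.compareTo(current) > 0) {
            throw new IOException("Key '" + previous
                    + "' followed by a smaller key '" + current + "'");
        }
    }

    public static void main(String[] args) {
        try {
            checkOrder("row1", "row2"); // in order, passes silently
            checkOrder("row2", "");     // out of order, throws
        } catch (IOException e) {
            System.out.println("detected: " + e.getMessage());
        }
    }
}
```

The trade-off is one extra comparison per key on the scan path, in exchange 
for an error that is reported even when assertions are off.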

This suggests that the hfile is corrupted: the keys don't seem to be in 
sorted order.
However, the scanner does not report a meaningful error. Instead, it gets 
stuck in an infinite loop here:
{code}
// In KeyValueHeap.generalizedSeek():
while ((scanner = heap.poll()) != null) {
  // ...
}
{code}
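
To illustrate one plausible way a poll loop like this can fail to terminate, 
here is a simplified, self-contained sketch. It is not the HBase code: 
SeekLoopSketch, Scanner, ListScanner, StuckScanner, heapOf and seekAll are all 
hypothetical names, and keys are plain strings. The assumption is that a 
scanner over a damaged file reports a successful seek while still sitting on a 
key smaller than the seek key, so it is re-added to the heap and polled again. 
The check marked GUARD turns that situation into an IOException.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

final class SeekLoopSketch {
    /** Hypothetical, minimal stand-in for a scanner over one store file. */
    interface Scanner {
        String peek();               // current key, or null when exhausted
        boolean seek(String target); // true if the scanner is positioned on a key
    }

    /** Well-behaved scanner over keys that are assumed to be sorted. */
    static final class ListScanner implements Scanner {
        private final List<String> keys;
        private int pos;
        ListScanner(String... keys) { this.keys = Arrays.asList(keys); }
        public String peek() { return pos < keys.size() ? keys.get(pos) : null; }
        public boolean seek(String target) {
            while (pos < keys.size() && keys.get(pos).compareTo(target) < 0) pos++;
            return pos < keys.size();
        }
    }

    /** Simulates a damaged file: seek claims success but never advances. */
    static final class StuckScanner implements Scanner {
        private final String key;
        StuckScanner(String key) { this.key = key; }
        public String peek() { return key; }
        public boolean seek(String target) { return true; }
    }

    /** Builds a heap ordered by each scanner's current key (must be non-null). */
    static PriorityQueue<Scanner> heapOf(Scanner... scanners) {
        PriorityQueue<Scanner> heap =
                new PriorityQueue<>((a, b) -> a.peek().compareTo(b.peek()));
        heap.addAll(Arrays.asList(scanners));
        return heap;
    }

    /** Returns the first key at or after target, or null if none exists. */
    static String seekAll(PriorityQueue<Scanner> heap, String target) throws IOException {
        Scanner scanner;
        while ((scanner = heap.poll()) != null) {
            String top = scanner.peek();
            if (top != null && target.compareTo(top) <= 0) {
                heap.add(scanner); // smallest key is already at or after target
                return top;
            }
            if (!scanner.seek(target)) {
                continue; // scanner is exhausted, drop it
            }
            String after = scanner.peek();
            // GUARD: a successful seek must not leave the scanner before the
            // target. Without this check the scanner would be re-added and
            // polled again with the same key, and the loop would never end.
            if (after == null || after.compareTo(target) < 0) {
                throw new IOException("Scanner positioned at '" + after
                        + "' after seeking to '" + target + "'; file may be corrupt");
            }
            heap.add(scanner);
        }
        return null; // every scanner is exhausted
    }

    public static void main(String[] args) throws IOException {
        System.out.println(seekAll(
                heapOf(new ListScanner("a", "c"), new ListScanner("b", "d")), "c"));
        try {
            seekAll(heapOf(new StuckScanner("")), "row1");
        } catch (IOException e) {
            System.out.println("detected: " + e.getMessage());
        }
    }
}
```

Whether the real loop spins for exactly this reason would need to be confirmed 
against KeyValueHeap itself. The sketch only shows that a progress check after 
each seek is enough to convert a silent hang into a reportable error.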

> Scanner can be stuck in infinite loop if the HFile is corrupted
> ---------------------------------------------------------------
>
>                 Key: HBASE-12949
>                 URL: https://issues.apache.org/jira/browse/HBASE-12949
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.3, 0.98.10
>            Reporter: Jerry He



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
