[ 
https://issues.apache.org/jira/browse/HBASE-12949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15329902#comment-15329902
 ] 

stack commented on HBASE-12949:
-------------------------------

Hmm... is there an outstanding issue here? Above it is asked...

bq. So, what you thinking on this issue Jerry He? How comes infinite loop if we 
were supposed to have fallen back on hdfs block? Maybe that didn't happen in 
this case? Or we need to throw a more violent exception?

Why would HDFS not give us a good replica to replace the one that hbase 
checksum said was bad?

> Scanner can be stuck in infinite loop if the HFile is corrupted
> ---------------------------------------------------------------
>
>                 Key: HBASE-12949
>                 URL: https://issues.apache.org/jira/browse/HBASE-12949
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.3, 0.98.10
>            Reporter: Jerry He
>         Attachments: HBASE-12949-master-v2 (1).patch, 
> HBASE-12949-master-v2.patch, HBASE-12949-master-v2.patch, 
> HBASE-12949-master-v2.patch, HBASE-12949-master.patch
>
>
> We've encountered problem where compaction hangs and never completes.
> After looking into it further, we found that the compaction scanner was stuck 
> in a infinite loop. See stack below.
> {noformat}
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:296)
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:257)
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:697)
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekToNextRow(StoreScanner.java:672)
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:529)
> org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:223)
> {noformat}
> We identified the hfile that seems to be corrupted.  Using HFile tool shows 
> the following:
> {noformat}
> [biadmin@hdtest009 bin]$ hbase org.apache.hadoop.hbase.io.hfile.HFile -v -k 
> -m -f 
> /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
> 15/01/23 11:53:17 INFO Configuration.deprecation: hadoop.native.lib is 
> deprecated. Instead, use io.native.lib.available
> 15/01/23 11:53:18 INFO util.ChecksumType: Checksum using 
> org.apache.hadoop.util.PureJavaCrc32
> 15/01/23 11:53:18 INFO util.ChecksumType: Checksum can use 
> org.apache.hadoop.util.PureJavaCrc32C
> 15/01/23 11:53:18 INFO Configuration.deprecation: fs.default.name is 
> deprecated. Instead, use fs.defaultFS
> Scanning -> 
> /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
> WARNING, previous row is greater then current row
>         filename -> 
> /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
>         previous -> 
> \x00/20110203-094231205-79442793-1410161293068203000\x0Aattributes16794406\x00\x00\x01\x00\x00\x00\x00\x00\x00
>         current  ->
> Exception in thread "main" java.nio.BufferUnderflowException
>         at java.nio.Buffer.nextGetIndex(Buffer.java:489)
>         at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:347)
>         at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.readKeyValueLen(HFileReaderV2.java:856)
>         at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:768)
>         at 
> org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.scanKeysValues(HFilePrettyPrinter.java:362)
>         at 
> org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:262)
>         at 
> org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:220)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at 
> org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.main(HFilePrettyPrinter.java:539)
>         at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:802)
> {noformat}
> Turning on Java Assert shows the following:
> {noformat}
> Exception in thread "main" java.lang.AssertionError: Key 
> 20110203-094231205-79442793-1410161293068203000/attributes:16794406/1099511627776/Minimum/vlen=15/mvcc=0
>  followed by a smaller key //0/Minimum/vlen=0/mvcc=0 in cf attributes
>         at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.checkScanOrder(StoreScanner.java:672)
> {noformat}
> It shows that the hfile seems to be corrupted -- the keys don't seem to be 
> right.
> But Scanner is not able to give a meaningful error, but stuck in an infinite 
> loop in here:
> {code}
> KeyValueHeap.generalizedSeek()
> while ((scanner = heap.poll()) != null) {
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to