[ 
https://issues.apache.org/jira/browse/HBASE-12949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313108#comment-14313108
 ] 

Jerry He commented on HBASE-12949:
----------------------------------

Hi, [~stack]

bq. In the KV constructor, it is going to decode sizes anyway as part of the 
parse? Check at this point rather than apart in the CellUtil I suggested above?
I looked through the constructors and methods in KeyValue and their use.  It 
looks like the decoding of the bytes for row length, family length, and type is 
done as needed by the upper callers at the time of use.  It also looks like we 
cannot do the checking in the getters at conversion time, because they are 
called on the artificial/fake keys as well.
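
To illustrate why a getter-level check would misfire: a seek ("fake") key is serialized in the same key layout but legitimately carries zero-length fields. The sketch below is purely illustrative (plain Java, no HBase classes; the helper name firstOnRow is mine) and only mimics the key layout (rowLen:short, row, famLen:byte, family, qualifier, timestamp:long, type:byte):

```java
import java.nio.ByteBuffer;

public class FakeKeyDemo {
    // Illustrative only: serialize a "first possible key on a row" style seek
    // key. Its family, qualifier, and value are all legitimately empty, so a
    // getter that rejected zero lengths would reject valid seek keys too.
    public static byte[] firstOnRow(byte[] row) {
        ByteBuffer b = ByteBuffer.allocate(2 + row.length + 1 + 8 + 1);
        b.putShort((short) row.length);  // row length
        b.put(row);                      // row bytes
        b.put((byte) 0);                 // family length 0: no family/qualifier
        b.putLong(Long.MAX_VALUE);       // latest-timestamp-style sentinel
        b.put((byte) 0);                 // type byte (sentinel value)
        return b.array();
    }

    public static void main(String[] args) {
        byte[] key = firstOnRow("row1".getBytes());
        // Family length sits right after the 2-byte row length and the row.
        System.out.println("family length = " + key[2 + 4]); // prints 0
    }
}
```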

I added a v2 of the patch, which only enhances the check of keyLength. It is 
not a complete fix, but it does incremental good :-) and is less controversial. 
It caught the corruption case I encountered for this JIRA:

{noformat}
Exception in thread "main" java.lang.IllegalStateException: Invalid currKeyLen 0 or currValueLen 0. Block offset: 2853145836, block length: 65580, position: 29457 (without header).
        at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.readKeyValueLen(HFileReaderV2.java:884)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:790)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:136)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:108)
        at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:507)
        at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:223)
        at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:77)
{noformat}
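
For illustration, here is a minimal sketch of the kind of bounds check readKeyValueLen() can apply right after decoding the two length ints. The method name checkKeyValueLen and its exact conditions are hypothetical, not the actual patch:

```java
public class KeyValueLenCheck {
    // Hypothetical sketch: validate the decoded key/value lengths against the
    // block bounds before trusting them. A corrupted block can yield zero or
    // negative lengths, which otherwise send the scanner into an infinite loop.
    public static void checkKeyValueLen(int keyLen, int valueLen,
                                        long blockOffset, int blockLen, int pos) {
        if (keyLen <= 0 || valueLen < 0
                || (long) pos + keyLen + valueLen > blockLen) {
            throw new IllegalStateException("Invalid currKeyLen " + keyLen
                + " or currValueLen " + valueLen + ". Block offset: " + blockOffset
                + ", block length: " + blockLen + ", position: " + pos
                + " (without header).");
        }
    }

    public static void main(String[] args) {
        checkKeyValueLen(24, 100, 0L, 65580, 0); // sane lengths: passes
        try {
            checkKeyValueLen(0, 0, 2853145836L, 65580, 29457); // corrupted: throws
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Failing fast at this point turns a hang into an immediate exception that names the corrupted block.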
  

> Scanner can be stuck in infinite loop if the HFile is corrupted
> ---------------------------------------------------------------
>
>                 Key: HBASE-12949
>                 URL: https://issues.apache.org/jira/browse/HBASE-12949
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.3, 0.98.10
>            Reporter: Jerry He
>         Attachments: HBASE-12949-master-v2.patch, HBASE-12949-master.patch
>
>
> We've encountered a problem where compaction hangs and never completes.
> After looking into it further, we found that the compaction scanner was stuck 
> in an infinite loop. See the stack below.
> {noformat}
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:296)
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:257)
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:697)
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekToNextRow(StoreScanner.java:672)
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:529)
> org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:223)
> {noformat}
> We identified the hfile that seems to be corrupted.  Using the HFile tool 
> shows the following:
> {noformat}
> [biadmin@hdtest009 bin]$ hbase org.apache.hadoop.hbase.io.hfile.HFile -v -k -m -f /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
> 15/01/23 11:53:17 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
> 15/01/23 11:53:18 INFO util.ChecksumType: Checksum using org.apache.hadoop.util.PureJavaCrc32
> 15/01/23 11:53:18 INFO util.ChecksumType: Checksum can use org.apache.hadoop.util.PureJavaCrc32C
> 15/01/23 11:53:18 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
> Scanning -> /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
> WARNING, previous row is greater then current row
>         filename -> /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
>         previous -> \x00/20110203-094231205-79442793-1410161293068203000\x0Aattributes16794406\x00\x00\x01\x00\x00\x00\x00\x00\x00
>         current  ->
> Exception in thread "main" java.nio.BufferUnderflowException
>         at java.nio.Buffer.nextGetIndex(Buffer.java:489)
>         at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:347)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.readKeyValueLen(HFileReaderV2.java:856)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:768)
>         at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.scanKeysValues(HFilePrettyPrinter.java:362)
>         at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:262)
>         at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:220)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.main(HFilePrettyPrinter.java:539)
>         at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:802)
> {noformat}
> Turning on Java asserts shows the following:
> {noformat}
> Exception in thread "main" java.lang.AssertionError: Key 20110203-094231205-79442793-1410161293068203000/attributes:16794406/1099511627776/Minimum/vlen=15/mvcc=0 followed by a smaller key //0/Minimum/vlen=0/mvcc=0 in cf attributes
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.checkScanOrder(StoreScanner.java:672)
> {noformat}
> It shows that the hfile is corrupted: the keys are not right.
> But the Scanner is not able to give a meaningful error; instead it gets stuck 
> in an infinite loop here:
> {code}
> // in KeyValueHeap.generalizedSeek():
> while ((scanner = heap.poll()) != null) {
>   ...
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)