This is a good one. We have discussed checking HFiles, but a new tool would have to be written: running the current HFile tool and stepping through thousands of store files would add quite a bit of overhead. You could script the HFile tool to find the corrupt HFile; I suspect you already have an idea of the offending region. Other than what Matteo posted, I am unaware of another way to fix it. I wonder whether all three copies are messed up. Did you run an fsck?
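For what it's worth, here is a rough sketch of what scripting the HFile tool could look like. It assumes the `hbase` launcher is on the PATH and that you already have a list of store file paths for the suspect region; the helper names and the list of failure markers (lifted from the stack traces in this thread) are mine, not part of HBase. It only detects, it does not repair anything.

```python
import subprocess

# Failure markers copied from the stack traces quoted in this thread.
# This list is an assumption and is not exhaustive.
CORRUPTION_MARKERS = (
    "CorruptHFileException",
    "Invalid HFile version",
    "NullPointerException",
)


def hfile_command(path):
    """Build the HFile tool invocation that prints only the file's metadata."""
    return ["hbase", "org.apache.hadoop.hbase.io.hfile.HFile", "-m", "-f", path]


def looks_corrupt(output):
    """Return True if the tool output contains one of the known failure markers."""
    return any(marker in output for marker in CORRUPTION_MARKERS)


def find_corrupt(paths, run=None):
    """Run the HFile tool on each path; return the paths whose output looks corrupt.

    `run` takes a command list and returns its combined stdout/stderr as text.
    It is injectable so the scan logic can be exercised without a cluster.
    """
    if run is None:
        def run(cmd):
            proc = subprocess.run(
                cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                universal_newlines=True,
            )
            return proc.stdout
    return [p for p in paths if looks_corrupt(run(hfile_command(p)))]
```

Each invocation starts a JVM, so this is only practical when limited to the store files of the region you already suspect, not the whole table.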

On Mon, Aug 26, 2013 at 3:31 PM, Jean-Marc Spaggiari < [email protected]> wrote:

> Hi,
>
> Don't ask me how, but I have one table in a pretty strange state.
>
> First, it seems that I have one corrupted HFile (at least).
>
> FirstKey returns null. (Same for StopKey, since the header is corrupted.)
>
> Exception in thread "main" java.lang.NullPointerException
>     at org.apache.hadoop.hbase.KeyValue.keyToString(KeyValue.java:716)
>     at org.apache.hadoop.hbase.io.hfile.AbstractHFileReader.toStringFirstKey(AbstractHFileReader.java:138)
>     at org.apache.hadoop.hbase.io.hfile.AbstractHFileReader.toString(AbstractHFileReader.java:149)
>     at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.printMeta(HFilePrettyPrinter.java:325)
>     at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:241)
>     at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:196)
>     at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:756)
>
> Because the header is not correct.
>
> org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file hdfs://node3:9000/hbase/work_proposed/db83e64f34a5a608335818321f1a6c32/.oldlogs/hlog.1377344531526
>     at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:551)
>     at org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:595)
>     at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:642)
>     at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:217)
>     at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:196)
>     at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:756)
> Caused by: java.lang.IllegalArgumentException: Invalid HFile version: 16275367 (expected to be between 1 and 2)
>     at org.apache.hadoop.hbase.io.hfile.HFile.checkFormatVersion(HFile.java:771)
>     at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:323)
>     at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:549)
>     ... 5 more
>
> HBCK doesn't detect that:
> 0 inconsistencies detected.
> Status: OK
>
> Also, I have an issue with the first key / last key reported by the stores
> overlapping when the ones reported by META are not.
>
> Store:
> firstKey=\xF5\x9A\xEA&\x00\x00\x00\x00...
> lastKey=\xFF\xFF\xFF\xFE\x00\x00\x00\x00...
> Meta:
> firstKey=\xF5\x9A\xEA&\x00\x00\x00\x00
> lastKey=\xF5\x9B@}\x00\x00\x00\x00...
>
> So, a few things.
>
> 1) We should add something to HBCK to check the HFile format against
> corruption.
> 2) We should add something to HBCK to validate META region boundaries
> against store files.
> 3) How can I repair my HFile? ;)
>
> I'm already working on #2 and will have something ready soon. Then I will
> most probably move to #1. But I only have detection done for now. I'm not
> sure exactly what the correct steps to repair are...
>
> JM

-- 
Kevin O'Dell
Systems Engineer, Cloudera
