Yes, I ran FSCK and HBCK and both report healthy. I just finished the check part, and I found 10 regions out of 437 where the META boundaries do not match the store file boundaries.
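The boundary comparison described above can be sketched roughly like this. The key values are the Store/Meta pairs quoted further down in this thread; in practice the store keys would come from the HFile tool's `-m` output and the META keys from a scan of `.META.`, but here they are hard-coded for illustration, and `check_boundary` is a hypothetical helper, not part of any HBase tool:

```shell
# Keys as printed in this thread: store boundaries vs. what META claims.
# (Hard-coded here; normally parsed from the HFile tool and a .META. scan.)
store_first='\xF5\x9A\xEA&\x00\x00\x00\x00'
store_last='\xFF\xFF\xFF\xFE\x00\x00\x00\x00'
meta_first='\xF5\x9A\xEA&\x00\x00\x00\x00'
meta_last='\xF5\x9B@}\x00\x00\x00\x00'

# Hypothetical helper: report whether a store-side key matches the META-side key.
# usage: check_boundary <name> <store_key> <meta_key>
check_boundary() {
  if [ "$2" = "$3" ]; then
    printf '%s\n' "$1: OK"
  else
    printf '%s\n' "$1: MISMATCH (store=$2 meta=$3)"
  fi
}

check_boundary firstKey "$store_first" "$meta_first"
check_boundary lastKey  "$store_last"  "$meta_last"
```

With the values from this thread, firstKey matches and lastKey does not, which is exactly the kind of silent inconsistency being reported.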
I agree that checking ALL the HFiles might take a while, but I think we should provide an option to do that. Someone might want to run it periodically (once a week? once a day?) to validate that everything is clean. It takes only a few seconds to run in my case.

So there are two things: HFiles being corrupted (and I will take a look at Matteo's option) and boundaries being wrong. For the boundaries, I guess one solution is to "merge" the overlapping regions into a single one, compact it, and re-split it? What's scary here is that everything is reported as healthy and green, but in the end it's not.

JM

2013/8/26 Kevin O'dell <[email protected]>

> This is a good one. We have discussed checking the HFiles, but a new tool
> would have to be written. Running the current HFile tool, which would step
> through 1000s of store files, would add quite a bit of overhead. You could
> script the HFile tool to find the HFile that is corrupt; I suspect you
> have an idea of the offending region. Other than what Matteo posted, I am
> unaware of another way to fix it. I wonder if all three copies are messed
> up; did you run a fsck?
>
>
> On Mon, Aug 26, 2013 at 3:31 PM, Jean-Marc Spaggiari <
> [email protected]> wrote:
>
> > Hi,
> >
> > Don't ask me how, but I have one table in a pretty strange state.
> >
> > First, it seems that I have (at least) one corrupted HFile.
> >
> > FirstKey returns null. (Same for StopKey, since the header is corrupted.)
> >
> > Exception in thread "main" java.lang.NullPointerException
> >     at org.apache.hadoop.hbase.KeyValue.keyToString(KeyValue.java:716)
> >     at org.apache.hadoop.hbase.io.hfile.AbstractHFileReader.toStringFirstKey(AbstractHFileReader.java:138)
> >     at org.apache.hadoop.hbase.io.hfile.AbstractHFileReader.toString(AbstractHFileReader.java:149)
> >     at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.printMeta(HFilePrettyPrinter.java:325)
> >     at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:241)
> >     at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:196)
> >     at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:756)
> >
> > Because the header is not correct:
> >
> > org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading
> > HFile Trailer from file
> > hdfs://node3:9000/hbase/work_proposed/db83e64f34a5a608335818321f1a6c32/.oldlogs/hlog.1377344531526
> >     at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:551)
> >     at org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:595)
> >     at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:642)
> >     at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:217)
> >     at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:196)
> >     at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:756)
> > Caused by: java.lang.IllegalArgumentException: Invalid HFile version:
> > 16275367 (expected to be between 1 and 2)
> >     at org.apache.hadoop.hbase.io.hfile.HFile.checkFormatVersion(HFile.java:771)
> >     at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:323)
> >     at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:549)
> >     ... 5 more
> >
> > HBCK doesn't detect that:
> >
> > 0 inconsistencies detected.
> > Status: OK
> >
> > Also, I have an issue with the first key / last key reported by the stores
> > overlapping when the ones reported by META are not.
> >
> > Store:
> > firstKey=\xF5\x9A\xEA&\x00\x00\x00\x00...
> > lastKey=\xFF\xFF\xFF\xFE\x00\x00\x00\x00...
> > Meta:
> > firstKey=\xF5\x9A\xEA&\x00\x00\x00\x00
> > lastKey=\xF5\x9B@}\x00\x00\x00\x00...
> >
> > So, a few things:
> >
> > 1) We should add something into HBCK to check the HFile format against
> > corruption.
> > 2) We should add something into HBCK to validate META region boundaries
> > against the store files.
> > 3) How can I repair my HFile? ;)
> >
> > I'm already working on #2 and will have something ready soon. Then I will
> > most probably move to #1. But I only have detection done for now. I'm not
> > sure exactly what the correct steps to repair are...
> >
> > JM
>
>
> --
> Kevin O'Dell
> Systems Engineer, Cloudera
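Kevin's suggestion of scripting the stock HFile tool to find the corrupt file could look roughly like the sketch below. It walks a table directory on HDFS and flags any file the pretty-printer cannot read. The table path and the skip patterns (`.oldlogs`, `.regioninfo`, `.tmp`) are assumptions about a 0.94-style layout, and `scan_hfiles` is a hypothetical helper name, not an existing tool:

```shell
# Hypothetical sketch: run the stock HFile tool over every file under a
# table directory and report the ones it fails to read (likely corrupt).
scan_hfiles() {
  # -lsr is the older recursive-listing flag; newer Hadoop uses -ls -R.
  hadoop fs -lsr "$1" | grep '^-' | awk '{print $NF}' |
  while read -r f; do
    # Skip non-store files in an 0.94-style region layout (assumption).
    case "$f" in
      */.oldlogs/*|*/.regioninfo|*/.tmp/*) continue ;;
    esac
    if ! hbase org.apache.hadoop.hbase.io.hfile.HFile -f "$f" -m >/dev/null 2>&1; then
      echo "CORRUPT? $f"
    fi
  done
}

# Only attempt the scan when the Hadoop CLI is actually on the PATH.
if command -v hadoop >/dev/null 2>&1; then
  scan_hfiles /hbase/work_proposed
fi
```

For the overlap side of the problem, later hbck builds grew repair options (for example `-fixHdfsOverlaps`, which effectively performs the merge JM describes); check `hbase hbck -h` on your version to see what it actually supports before relying on any of them.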
