Which version of Hadoop? If you get a data-center-wide power outage you can lose data.
In Hadoop 1.1.1 or later you can force a sync on block close, so you at least won't lose any old data (i.e. HFiles that were recently written due to compactions). I have blogged about that here: http://hadoop-hbase.blogspot.com/2013/07/protected-hbase-against-data-center.html

-- Lars

________________________________
From: 宾莉金 <[email protected]>
To: [email protected]
Sent: Tuesday, January 14, 2014 7:36 PM
Subject: Corrupt HFile

We use hbase-0.94.10 and encountered a corrupt HFile:

2014-01-11 23:24:16,547 DEBUG org.apache.hadoop.hbase.util.FSUtils: Creating file=hdfs://dump002002.cm6:9000/hbase-0.90/cbu2/735414b148ed70e79f4c0406963bb0c9/.tmp/8a4869aafeae43ee8294bf7b65b92e63 with permission=rwxrwxrwx
2014-01-11 23:24:16,550 INFO org.apache.hadoop.hbase.regionserver.StoreFile: Bloom filter type for hdfs://dump002002.cm6:9000/hbase-0.90/cbu2/735414b148ed70e79f4c0406963bb0c9/.tmp/8a4869aafeae43ee8294bf7b65b92e63: ROW, CompoundBloomFilterWriter
2014-01-11 23:24:16,550 INFO org.apache.hadoop.hbase.regionserver.StoreFile: Delete Family Bloom filter type for hdfs://dump002002.cm6:9000/hbase-0.90/cbu2/735414b148ed70e79f4c0406963bb0c9/.tmp/8a4869aafeae43ee8294bf7b65b92e63: CompoundBloomFilterWriter
2014-01-11 23:25:29,769 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming compacted file at hdfs://dump002002.cm6:9000/hbase-0.90/cbu2/735414b148ed70e79f4c0406963bb0c9/.tmp/8a4869aafeae43ee8294bf7b65b92e63 to hdfs://dump002002.cm6:9000/hbase-0.90/cbu2/735414b148ed70e79f4c0406963bb0c9/cf/8a4869aafeae43ee8294bf7b65b92e63
2014-01-11 23:25:29,914 INFO org.apache.hadoop.hbase.regionserver.Store: Completed major compaction of 4 file(s) in cf of cbu2,/0614,1386589566547.735414b148ed70e79f4c0406963bb0c9. into 8a4869aafeae43ee8294bf7b65b92e63, size=883.6 M; total size for store is 884.5 M
2014-01-15 00:15:54,412 WARN org.apache.hadoop.hbase.io.hfile.HFile: File hdfs://dump002002.cm6:9000/hbase-0.90/cbu2/735414b148ed70e79f4c0406963bb0c9/cf/8a4869aafeae43ee8294bf7b65b92e63 Stored checksum value of -1392402101 at offset 64905 does not match computed checksum 242524898, total data size 64950 Checksum data range offset 16384 len 16384
Header dump: magic: 4918304907327195946 blockType DATA compressedBlockSizeNoHeader 64884 uncompressedBlockSizeNoHeader 262185 prevBlockOffset 71067534 checksumType CRC32 bytesPerChecksum 16384 onDiskDataSizeWithHeader 64901
2014-01-15 00:15:54,412 WARN org.apache.hadoop.hbase.io.hfile.HFile: HBase checksum verification failed for file hdfs://dump002002.cm6:9000/hbase-0.90/cbu2/735414b148ed70e79f4c0406963bb0c9/cf/8a4869aafeae43ee8294bf7b65b92e63 at offset 71128929 filesize 926536062. Retrying read with HDFS checksums turned on...
2014-01-15 00:15:54,413 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
java.lang.ArrayIndexOutOfBoundsException
        at com.hadoop.compression.lzo.LzoDecompressor.setInput(LzoDecompressor.java:208)
        at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:97)
        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:83)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
        at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.decompress(HFileBlock.java:1461)
        at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1891)
        at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1734)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:342)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:597)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:695)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:248)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:161)
        at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:265)
        at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:545)
        at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:411)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:143)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3973)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:4045)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3916)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3897)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3940)
        at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4867)
        at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4840)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2253)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3828)
        at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)

--
Best Regards,
lijin bin
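
For reference, the sync-on-block-close behavior Lars mentions above is controlled by the HDFS datanode setting dfs.datanode.synconclose (introduced by HDFS-1539 and available in Hadoop 1.1.1 or later). A minimal hdfs-site.xml sketch, assuming that property name, would look like this on each datanode:

    <property>
      <name>dfs.datanode.synconclose</name>
      <!-- fsync block files to disk when a block is closed, so freshly
           written HFiles (e.g. from compactions) survive a data-center-wide
           power loss; defaults to false -->
      <value>true</value>
    </property>

The setting takes effect after the datanodes are restarted, and it trades some extra latency on block close for durability. The suspect file itself could also be inspected with the HFile pretty-printer, e.g. something along the lines of "hbase org.apache.hadoop.hbase.io.hfile.HFile -v -f <path-to-hfile>", to check whether the corruption reproduces outside the regionserver.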
