Do you observe any difference (configuration, load, etc.) between the two servers? If you turn off hbase.regionserver.checksum.verify, does the problem still show up?
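[Editor's note: to try the suggestion above, the property can be overridden in hbase-site.xml on the region servers and the servers restarted. A minimal, illustrative fragment; with HBase-level verification off, reads fall back to HDFS-level checksum verification.]

```xml
<!-- hbase-site.xml: disable HBase-level (HFile) checksum verification
     as a diagnostic step; HDFS checksums still protect the data. -->
<property>
  <name>hbase.regionserver.checksum.verify</name>
  <value>false</value>
</property>
```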
Thanks

On Oct 12, 2014, at 6:16 AM, Steve Fan <[email protected]> wrote:

> I tried another approach: if I move the problematic region to another
> server, everything is fine and I can read that region again. The only
> remaining issue is that the region gets reassigned to the original
> server and becomes unreadable again.
>
> On Sun, Oct 12, 2014 at 7:45 PM, Steve Fan <[email protected]> wrote:
>
>> Any comment on this issue? The HFile checksum seems to be a recently
>> added feature. I tried deleting the region causing the problem and
>> inserting the data again, but this time another region had a
>> compaction failure.
>>
>> In my case, each row has one column family and each column is ~128 KB.
>> I use the Python Thrift API to insert data, with the mutation batch
>> size set to 100. The entire table has ~4 million rows, and I have 3
>> region servers and 1 master.
>>
>> On Fri, Oct 10, 2014 at 9:55 AM, Steve Fan <[email protected]> wrote:
>>
>>> hbase.regionserver.checksum.verify = true
>>> hbase.hstore.checksum.algorithm = CRC32
>>>
>>> On Thu, Oct 9, 2014 at 10:06 PM, Ted Yu <[email protected]> wrote:
>>>
>>>> What are the values for the following configs?
>>>>
>>>> hbase.regionserver.checksum.verify
>>>> hbase.hstore.checksum.algorithm
>>>>
>>>> Cheers
>>>>
>>>> On Thu, Oct 9, 2014 at 6:29 AM, Steve Fan <[email protected]> wrote:
>>>>
>>>>> I'm getting a compaction failure for a region after heavy writes.
>>>>>
>>>>> Running hbase hbck -details reports that everything is OK.
>>>>>
>>>>> I'm running 0.98.1-cdh5.1.0.
>>>>>
>>>>> ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction failed
>>>>> Request = regionName=table,rowkey,1409072707535.b3481b3baef0fdc711b178caf6a6072a.,
>>>>> storeName=data, fileCount=3, fileSize=474.5 M (129.4 M, 214.8 M, 130.3 M),
>>>>> priority=1, time=8587702075256007
>>>>> java.lang.IndexOutOfBoundsException
>>>>>     at java.nio.ByteBuffer.wrap(ByteBuffer.java:371)
>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileBlock.getBufferReadOnly(HFileBlock.java:343)
>>>>>     at org.apache.hadoop.hbase.io.hfile.ChecksumUtil.validateBlockChecksum(ChecksumUtil.java:150)
>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.validateBlockChecksum(HFileBlock.java:1573)
>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1509)
>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1314)
>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:355)
>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:605)
>>>>>     at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:719)
>>>>>     at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:136)
>>>>>     at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:108)
>>>>>     at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:507)
>>>>>     at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:217)
>>>>>     at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:76)
>>>>>     at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:109)
>>>>>     at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1080)
>>>>>     at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1482)
>>>>>     at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:475)
>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>     at java.lang.Thread.run(Thread.java:745)
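[Editor's note: the write pattern described in the thread (Python Thrift API, one column family, mutation batches of 100) can be sketched as below. This is a hypothetical illustration of the batching logic only: `batched_puts` and `send_batch` are made-up names standing in for the poster's actual client code and the Thrift `mutateRows` call, not part of any HBase API.]

```python
def batched_puts(rows, send_batch, batch_size=100):
    """Group (row_key, {column: value}) pairs into fixed-size batches
    and hand each batch to send_batch (e.g. a Thrift mutateRows wrapper).
    Returns the total number of rows sent."""
    batch = []
    sent = 0
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            send_batch(list(batch))  # ship a full batch of 100 mutations
            sent += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        send_batch(list(batch))
        sent += len(batch)
    return sent

# Example: collect batches locally instead of sending over Thrift.
batches = []
n = batched_puts(
    ((f"row{i:07d}", {"data:col": b"x" * 4}) for i in range(250)),
    batches.append,
)
# 250 rows split into batches of 100, 100, 50
```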

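[Editor's note: for background on the failing code path, HBase-level HFile checksumming conceptually works as sketched below: at write time a checksum (here CRC32, per hbase.hstore.checksum.algorithm) is computed over each fixed-size chunk of a block, and the reader recomputes and compares them, which is what ChecksumUtil.validateBlockChecksum in the trace is doing when it fails. This is a simplified, hypothetical model, not HBase's actual implementation; the chunk size is illustrative.]

```python
import zlib

CHUNK = 512  # bytes covered per checksum (illustrative value)

def checksums_for(data: bytes) -> list[int]:
    """CRC32 per fixed-size chunk, as computed at block write time."""
    return [zlib.crc32(data[i:i + CHUNK]) for i in range(0, len(data), CHUNK)]

def validate_block(data: bytes, stored: list[int]) -> bool:
    """Recompute the per-chunk CRC32s at read time and compare."""
    return checksums_for(data) == stored

block = b"some hfile block payload " * 100
stored = checksums_for(block)          # persisted alongside the block
ok = validate_block(block, stored)     # intact data verifies

corrupted = bytearray(block)
corrupted[10] ^= 0xFF                  # flip one byte in the first chunk
bad = validate_block(bytes(corrupted), stored)  # verification now fails
```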