Nick Dimiduk created HBASE-29131:
------------------------------------

             Summary: Introduce the option for post-compaction validation of HFiles
                 Key: HBASE-29131
                 URL: https://issues.apache.org/jira/browse/HBASE-29131
             Project: HBase
          Issue Type: New Feature
          Components: Compaction
            Reporter: Nick Dimiduk


After enabling zstd compression on a table, we experienced an incident where data files produced during compaction were subsequently not readable. The client saw an exception, the call stack for which is below. Recovering from this incident was quite painful, and we'd like to avoid repeating the experience.

The idea is that the region server should read back the files it writes out at the end of a compaction. This extra precaution would serve as a last-ditch verification that the output file is wholesome before the compaction is committed. If the file is not readable, the compaction would be aborted and retried the next time a compaction is requested.
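A minimal sketch of the proposed read-back guard, using plain java.nio file operations as stand-ins for HBase's HFile writer/reader APIs (the class and method names here are hypothetical, not HBase code): the compaction output is written to a temporary path, fully read back, and only then moved to its final location; an unreadable file aborts the commit so the inputs survive for a retry.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration of the proposal: validate compaction output by
// reading it back in full before the rename that commits the compaction.
public class ReadBackValidation {

    // Returns true only if every byte of the candidate file can be read.
    // In HBase this step would be an HFile scan, which also exercises
    // decompression (where the zstd failure above surfaced).
    static boolean isReadable(Path candidate) {
        try {
            Files.readAllBytes(candidate); // stand-in for scanning the HFile
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Commit the compaction output only after the read-back check passes;
    // otherwise discard it so the next requested compaction retries.
    static boolean commitIfValid(Path tmpOutput, Path finalOutput) throws IOException {
        if (!isReadable(tmpOutput)) {
            Files.deleteIfExists(tmpOutput); // abort: drop the bad output
            return false;
        }
        Files.move(tmpOutput, finalOutput, StandardCopyOption.REPLACE_EXISTING);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("compaction", ".tmp");
        Files.write(tmp, "cell data".getBytes());
        Path dest = tmp.resolveSibling("compaction.out");
        System.out.println(commitIfValid(tmp, dest)); // prints "true"
        Files.deleteIfExists(dest);
    }
}
```

The cost is one extra full read of each compacted file, which is why this would be an opt-in option rather than default behavior.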

{noformat}
... INFO  o.a.h.h.c.AsyncRequestFutureImpl - id=1, table=XXX, attempt=4/13, failureCount=250ops, last exception=java.io.IOException: java.io.IOException: Src size is incorrect
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:504)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
        at org.apache.hadoop.hbase.ipc.CallRunnerWithContext.run(CallRunnerWithContext.java:103)
        at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102)
        at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82)
Caused by: com.github.luben.zstd.ZstdException: Src size is incorrect
        at com.github.luben.zstd.ZstdDecompressCtx.decompressDirectByteBuffer(ZstdDecompressCtx.java:172)
        at com.github.luben.zstd.ZstdDecompressCtx.decompress(ZstdDecompressCtx.java:241)
        at org.apache.hadoop.hbase.io.compress.zstd.ZstdDecompressor.decompress(ZstdDecompressor.java:73)
        at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:88)
        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105)
        at java.base/java.io.BufferedInputStream.read1(BufferedInputStream.java:345)
        at java.base/java.io.BufferedInputStream.implRead(BufferedInputStream.java:420)
        at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:399)
        at org.apache.hadoop.hbase.io.util.BlockIOUtils.readFullyWithHeapBuffer(BlockIOUtils.java:151)
        at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultDecodingContext.prepareDecoding(HFileBlockDefaultDecodingContext.java:104)
        at org.apache.hadoop.hbase.io.hfile.HFileBlock.unpack(HFileBlock.java:644)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1353)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1252)
        at org.apache.hadoop.hbase.io.hfile.NoOpIndexBlockEncoder$NoOpEncodedSeeker.loadDataBlockWithScanInfo(NoOpIndexBlockEncoder.java:346)
        at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReaderV2.loadDataBlockWithScanInfo(HFileBlockIndex.java:514)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:674)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:659)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:338)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:256)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:457)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:357)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:302)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:266)
        at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:1102)
        at org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:1093)
        at org.apache.hadoop.hbase.regionserver.StoreScanner.seekOrSkipToNextColumn(StoreScanner.java:833)
        at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:734)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:151)
        at org.apache.hadoop.hbase.regionserver.RegionScannerImpl.populateResult(RegionScannerImpl.java:339)
        at org.apache.hadoop.hbase.regionserver.RegionScannerImpl.nextInternal(RegionScannerImpl.java:511)
        at org.apache.hadoop.hbase.regionserver.RegionScannerImpl.nextRaw(RegionScannerImpl.java:275)
        at org.apache.hadoop.hbase.regionserver.RegionScannerImpl.next(RegionScannerImpl.java:253)
        at org.apache.hadoop.hbase.regionserver.RegionScannerImpl.next(RegionScannerImpl.java:240)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2652)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:859)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2892)
        at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45961)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:437)
        ... 4 more
{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
