Nick Dimiduk created HBASE-29131:
------------------------------------
Summary: Introduce the option for post-compaction validation of
HFiles
Key: HBASE-29131
URL: https://issues.apache.org/jira/browse/HBASE-29131
Project: HBase
Issue Type: New Feature
Components: Compaction
Reporter: Nick Dimiduk
After enabling zstd compression on a table, we experienced an incident where
data files produced during compaction were subsequently not readable. The
client saw an exception, the call stack for which is below.
Recovering from this incident was quite painful, and we'd like to avoid
repeating the experience.
The idea is that the region server should read back the files it writes out at
the end of the compaction. This extra precaution would function as a last-ditch
verification that the output file is wholesome before committing the
compaction. If the file is not readable, the compaction would be aborted and
we'd try again the next time a compaction is requested.
{noformat}
... INFO o.a.h.h.c.AsyncRequestFutureImpl - id=1, table=XXX, attempt=4/13, failureCount=250ops, last exception=java.io.IOException: java.io.IOException: Src size is incorrect
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:504)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
    at org.apache.hadoop.hbase.ipc.CallRunnerWithContext.run(CallRunnerWithContext.java:103)
    at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102)
    at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82)
Caused by: com.github.luben.zstd.ZstdException: Src size is incorrect
    at com.github.luben.zstd.ZstdDecompressCtx.decompressDirectByteBuffer(ZstdDecompressCtx.java:172)
    at com.github.luben.zstd.ZstdDecompressCtx.decompress(ZstdDecompressCtx.java:241)
    at org.apache.hadoop.hbase.io.compress.zstd.ZstdDecompressor.decompress(ZstdDecompressor.java:73)
    at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:88)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105)
    at java.base/java.io.BufferedInputStream.read1(BufferedInputStream.java:345)
    at java.base/java.io.BufferedInputStream.implRead(BufferedInputStream.java:420)
    at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:399)
    at org.apache.hadoop.hbase.io.util.BlockIOUtils.readFullyWithHeapBuffer(BlockIOUtils.java:151)
    at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultDecodingContext.prepareDecoding(HFileBlockDefaultDecodingContext.java:104)
    at org.apache.hadoop.hbase.io.hfile.HFileBlock.unpack(HFileBlock.java:644)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1353)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1252)
    at org.apache.hadoop.hbase.io.hfile.NoOpIndexBlockEncoder$NoOpEncodedSeeker.loadDataBlockWithScanInfo(NoOpIndexBlockEncoder.java:346)
    at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReaderV2.loadDataBlockWithScanInfo(HFileBlockIndex.java:514)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:674)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:659)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:338)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:256)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:457)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:357)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:302)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:266)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:1102)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:1093)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.seekOrSkipToNextColumn(StoreScanner.java:833)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:734)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:151)
    at org.apache.hadoop.hbase.regionserver.RegionScannerImpl.populateResult(RegionScannerImpl.java:339)
    at org.apache.hadoop.hbase.regionserver.RegionScannerImpl.nextInternal(RegionScannerImpl.java:511)
    at org.apache.hadoop.hbase.regionserver.RegionScannerImpl.nextRaw(RegionScannerImpl.java:275)
    at org.apache.hadoop.hbase.regionserver.RegionScannerImpl.next(RegionScannerImpl.java:253)
    at org.apache.hadoop.hbase.regionserver.RegionScannerImpl.next(RegionScannerImpl.java:240)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2652)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:859)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2892)
    at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45961)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:437)
    ... 4 more
{noformat}
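The validate-then-commit flow proposed above could be sketched roughly as follows. This is a minimal illustration, not HBase code: {{PostCompactionValidation}}, {{isReadable}}, and {{commitIfValid}} are hypothetical names, and GZIP stands in for the zstd codec so the example is self-contained; the real implementation would read the temporary HFile back through the store's configured decompressor before the rename that commits the compaction.

{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class PostCompactionValidation {

    // Read the whole output file back through the decompressor. Corruption of
    // the kind seen in the incident surfaces as an IOException here, at
    // compaction time, instead of at query time.
    static boolean isReadable(Path file) {
        try (InputStream in = new GZIPInputStream(Files.newInputStream(file))) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // discard the bytes; we only care that the full read succeeds
            }
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Commit the compaction output only if the read-back check passes;
    // otherwise delete the bad file so the inputs stay in place and the next
    // requested compaction can retry.
    static boolean commitIfValid(Path tmp, Path dest) throws IOException {
        if (!isReadable(tmp)) {
            Files.deleteIfExists(tmp); // abort the compaction
            return false;
        }
        Files.move(tmp, dest, StandardCopyOption.ATOMIC_MOVE);
        return true;
    }

    public static void main(String[] args) throws IOException {
        // A well-formed output file passes validation and is committed.
        Path tmp = Files.createTempFile("compaction", ".tmp");
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(tmp))) {
            out.write("compacted cells".getBytes("UTF-8"));
        }
        Path dest = tmp.resolveSibling("store.out");
        System.out.println("committed=" + commitIfValid(tmp, dest)); // committed=true

        // A corrupt output file (bogus gzip header) fails and is aborted.
        Path bad = Files.createTempFile("compaction", ".tmp");
        Files.write(bad, new byte[] { 0x1f, (byte) 0x8b, 0x00 });
        System.out.println("committed=" + commitIfValid(bad, bad.resolveSibling("bad.out"))); // committed=false

        Files.deleteIfExists(dest);
    }
}
{code}

The key design point is that the check happens before the rename/commit step, so a failed validation leaves the original store files untouched and the compaction simply retries later.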
--
This message was sent by Atlassian Jira
(v8.20.10#820010)