[
https://issues.apache.org/jira/browse/HBASE-27073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17687979#comment-17687979
]
Duo Zhang commented on HBASE-27073:
-----------------------------------
OK, very easy to reproduce...
{code}
public static void main(String[] args) throws Exception {
  // Value compression enabled, GZ codec
  CompressionContext ctx =
    new CompressionContext(LRUDictionary.class, false, false, true,
      Compression.Algorithm.GZ);
  ValueCompressor compressor = ctx.getValueCompressor();
  // Compress an empty value
  byte[] compressed = compressor.compress(new byte[0], 0, 0);
  System.out.println("compressed length: " + compressed.length);
  ByteArrayInputStream bis = new ByteArrayInputStream(compressed);
  // Decompress it again, expecting zero output bytes
  int read = compressor.decompress(bis, compressed.length, new byte[0], 0, 0);
  System.out.println("read length: " + read);
  // How many compressed bytes were actually consumed from the stream
  System.out.println("position: " + (compressed.length - bis.available()));
}
{code}
The output is
{noformat}
compressed length: 20
read length: 0
position: 0
{noformat}
So decompress consumes none of the 20 compressed bytes, and we will read from
the wrong position after an empty value...
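The same effect can be shown outside HBase. The following is only a minimal sketch using java.util.zip, not the HBase code path: the class and method names are made up for illustration, and it assumes the HBase decompression stream behaves like any InputStream for a zero-length read (returns 0 without touching the underlying stream).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical demo class, not part of HBase.
class ZeroLengthReadDemo {
  /** Deflates the given bytes; even an empty input yields a non-empty frame. */
  static byte[] deflate(byte[] data) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (DeflaterOutputStream out = new DeflaterOutputStream(bos)) {
      out.write(data);
    }
    return bos.toByteArray();
  }

  /** Returns how many compressed bytes a read of outLength bytes consumed. */
  static int consumedAfterRead(byte[] compressed, int outLength) throws IOException {
    ByteArrayInputStream bis = new ByteArrayInputStream(compressed);
    try (InflaterInputStream in = new InflaterInputStream(bis)) {
      // With outLength == 0 this returns 0 and never pulls from bis.
      in.read(new byte[outLength], 0, outLength);
      return compressed.length - bis.available();
    }
  }

  public static void main(String[] args) throws IOException {
    byte[] compressed = deflate(new byte[0]);
    System.out.println("compressed length: " + compressed.length);
    System.out.println("position: " + consumedAfterRead(compressed, 0));
  }
}
```

The compressed frame is not empty, yet the position stays at 0, so whatever follows the frame in a shared stream would be read from the wrong offset.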
I tried a simple fix: manually skipping the remaining bytes after calling
compressedIn.read in CompressionContext.ValueCompressor.decompress. With that
change I can read the 'broken' WAL file provided by [~Xiaolin Ha] successfully.
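A rough sketch of the "skip the leftover bytes" idea, using plain java.io only. This is not the actual patch: skipFully and nextByteAfterSegment are hypothetical helpers, the zero-length read merely stands in for the decompress call, and how the real code would track the consumed byte count inside CompressionContext is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch class, not part of HBase.
class SkipRemainingSketch {
  /** Skips exactly n bytes, or throws EOFException if the stream ends first. */
  static void skipFully(InputStream in, long n) throws IOException {
    long remaining = n;
    while (remaining > 0) {
      long skipped = in.skip(remaining);
      if (skipped > 0) {
        remaining -= skipped;
      } else if (in.read() >= 0) {
        // skip() may return 0 without being at EOF, so fall back to read().
        remaining--;
      } else {
        throw new EOFException(remaining + " bytes left to skip at end of stream");
      }
    }
  }

  /**
   * Treats the first segmentLength bytes as one compressed segment, performs a
   * zero-length read on it, skips whatever was left unconsumed, and returns
   * the first byte of the following segment.
   */
  static int nextByteAfterSegment(byte[] stream, int segmentLength) throws IOException {
    ByteArrayInputStream in = new ByteArrayInputStream(stream);
    int before = in.available();
    in.read(new byte[0], 0, 0); // stand-in for decompressing an empty value
    int consumed = before - in.available();
    skipFully(in, segmentLength - consumed);
    return in.read();
  }

  public static void main(String[] args) throws IOException {
    byte[] stream = {1, 2, 3, 4, 5, 42}; // 5-byte segment, then the next segment
    System.out.println("next byte: " + nextByteAfterSegment(stream, 5));
  }
}
```

After the skip, the stream is positioned at the start of the next segment regardless of how many bytes the read itself consumed.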
I'm not sure whether this would break the compressor's state. I'm also not sure
whether this is the root cause of the flakiness of
TestReplicationValueCompressedWAL.testMultiplePuts, as I do not think we write
any empty values in this test.
Anyway, let me file a new issue to address the specific problem and see if it
could fix the flakiness here.
Thanks.
> TestReplicationValueCompressedWAL.testMultiplePuts is flaky
> -----------------------------------------------------------
>
> Key: HBASE-27073
> URL: https://issues.apache.org/jira/browse/HBASE-27073
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.5.0
> Environment: Java version: 1.8.0_322
> OS name: "linux", version: "5.10.0-13-arm64", arch: "aarch64", family: "unix"
> Reporter: Andrew Kyle Purtell
> Priority: Minor
> Fix For: 2.6.0, 3.0.0-alpha-4, 2.5.4
>
>
> org.apache.hadoop.hbase.replication.regionserver.TestReplicationValueCompressedWAL.testMultiplePuts
> Run 1: TestReplicationValueCompressedWAL.testMultiplePuts:56 Waited too much time for replication
> Run 2: PASS