[ 
https://issues.apache.org/jira/browse/HBASE-27073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17687979#comment-17687979
 ] 

Duo Zhang commented on HBASE-27073:
-----------------------------------

OK, very easy to reproduce...

{code}
  public static void main(String[] args) throws Exception {
    CompressionContext ctx = new CompressionContext(LRUDictionary.class, false, false, true,
      Compression.Algorithm.GZ);
    ValueCompressor compressor = ctx.getValueCompressor();
    byte[] compressed = compressor.compress(new byte[0], 0, 0);
    System.out.println("compressed length: " + compressed.length);
    ByteArrayInputStream bis = new ByteArrayInputStream(compressed);
    int read = compressor.decompress(bis, compressed.length, new byte[0], 0, 0);
    System.out.println("read length: " + read);
    System.out.println("position: " + (compressed.length - bis.available()));
  }
{code}

The output is
{noformat}
compressed length: 20
read length: 0
position: 0
{noformat}

So decompress consumes none of the 20 compressed bytes when the value is empty,
and the next read will start from the wrong position...

I tried a simple fix, to manually skip the bytes after calling 
compressedIn.read in CompressionContext.ValueCompressor.decompress, then I can 
read the 'broken' WAL file provided by [~Xiaolin Ha] successfully.
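
For illustration only, here is a minimal sketch of the "skip the remaining 
bytes" idea, built on java.util.zip rather than the HBase classes. The names 
SkipAfterEmptyValue, BoundedInputStream and skipRemaining are hypothetical, and 
the sketch creates a new inflater per value, which CompressionContext may not 
do, so it is an analogy and not the actual patch. It relies on 
InflaterInputStream.read returning 0 without touching the underlying stream 
when the requested length is 0.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

class SkipAfterEmptyValue {

  /** Hypothetical helper: exposes at most 'limit' bytes of the delegate stream. */
  static final class BoundedInputStream extends InputStream {
    private final InputStream delegate;
    private long remaining;

    BoundedInputStream(InputStream delegate, long limit) {
      this.delegate = delegate;
      this.remaining = limit;
    }

    @Override
    public int read() throws IOException {
      if (remaining <= 0) return -1;
      int b = delegate.read();
      if (b >= 0) remaining--;
      return b;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
      if (len == 0) return 0;
      if (remaining <= 0) return -1;
      int n = delegate.read(b, off, (int) Math.min(len, remaining));
      if (n > 0) remaining -= n;
      return n;
    }

    /** Skips the part of the bounded region that nobody has consumed yet. */
    void skipRemaining() throws IOException {
      while (remaining > 0) {
        long skipped = delegate.skip(remaining);
        if (skipped <= 0) break; // delegate cannot skip any further
        remaining -= skipped;
      }
    }
    // close() is the inherited no-op, so the caller's stream stays open.
  }

  /** Stand-in for the value compressor: plain zlib/deflate. */
  static byte[] compress(byte[] data) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (DeflaterOutputStream out = new DeflaterOutputStream(bos)) {
      out.write(data);
    }
    return bos.toByteArray();
  }

  /** Reads one value, then always advances 'in' by exactly compressedLength bytes. */
  static int decompress(InputStream in, int compressedLength, byte[] out, int off, int len)
    throws IOException {
    BoundedInputStream bounded = new BoundedInputStream(in, compressedLength);
    try (InflaterInputStream compressedIn = new InflaterInputStream(bounded)) {
      int read = 0;
      while (read < len) {
        int n = compressedIn.read(out, off + read, len - read);
        if (n < 0) break;
        read += n;
      }
      // For len == 0 nothing was consumed above; skip the compressed bytes manually.
      bounded.skipRemaining();
      return read;
    }
  }

  public static void main(String[] args) throws IOException {
    byte[] empty = compress(new byte[0]);
    byte[] next = compress("next".getBytes(StandardCharsets.UTF_8));
    ByteArrayOutputStream wal = new ByteArrayOutputStream();
    wal.write(empty);
    wal.write(next);
    ByteArrayInputStream in = new ByteArrayInputStream(wal.toByteArray());

    int read = decompress(in, empty.length, new byte[0], 0, 0);
    System.out.println("read length: " + read);
    System.out.println("position: " + (wal.size() - in.available())
      + ", compressed length of empty value: " + empty.length);

    byte[] value = new byte[4];
    decompress(in, next.length, value, 0, value.length);
    System.out.println("next value: " + new String(value, StandardCharsets.UTF_8));
  }
}
```

With the skipRemaining call in place, the position after the empty value equals 
its compressed length, so the following value decodes from the right offset. 
Removing that call leaves the position at 0, which mirrors the behaviour in the 
reproduction above.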

I'm not sure whether this would break the compressor's state, and I'm also not 
sure whether this is the root cause of the flakiness of 
TestReplicationValueCompressedWAL.testMultiplePuts, as I do not think we write 
an empty value in this test.

Anyway, let me file a new issue to address the specific problem and see if it 
could fix the flakiness here.

Thanks.

> TestReplicationValueCompressedWAL.testMultiplePuts is flaky
> -----------------------------------------------------------
>
>                 Key: HBASE-27073
>                 URL: https://issues.apache.org/jira/browse/HBASE-27073
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.5.0
>         Environment: Java version: 1.8.0_322
> OS name: "linux", version: "5.10.0-13-arm64", arch: "aarch64", family: "unix"
>            Reporter: Andrew Kyle Purtell
>            Priority: Minor
>             Fix For: 2.6.0, 3.0.0-alpha-4, 2.5.4
>
>
> org.apache.hadoop.hbase.replication.regionserver.TestReplicationValueCompressedWAL.testMultiplePuts
>   Run 1: TestReplicationValueCompressedWAL.testMultiplePuts:56 Waited too much time for replication
>   Run 2: PASS



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
