[
https://issues.apache.org/jira/browse/HBASE-27073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689306#comment-17689306
]
Andrew Kyle Purtell edited comment on HBASE-27073 at 2/15/23 6:51 PM:
----------------------------------------------------------------------
bq. Why they all failed at position 65536...
Yes, it is suspicious, and possibly a buffering issue leading to a short read
when a buffer becomes full. My comment above may be relevant:
{quote}
When working on WAL value compression a while back I remember *the first
version used a temporary growable buffer (a ByteArrayOutputStream if I recall
correctly) to collect all encrypted bytes of the value before submitting the
payload to the codec*. Later in code review Bharath and I went back and forth a
bit on a trick with input streams to reduce the number of copies. To fix this I
would go back to the earlier approach.
{quote}
When I said "encrypted" I meant "compressed", sorry about that.
This may be happening outside of value compression. Turn off value compression
and leave only the base WAL compression enabled and see if it still reproduces.
However, the underlying cause would be the same if my theory is correct. First
we read a length indicating the number of compressed bytes to read, then we
read until that many bytes have been collected, and only then can we submit
them for decompression. If the input stream can return from a single read call
before the full number of bytes has been read, we need an intermediate buffer
to collect the compressed bytes over multiple reads from the input stream.
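
To illustrate the theory, here is a minimal sketch of the intermediate buffer
idea. It is not the HBase code: the class CompressedReadSketch, the helper
readCompressed, and the ChunkedInputStream wrapper are hypothetical names, and
the 65536 byte cap per read is an assumption chosen only to mirror the failing
position. The wrapper simulates a stream that returns short reads; the helper
keeps reading until the full length has been collected.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class CompressedReadSketch {

  /** Hypothetical stand-in for a stream that returns at most 'chunk' bytes per read call. */
  static final class ChunkedInputStream extends InputStream {
    private final InputStream in;
    private final int chunk;

    ChunkedInputStream(InputStream in, int chunk) {
      this.in = in;
      this.chunk = chunk;
    }

    @Override
    public int read() throws IOException {
      return in.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
      // Cap the request so a large read comes back short.
      return in.read(b, off, Math.min(len, chunk));
    }
  }

  /** Collects exactly 'length' bytes over as many read calls as needed. */
  static byte[] readCompressed(InputStream in, int length) throws IOException {
    byte[] buf = new byte[length];
    int off = 0;
    while (off < length) {
      int n = in.read(buf, off, length - off);
      if (n < 0) {
        throw new EOFException("Expected " + length + " bytes, got " + off);
      }
      off += n;
    }
    return buf;
  }

  public static void main(String[] args) throws IOException {
    byte[] payload = new byte[100_000];
    for (int i = 0; i < payload.length; i++) {
      payload[i] = (byte) i;
    }

    // A single read call stops at the cap, leaving the rest unread.
    InputStream once = new ChunkedInputStream(new ByteArrayInputStream(payload), 65536);
    int n = once.read(new byte[payload.length], 0, payload.length);
    System.out.println("single read returned " + n + " of " + payload.length);

    // Looping collects the whole payload before it would go to the codec.
    InputStream looped = new ChunkedInputStream(new ByteArrayInputStream(payload), 65536);
    byte[] full = readCompressed(looped, payload.length);
    System.out.println("looped read returned " + full.length);
  }
}
```

The JDK's DataInputStream.readFully does the same looping, so a real fix might
use that instead of a hand-written loop. The sketch only shows why a single
read call is not enough when the stream can return early.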
> TestReplicationValueCompressedWAL.testMultiplePuts is flaky
> -----------------------------------------------------------
>
> Key: HBASE-27073
> URL: https://issues.apache.org/jira/browse/HBASE-27073
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.5.0
> Environment: Java version: 1.8.0_322
> OS name: "linux", version: "5.10.0-13-arm64", arch: "aarch64", family: "unix"
> Reporter: Andrew Kyle Purtell
> Priority: Minor
> Fix For: 2.6.0, 3.0.0-alpha-4, 2.5.4
>
>
> org.apache.hadoop.hbase.replication.regionserver.TestReplicationValueCompressedWAL.testMultiplePuts
> Run 1: TestReplicationValueCompressedWAL.testMultiplePuts:56 Waited too much time for replication
> Run 2: PASS