Re: pglz compression performance, take two

Andrey Borodin Sun, 05 Feb 2023 18:00:44 -0800

On Sun, Feb 5, 2023 at 5:51 PM Tomas Vondra
<[email protected]> wrote:
>
> On 2/5/23 19:36, Andrey Borodin wrote:
> > On Fri, Jan 6, 2023 at 10:02 PM Andrey Borodin <[email protected]> 
> > wrote:
> >>
> >> Hello! Please find attached v8.
> >
> > I got some interesting feedback from some patch users.
> > There was an oversight that frequently yielded results that are 1,2 or
> > 3 bytes longer than expected.
> > Looking closer I found that the correctness of the last 3-byte tail is
> > checked in two places. PFA fix for this. Previously compressed data
> > was correct, however in some cases few bytes longer than the result of
> > current pglz implementation.
> >
>
> Thanks. What were the consequences of the issue? Lower compression
> ratio, or did we then fail to decompress the data (or would current pglz
> implementation fail to decompress it)?
>
The data was decompressed fine. But extension tests (Citus's columnar
engine) hard-coded a lot of compression ratio stuff.
And there is still 1 more test where optimized version produces 1 byte
longer output. I'm trying to find it, but with no success yet.


There are known and documented cases when optimized pglz version would
do so. good_match without 10-division and memcmp by 4 bytes. But even
disabling this, still observing 1-byte longer compression results
persists... The problem is the length is changed after deleting some
data, so compression of that particular sequence seems to be somewhere
far away.
It was funny at the beginning - to hunt for 1 byte. But the weekend is
ending, and it seems that byte slipped from me again...

Best regards, Andrey Borodin.

Re: pglz compression performance, take two

Reply via email to