Re: pg_basebackup's --gzip switch misbehaves

2022-11-16 Thread Daniel Gustafsson
> On 16 Nov 2022, at 02:02, Michael Paquier  wrote:
> On Tue, Nov 15, 2022 at 11:09:54AM +0100, Daniel Gustafsson wrote:
>>> On 15 Nov 2022, at 00:58, Michael Paquier  wrote:

>>> It looks like there is a second one in install-windows.sgml.
>> 
>> Not sure I follow. IPC::Run is already linked to with a ulink from that page
>> (albeit with an empty tag rendering the URL instead).
> 
> Ah, I did not notice that there was already a link to that with
> IPC::Run. Anyway, shouldn't CPAN be marked at least as an 
> if we are not going to use a link on it? acronyms.sgml lists it, just
> saying.

Fair enough. I fixed that and applied this to HEAD.

--
Daniel Gustafsson   https://vmware.com/





Re: pg_basebackup's --gzip switch misbehaves

2022-11-15 Thread Michael Paquier
On Tue, Nov 15, 2022 at 11:09:54AM +0100, Daniel Gustafsson wrote:
>> On 15 Nov 2022, at 00:58, Michael Paquier  wrote:
>> 
>> On Mon, Nov 14, 2022 at 03:27:14PM +0100, Daniel Gustafsson wrote:
>>> Ugh, yes, that's what it should say.
>> 
>> A split sounds fine by me.  On top of what Tom has mentioned, I have
>> spotted two small-ish things.
>> 
>> -This module is available from CPAN or an operating system package.
>> +This module is available from
>> +https://metacpan.org/release/IPC-Run;>CPAN
>> +or an operating system package.
>> 
>> It looks like there is a second one in install-windows.sgml.
> 
> Not sure I follow.  IPC::Run is already linked to with a ulink from that page
> (albeit with an empty tag rendering the URL instead).

Ah, I did not notice that there was already a link to that with
IPC::Run.  Anyway, shouldn't CPAN be marked at least as an 
if we are not going to use a link on it?  acronyms.sgml lists it, just
saying.

> A related nitpick I found though is that metacpan has changed their URL
> structure and these links now 301 redirect.  The attached 0001 fixes that 
> first
> before applying the other part.

WFM.
--
Michael


signature.asc
Description: PGP signature


Re: pg_basebackup's --gzip switch misbehaves

2022-11-15 Thread Daniel Gustafsson
> On 15 Nov 2022, at 00:58, Michael Paquier  wrote:
> 
> On Mon, Nov 14, 2022 at 03:27:14PM +0100, Daniel Gustafsson wrote:
>> Ugh, yes, that's what it should say.
> 
> A split sounds fine by me.  On top of what Tom has mentioned, I have
> spotted two small-ish things.
> 
> -This module is available from CPAN or an operating system package.
> +This module is available from
> +https://metacpan.org/release/IPC-Run;>CPAN
> +or an operating system package.
> 
> It looks like there is a second one in install-windows.sgml.

Not sure I follow.  IPC::Run is already linked to with a ulink from that page
(albeit with an empty tag rendering the URL instead).

A related nitpick I found though is that metacpan has changed their URL
structure and these links now 301 redirect.  The attached 0001 fixes that first
before applying the other part.

> +Many operations in the test suites use a 180 second timeout, which on 
> slow
> Nit: s/180 second/180-second/?

Ok.

--
Daniel Gustafsson   https://vmware.com/



v3-0001-doc-update-metacpan.org-links-to-avoid-redirects.patch
Description: Binary data


v3-0002-doc-document-the-TAP-test-environment-variables.patch
Description: Binary data


Re: pg_basebackup's --gzip switch misbehaves

2022-11-14 Thread Michael Paquier
On Mon, Nov 14, 2022 at 03:27:14PM +0100, Daniel Gustafsson wrote:
> Ugh, yes, that's what it should say.

A split sounds fine by me.  On top of what Tom has mentioned, I have
spotted two small-ish things.

-This module is available from CPAN or an operating system package.
+This module is available from
+https://metacpan.org/release/IPC-Run;>CPAN
+or an operating system package.

It looks like there is a second one in install-windows.sgml.

+Many operations in the test suites use a 180 second timeout, which on slow
Nit: s/180 second/180-second/?
--
Michael


signature.asc
Description: PGP signature


Re: pg_basebackup's --gzip switch misbehaves

2022-11-14 Thread Daniel Gustafsson
> On 14 Nov 2022, at 15:23, Tom Lane  wrote:
> 
> Daniel Gustafsson  writes:
>> How about this version?
> 
> This isn't correct shell syntax is it?
> 
> +PG_TEST_NOCLEAN make -C src/bin/pg_dump check
> 
> I think you meant
> 
> +PG_TEST_NOCLEAN=1 make -C src/bin/pg_dump check
> 
> or the like.

Ugh, yes, that's what it should say.

--
Daniel Gustafsson   https://vmware.com/





Re: pg_basebackup's --gzip switch misbehaves

2022-11-14 Thread Tom Lane
Daniel Gustafsson  writes:
> How about this version?

This isn't correct shell syntax is it?

+PG_TEST_NOCLEAN make -C src/bin/pg_dump check

I think you meant

+PG_TEST_NOCLEAN=1 make -C src/bin/pg_dump check

or the like.

regards, tom lane




Re: pg_basebackup's --gzip switch misbehaves

2022-11-14 Thread Daniel Gustafsson
> On 3 Nov 2022, at 12:49, Michael Paquier  wrote:
> 
> On Wed, Nov 02, 2022 at 09:42:12PM +0100, Daniel Gustafsson wrote:
>> I think that's a good idea, not everyone running tests will read the 
>> internals
>> documentation (or even know abou it even).  How about the attached?
> 
> Thanks for the patch.  Perhaps this should be mentioned additionally
> in install-windows.sgml?  I have not tested, but as long as these
> variables are configured with a "set" command in a command prompt,
> they would be passed down to the processes triggered by vcregress.pl
> (see for example TESTLOGDIR and TESTDATADIR).

That's probably a good idea, I've amended the patch with that and also made the
CPAN mention of IPC::Run into a ulink like how it is in the Windows section in
passing.  To avoid duplicating the info in the docs I made it into a sect2
which can be linked to.  How about this version?

--
Daniel Gustafsson   https://vmware.com/



regress_tap-v2.diff
Description: Binary data


Re: pg_basebackup's --gzip switch misbehaves

2022-11-03 Thread Michael Paquier
On Wed, Nov 02, 2022 at 09:42:12PM +0100, Daniel Gustafsson wrote:
> I think that's a good idea, not everyone running tests will read the internals
> documentation (or even know abou it even).  How about the attached?

Thanks for the patch.  Perhaps this should be mentioned additionally
in install-windows.sgml?  I have not tested, but as long as these
variables are configured with a "set" command in a command prompt,
they would be passed down to the processes triggered by vcregress.pl
(see for example TESTLOGDIR and TESTDATADIR).
--
Michael


signature.asc
Description: PGP signature


Re: pg_basebackup's --gzip switch misbehaves

2022-11-02 Thread Daniel Gustafsson
> On 16 Sep 2022, at 04:22, Michael Paquier  wrote:

> By the way, should we document PG_TEST_TIMEOUT_DEFAULT and
> PG_TEST_NOCLEAN not only in src/test/perl/README but also doc/?.  We
> provide something in the docs about PROVE_FLAGS and PROVE_TESTS, for
> example. 

I think that's a good idea, not everyone running tests will read the internals
documentation (or even know abou it even).  How about the attached?

--
Daniel Gustafsson   https://vmware.com/



regress_tap.diff
Description: Binary data


Re: pg_basebackup's --gzip switch misbehaves

2022-09-22 Thread Michael Paquier
On Thu, Sep 22, 2022 at 12:47:34AM -0400, Tom Lane wrote:
> Sure.  I'd say we have 48 hours to choose whether to put this in v15.
> But not more than that.

I have a window to be able to look at the buildfarm today, tomorrow
being harder, so I have adjusted that now on both HEAD and
REL_15_STABLE for consistency.
--
Michael


signature.asc
Description: PGP signature


Re: pg_basebackup's --gzip switch misbehaves

2022-09-21 Thread Tom Lane
Michael Paquier  writes:
> On Wed, Sep 21, 2022 at 11:43:56PM -0400, Tom Lane wrote:
>> I don't have any opinion on the concrete merits of this change,
>> but I want to note that 15rc1 wraps on Monday, and we don't like
>> people pushing noncritical changes shortly before a wrap.  There
>> is not a lot of time for fooling around here.

> If I were to do it in the next couple of hours, we'd still have quite
> a couple of days of coverage, which is plenty as far as I understand?

Sure.  I'd say we have 48 hours to choose whether to put this in v15.
But not more than that.

regards, tom lane




Re: pg_basebackup's --gzip switch misbehaves

2022-09-21 Thread Michael Paquier
On Wed, Sep 21, 2022 at 11:43:56PM -0400, Tom Lane wrote:
> I don't have any opinion on the concrete merits of this change,
> but I want to note that 15rc1 wraps on Monday, and we don't like
> people pushing noncritical changes shortly before a wrap.  There
> is not a lot of time for fooling around here.

If I were to do it in the next couple of hours, we'd still have quite
a couple of days of coverage, which is plenty as far as I understand?

Saying that, it is not a critical change.  Just to give some numbers,
for a fresh initdb's instance base.tar.zst is at:
- 3.6MB at level 0.
- 3.8MB at level 1.
- 3.6MB at level 2.
- 4.3MB at level -1.
- 4.6MB at level -2.
- 6.1MB at level -7.

I am not sure if there would be a huge demand for this much control
over the current [1,22], but the library wants to control dynamically
the bounds and has the APIs to allow that.
--
Michael


signature.asc
Description: PGP signature


Re: pg_basebackup's --gzip switch misbehaves

2022-09-21 Thread Tom Lane
Justin Pryzby  writes:
> However the patch ends up, +0.75 to backpatch it to v15 rather than
> calling it a new feature in v16.

I don't have any opinion on the concrete merits of this change,
but I want to note that 15rc1 wraps on Monday, and we don't like
people pushing noncritical changes shortly before a wrap.  There
is not a lot of time for fooling around here.

regards, tom lane




Re: pg_basebackup's --gzip switch misbehaves

2022-09-21 Thread Justin Pryzby
On Thu, Sep 22, 2022 at 10:25:11AM +0900, Michael Paquier wrote:
> On Wed, Sep 21, 2022 at 07:31:48PM -0500, Justin Pryzby wrote:
> > I think at some point (maybe before releasing 1.3.4) the range was
> > increased to very large(small), negative levels.  It's possible to query
> > the library about the lowest supported compression level, but then
> > there's a complication regarding the client-side library version vs the
> > server-side version.  So it seems better to just use -7.
> 
> Indeed.  Contrary to the default level, there are no variables for the
> minimum and maximum levels.  As you are pointing out, a lookup at 
> zstd_compress.c shows that we have ZSTD_minCLevel() and
> ZSTD_maxCLevel() that assign the bounds.  Both are available since
> 1.4.0.  We still need a backend-side check as the level passed with a
> BASE_BACKUP command would be only validated there.  It seems to me 
> that this is going to be less of a headache in the long-term if we
> just use those routines at runtime, as zstd wants to keep some freedom
> with the min and max bounds for the compression level, at least that's
> the flexibility that this gives the library.  So I would tweak things
> as the attached.

Okay.  Will that complicate tests at all?  It looks like it's not an
issue for the tests currently proposed in the CF APP.
https://commitfest.postgresql.org/39/3835/

However the patch ends up, +0.75 to backpatch it to v15 rather than
calling it a new feature in v16.

-- 
Justin




Re: pg_basebackup's --gzip switch misbehaves

2022-09-21 Thread Michael Paquier
On Wed, Sep 21, 2022 at 07:31:48PM -0500, Justin Pryzby wrote:
> I think at some point (maybe before releasing 1.3.4) the range was
> increased to very large(small), negative levels.  It's possible to query
> the library about the lowest supported compression level, but then
> there's a complication regarding the client-side library version vs the
> server-side version.  So it seems better to just use -7.

Indeed.  Contrary to the default level, there are no variables for the
minimum and maximum levels.  As you are pointing out, a lookup at 
zstd_compress.c shows that we have ZSTD_minCLevel() and
ZSTD_maxCLevel() that assign the bounds.  Both are available since
1.4.0.  We still need a backend-side check as the level passed with a
BASE_BACKUP command would be only validated there.  It seems to me 
that this is going to be less of a headache in the long-term if we
just use those routines at runtime, as zstd wants to keep some freedom
with the min and max bounds for the compression level, at least that's
the flexibility that this gives the library.  So I would tweak things
as the attached.
--
Michael
diff --git a/src/common/compression.c b/src/common/compression.c
index e40ce98ef3..0c6bb9177b 100644
--- a/src/common/compression.c
+++ b/src/common/compression.c
@@ -324,8 +324,9 @@ validate_compress_specification(pg_compress_specification *spec)
 			default_level = 0;	/* fast mode */
 			break;
 		case PG_COMPRESSION_ZSTD:
-			max_level = 22;
 #ifdef USE_ZSTD
+			max_level = ZSTD_maxCLevel();
+			min_level = ZSTD_minCLevel();
 			default_level = ZSTD_CLEVEL_DEFAULT;
 #endif
 			break;
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index f63c912e97..5cff3d3c07 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2757,8 +2757,10 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
-1), for lz4 an integer
between 1 and 12 (default 0 for fast compression
mode), and for zstd an integer between
-   1 and 22 (default
-   ZSTD_CLEVEL_DEFAULT or 3).
+   ZSTD_minCLevel() (usually -131072)
+   and ZSTD_maxCLevel() (usually 22)
+   (default ZSTD_CLEVEL_DEFAULT or
+   3).
   
 
   


signature.asc
Description: PGP signature


Re: pg_basebackup's --gzip switch misbehaves

2022-09-21 Thread Justin Pryzby
On Tue, Sep 13, 2022 at 04:13:20PM +0900, Michael Paquier wrote:
> diff --git a/src/common/compression.c b/src/common/compression.c
> index da3c291c0f..ac26287d54 100644
> --- a/src/common/compression.c
> +++ b/src/common/compression.c
> @@ -249,36 +299,49 @@ expect_integer_value(char *keyword, char *value, 
> pg_compress_specification *resu
>  char *
>  validate_compress_specification(pg_compress_specification *spec)
>  {
> + int min_level = 1;
> + int max_level = 1;
> + int default_level = 0;
> +
>   /* If it didn't even parse OK, it's definitely no good. */
>   if (spec->parse_error != NULL)
>   return spec->parse_error;
>  
>   /*
> -  * If a compression level was specified, check that the algorithm 
> expects
> -  * a compression level and that the level is within the legal range for
> -  * the algorithm.
> +  * Check that the algorithm expects a compression level and it is
> +  * is within the legal range for the algorithm.
>*/
> - if ((spec->options & PG_COMPRESSION_OPTION_LEVEL) != 0)
> + switch (spec->algorithm)
>   {
> - int min_level = 1;
> - int max_level;
> -
> - if (spec->algorithm == PG_COMPRESSION_GZIP)
> + case PG_COMPRESSION_GZIP:
>   max_level = 9;
> - else if (spec->algorithm == PG_COMPRESSION_LZ4)
> +#ifdef HAVE_LIBZ
> + default_level = Z_DEFAULT_COMPRESSION;
> +#endif
> + break;
> + case PG_COMPRESSION_LZ4:
>   max_level = 12;
> - else if (spec->algorithm == PG_COMPRESSION_ZSTD)
> + default_level = 0;  /* fast mode */
> + break;
> + case PG_COMPRESSION_ZSTD:
>   max_level = 22;

I should've suggested to add:

>   min_level = -7;

which has been supported since zstd 1.3.4 (and postgres requires 1.4.0).

I think at some point (maybe before releasing 1.3.4) the range was
increased to very large(small), negative levels.  It's possible to query
the library about the lowest supported compression level, but then
there's a complication regarding the client-side library version vs the
server-side version.  So it seems better to just use -7.

-- 
Justin




Re: pg_basebackup's --gzip switch misbehaves

2022-09-15 Thread Michael Paquier
On Wed, Sep 14, 2022 at 10:26:42AM +0200, Daniel Gustafsson wrote:
> Maybe the creation of $tempdir should take PG_TEST_NOCLEAN into account and 
> not
> register CLEANUP if set?

Agreed.  It sounds like a good idea to me to extend that to temporary
paths, and then check those rmtree() calls where the tests would not
retain too much data for small-ish machines.

By the way, should we document PG_TEST_TIMEOUT_DEFAULT and
PG_TEST_NOCLEAN not only in src/test/perl/README but also doc/?.  We
provide something in the docs about PROVE_FLAGS and PROVE_TESTS, for
example. 
--
Michael


signature.asc
Description: PGP signature


Re: pg_basebackup's --gzip switch misbehaves

2022-09-14 Thread Daniel Gustafsson
> On 13 Sep 2022, at 23:38, Tom Lane  wrote:

> The $tempdir is some temporary subdirectory of tmp_check that will get nuked 
> at
> the end of the TAP test no matter what.  So these rmtrees are merely making 
> the
> evidence disappear a bit faster; it will anyway.


Maybe the creation of $tempdir should take PG_TEST_NOCLEAN into account and not
register CLEANUP if set?

--
Daniel Gustafsson   https://vmware.com/





Re: pg_basebackup's --gzip switch misbehaves

2022-09-13 Thread Tom Lane
Michael Paquier  writes:
> And so, I have spent a couple of hours torturing the patch, applying
> it after a few tweaks and CI runs:
> ...
> The buildfarm is green so I think that we are good.  I have closed the
> open item.

+1, thanks for taking care of that.

As far as my original complaint about mamba goes, I've not quite
been able to run it to ground.  However, I found that NetBSD
seems to be shipping unmodified zlib 1.2.11, which contains a
number of known bugs in deflate_stored() --- that is, the code
path implementing compression level 0.  Red Hat for one is
carrying several back-patched fixes in that part of zlib.
So for the moment I'm willing to write it off as "not our bug".
We aren't intentionally testing compression level 0, and hardly
anybody would intentionally use it in the field, so it's not
really a thing worth worrying about IMO.  But if mamba continues
to show failures in that test then it will be worth looking closer.

regards, tom lane




Re: pg_basebackup's --gzip switch misbehaves

2022-09-13 Thread Michael Paquier
On Tue, Sep 13, 2022 at 05:38:47PM -0400, Tom Lane wrote:
> is something I tried along the way to diagnosing the problem, and
> it turns out to have exactly zero effect.  The $tempdir is some
> temporary subdirectory of tmp_check that will get nuked at the end
> of the TAP test no matter what.  So these rmtrees are merely making
> the evidence disappear a bit faster; it will anyway.

FWIW, I just stick a die() in the middle of the code path when I want
to look at specific results.  Similar method, same result.

>  # Test background stream process terminating before the basebackup has
> 
> which is not real clean, since then the files get left behind even
> on success, which I doubt we want either.

Another thing that could be done here is to use the same base location
as the cluster nodes aka $PostgreSQL::Test::Utils::tmp_check.  That
would mean storing in a repo more data associated to the base backups
after a fresh run, though.  I am not sure that small machine would
like this accumulation in a single run even if disk space is cheap
these days.

> Anyway, I have no objection to dropping the rmtrees, since they're
> pretty useless as the code stands.  Just wanted to mention this
> issue for the archives.

I see more ways to change the existing behavior, so for now I have
left that untouched.

And so, I have spent a couple of hours torturing the patch, applying
it after a few tweaks and CI runs:
- --without-zlib was causing a failure in the pg_basebackup tests as
we have a few tests that parse and validate a set of invalid specs for
the client-side and server-side compression.  With zlib around, the
tests and their expected results are unchanged, that's just a 
consequence of moving the assignment of a default level much earlier.
- pg_basebackup was triggering an assertion when using client-lz4 or
client-zstd as we use the directory method of walmethods.c.  In this
case, we just support zlib as compression and enforce no compression
when we are under lz4 or zstd.  This came from an over-simplification
of the code.  There is a gap in the testing of pg_basebackup,
actually, because we have zero tests for LZ4 and zstd there.
- The documentation of the replication protocol needed some
adjustments for the default comporession levels.

The buildfarm is green so I think that we are good.  I have closed the
open item.

You have mentioned upthread an extra thing about the fact that we pass
down 0 to deflateParams(). This is indeed wrong and we are lucky that
Z_DEFAULT_STRATEGY maps to 0.  Better to fix and backpatch this one
down to where gzip compression has been added to walmethods.c..  I'll
just do that in a bit after double-checking the area and the other
routines.
--
Michael


signature.asc
Description: PGP signature


Re: pg_basebackup's --gzip switch misbehaves

2022-09-13 Thread Tom Lane
Michael Paquier  writes:
> Attached is the patch I am finishing with, consisting of:
> - the removal of PG_COMPRESSION_OPTION_LEVEL.
> - assigning a default compression level when nothing is specified in
> the spec.
> - a couple of complifications in pg_receivewal, pg_basebackup and the
> backend code as there is no need to worry about the compression
> level.

This looks good to me.  It seems simpler, and I concur that it
fixes the described problem.  I now see

tmp_check/backup_gzip:
total 3668
-rw-r-. 1 postgres postgres  137756 Sep 13 17:29 backup_manifest
-rw-r-. 1 postgres postgres 3537499 Sep 13 17:29 base.tar.gz
-rw-r-. 1 postgres postgres   73989 Sep 13 17:29 pg_wal.tar.gz

tmp_check/backup_gzip2:
total 3168
-rw-r-. 1 postgres postgres  137756 Sep 13 17:29 backup_manifest
-rw-r-. 1 postgres postgres 3083516 Sep 13 17:29 base.tar.gz
-rw-r-. 1 postgres postgres   17069 Sep 13 17:29 pg_wal.tar.gz

tmp_check/backup_gzip3:
total 3668
-rw-r-. 1 postgres postgres  137756 Sep 13 17:29 backup_manifest
-rw-r-. 1 postgres postgres 3537517 Sep 13 17:29 base.tar.gz
-rw-r-. 1 postgres postgres   73988 Sep 13 17:29 pg_wal.tar.gz

which looks sane: the gzip2 case should, and does, have better
compression than the other two.

BTW, this bit:

diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl 
b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 3d1a4ddd5c..40f1d3f7e2 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -860,9 +860,6 @@ SKIP:
my $gzip_is_valid =
  system_log($gzip, '--test', @zlib_files, @zlib_files2, @zlib_files3);
is($gzip_is_valid, 0, "gzip verified the integrity of compressed data");
-   rmtree("$tempdir/backup_gzip");
-   rmtree("$tempdir/backup_gzip2");
-   rmtree("$tempdir/backup_gzip3");
 }
 
 # Test background stream process terminating before the basebackup has

is something I tried along the way to diagnosing the problem, and
it turns out to have exactly zero effect.  The $tempdir is some
temporary subdirectory of tmp_check that will get nuked at the end
of the TAP test no matter what.  So these rmtrees are merely making
the evidence disappear a bit faster; it will anyway.

What I did to diagnose the problem was this:

@@ -860,9 +860,9 @@ SKIP:
my $gzip_is_valid =
  system_log($gzip, '--test', @zlib_files, @zlib_files2, @zlib_files3);
is($gzip_is_valid, 0, "gzip verified the integrity of compressed data");
-   rmtree("$tempdir/backup_gzip");
-   rmtree("$tempdir/backup_gzip2");
-   rmtree("$tempdir/backup_gzip3");
+   system_log('mv', "$tempdir/backup_gzip", "tmp_check");
+   system_log('mv', "$tempdir/backup_gzip2", "tmp_check");
+   system_log('mv', "$tempdir/backup_gzip3", "tmp_check");
 }
 
 # Test background stream process terminating before the basebackup has

which is not real clean, since then the files get left behind even
on success, which I doubt we want either.

Anyway, I have no objection to dropping the rmtrees, since they're
pretty useless as the code stands.  Just wanted to mention this
issue for the archives.

regards, tom lane




Re: pg_basebackup's --gzip switch misbehaves

2022-09-13 Thread Michael Paquier
On Sun, Sep 04, 2022 at 02:20:52PM +0900, Michael Paquier wrote:
> ffd5365 has missed that wal_compress_level should be set to
> Z_DEFAULT_COMPRESSION if there is nothing set in the compression
> spec for a zlib build.  pg_receivewal.c enforces that already.

So, I have looked at this one.  And it seems to me that the confusion
comes down to the existence of PG_COMPRESSION_OPTION_LEVEL.  I have
considered a couple of approaches here, like introducing an extra
routine in compression.c to assign a default compression level, but
my conclusion is at the end simpler: we always finish by setting up a
level even if the caller wants nothing, in which can we can just use
each library's default.  And lz4, zstd and zlib are able to handle the
case where a default is given down to their internal routines just
fine.

Attached is the patch I am finishing with, consisting of:
- the removal of PG_COMPRESSION_OPTION_LEVEL.
- assigning a default compression level when nothing is specified in
the spec.
- a couple of complifications in pg_receivewal, pg_basebackup and the
backend code as there is no need to worry about the compression
level.

A nice effect of this approach is that we can centralize the checks on
lz4, zstd and zlib when a build does not support any of these
options, as well as centralize the place where the default compression
levels are set.  This passes all the regression tests, and it fixes
the issue reported.  (Note that I have yet to run tests with all the
libraries disabled in ./configure.)

Thoughts?
--
Michael
From 8752cb283d65e40fc4096f84d99bcb93d97c59bc Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Tue, 13 Sep 2022 16:06:29 +0900
Subject: [PATCH] Simplify compression level handling in compression
 specifications

---
 src/include/common/compression.h |   3 +-
 src/backend/backup/basebackup_gzip.c |  10 +-
 src/backend/backup/basebackup_lz4.c  |   9 +-
 src/backend/backup/basebackup_zstd.c |  11 +-
 src/common/compression.c | 105 +++
 src/bin/pg_basebackup/bbstreamer_gzip.c  |   4 +-
 src/bin/pg_basebackup/bbstreamer_lz4.c   |   3 +-
 src/bin/pg_basebackup/bbstreamer_zstd.c  |  13 +--
 src/bin/pg_basebackup/pg_basebackup.c|  19 +---
 src/bin/pg_basebackup/pg_receivewal.c|  35 +--
 src/bin/pg_basebackup/t/010_pg_basebackup.pl |   3 -
 11 files changed, 107 insertions(+), 108 deletions(-)

diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 436b48104e..5d680058ed 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -22,8 +22,7 @@ typedef enum pg_compress_algorithm
 	PG_COMPRESSION_ZSTD
 } pg_compress_algorithm;
 
-#define PG_COMPRESSION_OPTION_LEVEL			(1 << 0)
-#define PG_COMPRESSION_OPTION_WORKERS		(1 << 1)
+#define PG_COMPRESSION_OPTION_WORKERS		(1 << 0)
 
 typedef struct pg_compress_specification
 {
diff --git a/src/backend/backup/basebackup_gzip.c b/src/backend/backup/basebackup_gzip.c
index a965866ff2..1ec42791f1 100644
--- a/src/backend/backup/basebackup_gzip.c
+++ b/src/backend/backup/basebackup_gzip.c
@@ -72,13 +72,9 @@ bbsink_gzip_new(bbsink *next, pg_compress_specification *compress)
 
 	Assert(next != NULL);
 
-	if ((compress->options & PG_COMPRESSION_OPTION_LEVEL) == 0)
-		compresslevel = Z_DEFAULT_COMPRESSION;
-	else
-	{
-		compresslevel = compress->level;
-		Assert(compresslevel >= 1 && compresslevel <= 9);
-	}
+	compresslevel = compress->level;
+	Assert((compresslevel >= 1 && compresslevel <= 9) ||
+		   compresslevel == Z_DEFAULT_COMPRESSION);
 
 	sink = palloc0(sizeof(bbsink_gzip));
 	*((const bbsink_ops **) >base.bbs_ops) = _gzip_ops;
diff --git a/src/backend/backup/basebackup_lz4.c b/src/backend/backup/basebackup_lz4.c
index d919e3dec7..986272ab9a 100644
--- a/src/backend/backup/basebackup_lz4.c
+++ b/src/backend/backup/basebackup_lz4.c
@@ -72,13 +72,8 @@ bbsink_lz4_new(bbsink *next, pg_compress_specification *compress)
 
 	Assert(next != NULL);
 
-	if ((compress->options & PG_COMPRESSION_OPTION_LEVEL) == 0)
-		compresslevel = 0;
-	else
-	{
-		compresslevel = compress->level;
-		Assert(compresslevel >= 1 && compresslevel <= 12);
-	}
+	compresslevel = compress->level;
+	Assert(compresslevel >= 0 && compresslevel <= 12);
 
 	sink = palloc0(sizeof(bbsink_lz4));
 	*((const bbsink_ops **) >base.bbs_ops) = _lz4_ops;
diff --git a/src/backend/backup/basebackup_zstd.c b/src/backend/backup/basebackup_zstd.c
index 865067f8dc..409bf88101 100644
--- a/src/backend/backup/basebackup_zstd.c
+++ b/src/backend/backup/basebackup_zstd.c
@@ -96,14 +96,11 @@ bbsink_zstd_begin_backup(bbsink *sink)
 	if (!mysink->cctx)
 		elog(ERROR, "could not create zstd compression context");
 
-	if ((compress->options & PG_COMPRESSION_OPTION_LEVEL) != 0)
-	{
-		ret = ZSTD_CCtx_setParameter(mysink->cctx, ZSTD_c_compressionLevel,
+	ret = ZSTD_CCtx_setParameter(mysink->cctx, ZSTD_c_compressionLevel,
 	 compress->level);
-		if 

Re: pg_basebackup's --gzip switch misbehaves

2022-09-03 Thread Michael Paquier
On Sat, Sep 03, 2022 at 11:11:29AM -0400, Tom Lane wrote:
> It makes sense that base.tar.gz is compressed a little better with
> --gzip than with level-1 compression, but why is pg_wal.tar.gz not
> compressed at all?  It looks like the problem probably boils down to
> which of "-1" and "0" means "default behavior" vs "no compression",
> with different code layers interpreting that differently.  I can't
> find exactly where that's happening, but I did manage to stop the
> failures with this crude hack:

There is a distinction coming in pg_basebackup.c from the way we
deparse the compression specification and the default compression
level that should be assigned if there is no level directly specified
by the user.  It seems to me that the error comes from this code in
BaseBackup() when we are under STREAM_WAL (default):

if (client_compress->algorithm == PG_COMPRESSION_GZIP)
{
wal_compress_algorithm = PG_COMPRESSION_GZIP;
wal_compress_level =
   (client_compress->options & PG_COMPRESSION_OPTION_LEVEL)
   != 0 ? client_compress->level : 0;

ffd5365 has missed that wal_compress_level should be set to
Z_DEFAULT_COMPRESSION if there is nothing set in the compression
spec for a zlib build.  pg_receivewal.c enforces that already.

> That's not right as a real fix, because it would have the effect
> that "--compress gzip:0" would also invoke default compression,
> whereas what it should do is produce the uncompressed output
> we're actually getting.  Both cases have compression_level == 0
> by the time we reach here, though.

Nope, that would not be right.

> BTW, I'm fairly astonished that anyone would have thought that three
> complete pg_basebackup cycles testing essentially-identical options
> were a good use of developer time and buildfarm cycles from here to
> eternity.  Even if digging into it did expose a bug, the test case
> deserves little credit for that, because it entirely failed to call
> attention to the problem.  I had to whack the script pretty hard
> just to get it to not delete the evidence.

The introduction of the compression specification has introduced a lot
of patterns where we expect or not expect compression to happen, and
on top of that this needs to be careful about backward-compatibility.
--
Michael


signature.asc
Description: PGP signature


pg_basebackup's --gzip switch misbehaves

2022-09-03 Thread Tom Lane
I've been trying to figure out why my new buildfarm animal mamba
occasionally fails the pg_basebackup tests [1][2].  I've not run
that to ground yet, but one thing I've found that's consistently
reproducible everywhere is that pg_basebackup's --gzip switch
misbehaves.  The manual says, and the principle of least astonishment
agrees, that that should invoke gzip with the default compression
level.  However, the three test cases beginning at about line 810 of
010_pg_basebackup.pl produce these output file sizes on my x86_64
Linux machine:

backup_gzip ("--compress 1"):
total 3672
-rw-r-. 1 postgres postgres  137756 Sep  2 23:38 backup_manifest
-rw-r-. 1 postgres postgres 3538992 Sep  2 23:38 base.tar.gz
-rw-r-. 1 postgres postgres   73991 Sep  2 23:38 pg_wal.tar.gz

backup_gzip2 ("--gzip"):
total 19544
-rw-r-. 1 postgres postgres   137756 Sep  2 23:38 backup_manifest
-rw-r-. 1 postgres postgres  3086972 Sep  2 23:38 base.tar.gz
-rw-r-. 1 postgres postgres 16781399 Sep  2 23:38 pg_wal.tar.gz

backup_gzip3 ("--compress gzip:1"):
total 3672
-rw-r-. 1 postgres postgres  137756 Sep  2 23:38 backup_manifest
-rw-r-. 1 postgres postgres 3539006 Sep  2 23:38 base.tar.gz
-rw-r-. 1 postgres postgres   73989 Sep  2 23:38 pg_wal.tar.gz

It makes sense that base.tar.gz is compressed a little better with
--gzip than with level-1 compression, but why is pg_wal.tar.gz not
compressed at all?  It looks like the problem probably boils down to
which of "-1" and "0" means "default behavior" vs "no compression",
with different code layers interpreting that differently.  I can't
find exactly where that's happening, but I did manage to stop the
failures with this crude hack:

diff --git a/src/bin/pg_basebackup/walmethods.c 
b/src/bin/pg_basebackup/walmethods.c
index e90aa0ba37..e9b578 100644
--- a/src/bin/pg_basebackup/walmethods.c
+++ b/src/bin/pg_basebackup/walmethods.c
@@ -1358,7 +1358,7 @@ CreateWalTarMethod(const char *tarbase,
sprintf(tar_data->tarfilename, "%s%s", tarbase, suffix);
tar_data->fd = -1;
tar_data->compression_algorithm = compression_algorithm;
-   tar_data->compression_level = compression_level;
+   tar_data->compression_level = compression_level > 0 ? compression_level 
: Z_DEFAULT_COMPRESSION;
tar_data->sync = sync;
 #ifdef HAVE_LIBZ
if (compression_algorithm == PG_COMPRESSION_GZIP)

That's not right as a real fix, because it would have the effect
that "--compress gzip:0" would also invoke default compression,
whereas what it should do is produce the uncompressed output
we're actually getting.  Both cases have compression_level == 0
by the time we reach here, though.

I suspect that there are related bugs in other code paths in this
rat's nest of undocumented functions and dubious API abstractions;
but since it's all undocumented, who can say which places are wrong
and which are not?

I might not ding this code quite this hard, if I hadn't had
equally-unpleasant encounters with it previously (eg 248c3a937).
It's a mess, and I do not find it to be up to project standards.

A vaguely-related matter is that the deflateParams calls all pass "0"
as the third parameter:

if (deflateParams(tar_data->zp, tar_data->compression_level, 0) != Z_OK)

Aside from being unreadable, that's entirely unwarranted familiarity
with the innards of libz.  zlib.h says you should be writing a named
constant, probably Z_DEFAULT_STRATEGY.

BTW, I'm fairly astonished that anyone would have thought that three
complete pg_basebackup cycles testing essentially-identical options
were a good use of developer time and buildfarm cycles from here to
eternity.  Even if digging into it did expose a bug, the test case
deserves little credit for that, because it entirely failed to call
attention to the problem.  I had to whack the script pretty hard
just to get it to not delete the evidence.

regards, tom lane

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba=2022-09-01%2018%3A38%3A27
[2] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba=2022-08-31%2011%3A46%3A09