Re: Introduce pg_receivewal gzip compression tests

2021-07-19 Thread Michael Paquier
On Mon, Jul 19, 2021 at 04:03:33PM +0900, Michael Paquier wrote:
> Another advantage of this patch is the handling of ".gz" is reduced to
> one code path instead of four.  That makes a bit easier the
> introduction of new compression methods.
> 
> A second thing that was really confusing is that the name of the WAL
> segment generated in this code path completely ignored the type of
> compression.  This led to one confusing error message if failing to
> open a segment for write where we'd mention a .partial file rather
> than a .gz.partial file.  The versions of zlib I used on Windows
> looked buggy so I cannot conclude there, but I am sure that this
> should allow bowerbird to handle the test correctly.

After more testing and more review, I have applied and backpatched
this stuff.  Another thing I did on HEAD was to enable again the ZLIB
portion of the pg_receivewal tests on Windows.  bowerdird should stay
green (I hope), and it is better to have as much more coverage as
possible for all that.
--
Michael


signature.asc
Description: PGP signature


Re: Introduce pg_receivewal gzip compression tests

2021-07-19 Thread Michael Paquier
On Fri, Jul 16, 2021 at 02:08:57PM +0900, Michael Paquier wrote:
> This behavior is rather debatable, and it would be more instinctive to
> me to just skip any business related to the pre-padding if compression
> is enabled, at the cost of one extra callback in WalWriteMethod to
> grab the compression level (dir_open_for_write() skips that for
> compression) to allow receivelog.c to handle that.  But at the same
> time few users are going to care about that as pg_receivewal has most
> likely always the same set of options, so complicating this code is
> not really appealing either.

I have chewed on that over the weekend, and skipping the padding logic
if we are in compression mode in open_walfile() makes sense, so
attached is a patch that I'd like to backpatch.

Another advantage of this patch is the handling of ".gz" is reduced to
one code path instead of four.  That makes a bit easier the
introduction of new compression methods.

A second thing that was really confusing is that the name of the WAL
segment generated in this code path completely ignored the type of
compression.  This led to one confusing error message if failing to
open a segment for write where we'd mention a .partial file rather
than a .gz.partial file.  The versions of zlib I used on Windows
looked buggy so I cannot conclude there, but I am sure that this
should allow bowerbird to handle the test correctly.
--
Michael
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index 3952a3f943..7af9009320 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -88,26 +88,29 @@ static bool
 open_walfile(StreamCtl *stream, XLogRecPtr startpoint)
 {
 	Walfile*f;
-	char		fn[MAXPGPATH];
+	char	   *fn;
 	ssize_t		size;
 	XLogSegNo	segno;
 
 	XLByteToSeg(startpoint, segno, WalSegSz);
 	XLogFileName(current_walfile_name, stream->timeline, segno, WalSegSz);
 
-	snprintf(fn, sizeof(fn), "%s%s", current_walfile_name,
-			 stream->partial_suffix ? stream->partial_suffix : "");
+	/* Note that this considers the compression used if necessary */
+	fn = stream->walmethod->get_file_name(current_walfile_name,
+		  stream->partial_suffix);
 
 	/*
 	 * When streaming to files, if an existing file exists we verify that it's
 	 * either empty (just created), or a complete WalSegSz segment (in which
 	 * case it has been created and padded). Anything else indicates a corrupt
-	 * file.
+	 * file. Compressed files have no need for padding, so just ignore this
+	 * case.
 	 *
 	 * When streaming to tar, no file with this name will exist before, so we
 	 * never have to verify a size.
 	 */
-	if (stream->walmethod->existsfile(fn))
+	if (stream->walmethod->compression() == 0 &&
+		stream->walmethod->existsfile(fn))
 	{
 		size = stream->walmethod->get_file_size(fn);
 		if (size < 0)
diff --git a/src/bin/pg_basebackup/t/020_pg_receivewal.pl b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
index 158f7d176f..17fd71a450 100644
--- a/src/bin/pg_basebackup/t/020_pg_receivewal.pl
+++ b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
@@ -72,13 +72,11 @@ $primary->command_ok(
 my @partial_wals = glob "$stream_dir/*\.partial";
 is(scalar(@partial_wals), 1, "one partial WAL segment was created");
 
-# Check ZLIB compression if available.  On Windows, some old versions
-# of zlib can cause some instabilities with this test, so disable it
-# for now.
+# Check ZLIB compression if available.
 SKIP:
 {
-	skip "postgres was not built with ZLIB support, or Windows is involved", 5
-	  if (!check_pg_config("#define HAVE_LIBZ 1") || $windows_os);
+	skip "postgres was not built with ZLIB support", 5
+	  if (!check_pg_config("#define HAVE_LIBZ 1"));
 
 	# Generate more WAL worth one completed, compressed, segment.
 	$primary->psql('postgres', 'SELECT pg_switch_wal();');
diff --git a/src/bin/pg_basebackup/walmethods.c b/src/bin/pg_basebackup/walmethods.c
index a15bbb20e7..3c0da70a04 100644
--- a/src/bin/pg_basebackup/walmethods.c
+++ b/src/bin/pg_basebackup/walmethods.c
@@ -68,20 +68,32 @@ dir_getlasterror(void)
 	return strerror(errno);
 }
 
+static char *
+dir_get_file_name(const char *pathname, const char *temp_suffix)
+{
+	char *tmppath = pg_malloc0(MAXPGPATH * sizeof(char));
+
+	snprintf(tmppath, MAXPGPATH, "%s%s%s",
+			 pathname, dir_data->compression > 0 ? ".gz" : "",
+			 temp_suffix ? temp_suffix : "");
+
+	return tmppath;
+}
+
 static Walfile
 dir_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_size)
 {
 	static char tmppath[MAXPGPATH];
+	char	   *filename;
 	int			fd;
 	DirectoryMethodFile *f;
 #ifdef HAVE_LIBZ
 	gzFile		gzfp = NULL;
 #endif
 
-	snprintf(tmppath, sizeof(tmppath), "%s/%s%s%s",
-			 dir_data->basedir, pathname,
-			 dir_data->compression > 0 ? ".gz" : "",
-			 temp_suffix ? temp_suffix : "");
+	filename = dir_get_file_name(pathname, temp_suffix);
+	snprintf(tmppath, sizeof(tmppath), "%s/%s",
+			 dir_data->basedir, filename);
 
 	/*
 	 * Open a file for 

Re: Introduce pg_receivewal gzip compression tests

2021-07-15 Thread Michael Paquier
On Fri, Jul 16, 2021 at 08:59:11AM +0900, Michael Paquier wrote:
> --compress is used and the sync fails for a non-compressed segment.
> Looking at the code it is pretty obvious that open_walfile() is
> getting confused with the handling of an existing .partial segment
> while walmethods.c uses dir_data->compression in all the places that
> matter.  So that's a legit bug, that happens only when mixing
> pg_receivewal runs for where successive runs use the compression or
> non-compression modes.

Ditto.  After reading the code more carefully, the code is actually
able to work even if it could be cleaner:
1) dir_existsfile() would check for the existence of a
non-compressed, partial segment all the time.
2) If this non-compressed file was padded, the code would use
open_for_write() that would open a compressed, partial segment.
3) The compressed, partial segment would be the one flushed.

This behavior is rather debatable, and it would be more instinctive to
me to just skip any business related to the pre-padding if compression
is enabled, at the cost of one extra callback in WalWriteMethod to
grab the compression level (dir_open_for_write() skips that for
compression) to allow receivelog.c to handle that.  But at the same
time few users are going to care about that as pg_receivewal has most
likely always the same set of options, so complicating this code is
not really appealing either.

> I am amazed that the other buildfarm members are not complaining, to
> be honest.  jacana runs this TAP test with MinGW and ZLIB, and does
> not complain.

I have spent more time on that with my own environment, and while
testing I have bumped on a different issue with zlib, which was
really weird.  In the same scenario as above, gzdopen() has been
failing for me at step 2), causing the test to loop forever.  We
document to use DLLs for ZLIB coming from zlib.net, but the ones
available there are really outdated as far as I can see (found some
called zlib.lib/dll myself, breaking Solution.pm).  For now I have
disabled those tests on Windows to bring back bowerbird to green, but
there is something else going on here.  We don't do much tests with
ZLIB on Windows for pg_basebackup and pg_dump, so there may be some
more issues?

@Andrew: which version of ZLIB are you using on bowerbird?  That's the
one in c:\prog\3p64.  That's a zdll.lib, right?
--
Michael


signature.asc
Description: PGP signature


Re: Introduce pg_receivewal gzip compression tests

2021-07-15 Thread Michael Paquier
On Thu, Jul 15, 2021 at 09:35:52PM +0900, Michael Paquier wrote:
> For this one, I'll try to test harder on my own host.  I am curious to
> see if the other Windows members running the TAP tests have anything
> to say.  Looking at the code of zlib, this would come from gz_zero()
> in gzflush(), which could blow up on a write() in gz_comp().

bowerbird has just failed for the second time in a row on EACCESS, so
there is more here than meets the eye.  Looking at the code, I think I
have spotted what it is and the buildfarm logs give a very good hint:
# Running: pg_receivewal -D
:/prog/bf/root/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/t_020_pg_receivewal_primary_data/archive_wal
--verbose --endpos 0/328 --compress 1
pg_receivewal: starting log streaming at 0/200 (timeline 1)
pg_receivewal: fatal: could not fsync existing write-ahead log file
"00010002.partial": Permission denied
not ok 20 - streaming some WAL using ZLIB compression

--compress is used and the sync fails for a non-compressed segment.
Looking at the code it is pretty obvious that open_walfile() is
getting confused with the handling of an existing .partial segment
while walmethods.c uses dir_data->compression in all the places that
matter.  So that's a legit bug, that happens only when mixing
pg_receivewal runs for where successive runs use the compression or
non-compression modes.

I am amazed that the other buildfarm members are not complaining, to
be honest.  jacana runs this TAP test with MinGW and ZLIB, and does
not complain.
--
Michael


signature.asc
Description: PGP signature


Re: Introduce pg_receivewal gzip compression tests

2021-07-15 Thread gkokolatos



‐‐‐ Original Message ‐‐‐

On Thursday, July 15th, 2021 at 14:35, Michael Paquier  
wrote:

> On Thu, Jul 15, 2021 at 08:35:27PM +0900, Michael Paquier wrote:

>
> > 2.  curculio:
> > Looking at the OpenBSD code (usr.bin/compress/main.c), long options
> > are supported, where --version does exit(0) without printing
> > set_outfile() is doing a discard of the file suffixes it does not
> > recognize, and I think that their implementation bumps on .gz.partial
> > and generates an exit code of 512 to map with WARNING. I still wish
> > to keep this test, and I'd like to think that the contents of
> > @zlib_wals are enough in terms of coverage. What do you think?
>
> After thinking more about this one, I have taken the course to just
> remove the .gz.partial segment from the check, a full segment should
> be enough in terms of coverage. I prefer this simplification over a
> rename of the .partial segment or a tweak of the error code to map
> with WARNING.

Fair enough.

Cheers,
//Georgios

> ---
>
> Michael




Re: Introduce pg_receivewal gzip compression tests

2021-07-15 Thread Michael Paquier
On Thu, Jul 15, 2021 at 08:35:27PM +0900, Michael Paquier wrote:
> 1) bowerbird on Windows/MSVC:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird=2021-07-15%2010%3A30%3A36
> pg_receivewal: fatal: could not fsync existing write-ahead log file
> "00010002.partial": Permission denied
> not ok 20 - streaming some WAL using ZLIB compression
> I don't think the existing code can be blamed for that as this means a
> failure with gzflush().  Likely a concurrency issue as that's an
> EACCES.  If that's repeatable, that could point to an actual issue
> with pg_receivewal --compress.

For this one, I'll try to test harder on my own host.  I am curious to
see if the other Windows members running the TAP tests have anything
to say.  Looking at the code of zlib, this would come from gz_zero()
in gzflush(), which could blow up on a write() in gz_comp().

> 2) curculio:
> 
> Looking at the OpenBSD code (usr.bin/compress/main.c), long options
> are supported, where --version does exit(0) without printing
> anything, and --test is supported even if that's not on the man pages.
> set_outfile() is doing a discard of the file suffixes it does not
> recognize, and I think that their implementation bumps on .gz.partial
> and generates an exit code of 512 to map with WARNING.  I still wish
> to keep this test, and I'd like to think that the contents of
> @zlib_wals are enough in terms of coverage.  What do you think?

After thinking more about this one, I have taken the course to just
remove the .gz.partial segment from the check, a full segment should
be enough in terms of coverage.  I prefer this simplification over a
rename of the .partial segment or a tweak of the error code to map
with WARNING.
--
Michael


signature.asc
Description: PGP signature


Re: Introduce pg_receivewal gzip compression tests

2021-07-15 Thread Michael Paquier
On Thu, Jul 15, 2021 at 07:48:08AM +, gkokola...@pm.me wrote:
> Let us hope that it will prevent some bugs from happening.

The buildfarm has two reports.

1) bowerbird on Windows/MSVC:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird=2021-07-15%2010%3A30%3A36
pg_receivewal: fatal: could not fsync existing write-ahead log file
"00010002.partial": Permission denied
not ok 20 - streaming some WAL using ZLIB compression
I don't think the existing code can be blamed for that as this means a
failure with gzflush().  Likely a concurrency issue as that's an
EACCES.  If that's repeatable, that could point to an actual issue
with pg_receivewal --compress.

2) curculio:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=curculio=2021-07-15%2010%3A30%3A15
# Running: gzip --test
 
/home/pgbf/buildroot/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/t_020_pg_receivewal_primary_data/archive_wal/00010002.gz
 
/home/pgbf/buildroot/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/t_020_pg_receivewal_primary_data/archive_wal/00010003.gz.partial
gzip:
 
/home/pgbf/buildroot/HEAD/pgsql.build/src/bin/pg_basebackup/tmp_check/t_020_pg_receivewal_primary_data/archive_wal/00010003.gz.partial:
 unknown suffix: ignored
 not ok 24 - gzip verified the integrity of compressed WAL segments

Looking at the OpenBSD code (usr.bin/compress/main.c), long options
are supported, where --version does exit(0) without printing
anything, and --test is supported even if that's not on the man pages.
set_outfile() is doing a discard of the file suffixes it does not
recognize, and I think that their implementation bumps on .gz.partial
and generates an exit code of 512 to map with WARNING.  I still wish
to keep this test, and I'd like to think that the contents of
@zlib_wals are enough in terms of coverage.  What do you think?
--
Michael


signature.asc
Description: PGP signature


Re: Introduce pg_receivewal gzip compression tests

2021-07-15 Thread gkokolatos



‐‐‐ Original Message ‐‐‐

On Thursday, July 15th, 2021 at 09:00, Michael Paquier  
wrote:

> On Wed, Jul 14, 2021 at 02:11:09PM +, gkokola...@pm.me wrote:
>
> > Please find v6 attached.
>
> Thanks. I have spent some time checking this stuff in details, and
> I did some tests on Windows while on it. A run of pgperltidy was
> missing. A second thing is that you added one useless WAL segment
> switch in the ZLIB block, and two at the end, causing the first two in
> the set of three (one in the ZLIB block and one in the final command)
> to be no-ops as they followed a previous WAL switch. The final one
> was not needed as no WAL is generated after that.
>

Thank you for the work and comments.

> And applied. Let's see if the buildfarm has anything to say. Perhaps
> this will even catch some bugs that pre-existed.

Let us hope that it will prevent some bugs from happening.

Cheers,
//Georgios

> ---
>
> Michael




Re: Introduce pg_receivewal gzip compression tests

2021-07-15 Thread Michael Paquier
On Wed, Jul 14, 2021 at 02:11:09PM +, gkokola...@pm.me wrote:
> Please find v6 attached.

Thanks.  I have spent some time checking this stuff in details, and
I did some tests on Windows while on it.  A run of pgperltidy was
missing.  A second thing is that you added one useless WAL segment
switch in the ZLIB block, and two at the end, causing the first two in
the set of three (one in the ZLIB block and one in the final command)
to be no-ops as they followed a previous WAL switch.  The final one
was not needed as no WAL is generated after that.

And applied.  Let's see if the buildfarm has anything to say.  Perhaps
this will even catch some bugs that pre-existed.
--
Michael


signature.asc
Description: PGP signature


Re: Introduce pg_receivewal gzip compression tests

2021-07-14 Thread gkokolatos


‐‐‐ Original Message ‐‐‐

On Wednesday, July 14th, 2021 at 04:17, Michael Paquier  
wrote:

> On Tue, Jul 13, 2021 at 11:16:06AM +, gkokola...@pm.me wrote:
> > Agreed. For the record that is why I said v6 :)
> Okay, thanks.

Please find v6 attached.

Cheers,
//Georgios

> ---
>
> Michael

v6-0001-Introduce-pg_receivewal-gzip-compression-tests.patch
Description: Binary data


Re: Introduce pg_receivewal gzip compression tests

2021-07-13 Thread Michael Paquier
On Tue, Jul 13, 2021 at 11:16:06AM +, gkokola...@pm.me wrote:
> Agreed. For the record that is why I said v6 :)

Okay, thanks.
--
Michael


signature.asc
Description: PGP signature


Re: Introduce pg_receivewal gzip compression tests

2021-07-13 Thread gkokolatos



‐‐‐ Original Message ‐‐‐

On Tuesday, July 13th, 2021 at 12:26, Michael Paquier  
wrote:

> On Tue, Jul 13, 2021 at 08:28:44AM +, gkokola...@pm.me wrote:
> > Sounds great. Let me cook up v6 for this.
>
> Thanks. Could you use v5 I posted upthread as a base? There were
> some improvements in the variable names, the comments and the test
> descriptions.

Agreed. For the record that is why I said v6 :)

Cheers,
//Georgios

> -
> Michael




Re: Introduce pg_receivewal gzip compression tests

2021-07-13 Thread Michael Paquier
On Tue, Jul 13, 2021 at 08:28:44AM +, gkokola...@pm.me wrote:
> Sounds great. Let me cook up v6 for this.

Thanks.  Could you use v5 I posted upthread as a base?  There were
some improvements in the variable names, the comments and the test
descriptions.
--
Michael


signature.asc
Description: PGP signature


Re: Introduce pg_receivewal gzip compression tests

2021-07-13 Thread gkokolatos



‐‐‐ Original Message ‐‐‐

On Tuesday, July 13th, 2021 at 10:14, Michael Paquier  
wrote:

> On Tue, Jul 13, 2021 at 04:37:53PM +0900, Michael Paquier wrote:
>
> Poking at this problem, I partially take this statement back as this
> requires an initial run of pg_receivewal --endpos to ensure the
> creation of one .gz and one .gz.partial. So I guess that this should
> be structured as:
>
> 1.  Keep the existing pg_receivewal --endpos.
> 2.  Add the ZLIB block, with one pg_receivewal --endpos.
> 3.  Add at the end one extra pg_receivewal --endpos, outside of the ZLIB
> block, which should check the creation of a .partial, non-compressed
> segment. This should mention that we place this command at the end of
> the test for the start streaming point computation.
> LZ4 tests would be then between 2) and 3), or 1) and 2).

Sounds great. Let me cook up v6 for this.

>
> --
>
> Michael




Re: Introduce pg_receivewal gzip compression tests

2021-07-13 Thread gkokolatos



‐‐‐ Original Message ‐‐‐

On Tuesday, July 13th, 2021 at 09:37, Michael Paquier  
wrote:

> On Tue, Jul 13, 2021 at 06:36:59AM +, gkokola...@pm.me wrote:
>
> > I am sorry this was not so clear. It is indeed running twice the binary
> > with different flags. However the goal is not to check the flags, but
> > to make certain that the partial file has now been completed. That is
> > why there was code asserting that the previous FILENAME.gz.partial file
> > after the second invocation is converted to FILENAME.gz.
>
> The first run you are adding checks the same thing thanks to
> pg_switch_wal(), where pg_receivewal completes the generation of
> 00010002.gz and finishes with
> 00010003.gz.partial.

This is correct. It is the 00010003 awl that the rest
of the tests are targeting.

>
> > Additionally the second invocation of pg_receivewal is extending the
> > coverage of FindStreamingStart().
>
> Hmm. It looks like a waste in runtime once we mix LZ4 in that as that
> would mean 5 runs of pg_receivewal, but we really need only three of
> them with --endpos:
> -   One with ZLIB compression.
> -   One with LZ4 compression.
> -   One without compression.
>
> Do you think that we could take advantage of what is now the only run
> of pg_receivewal --endpos for that? We could make the ZLIB checks run
> first, conditionally, and then let the last command with --endpos
> perform a full scan of the contents in $stream_dir with the .gz files
> already in place. The addition of LZ4 would be an extra conditional
> block similar to what's introduced in ZLIB, running before the last
> command without compression.

I will admit that for the current patch I am not taking lz4 into account as
at the moment I have little idea as to how the lz4 patch will advance with the
review rounds. I simply accepted that it will be rebased on top of the patch
in the current thread and probably need to modify the current then.

But I digress. I would like have some combination of .gz and .gz.parial but
I will not take too strong of a stance. I am happy to go with your suggestion.

Cheers,
//Georgios

>
> --
>
> Michael




Re: Introduce pg_receivewal gzip compression tests

2021-07-13 Thread Michael Paquier
On Tue, Jul 13, 2021 at 04:37:53PM +0900, Michael Paquier wrote:
> Hmm.  It looks like a waste in runtime once we mix LZ4 in that as that
> would mean 5 runs of pg_receivewal, but we really need only three of
> them with --endpos:
> - One with ZLIB compression.
> - One with LZ4 compression.
> - One without compression.
> 
> Do you think that we could take advantage of what is now the only run
> of pg_receivewal --endpos for that?  We could make the ZLIB checks run
> first, conditionally, and then let the last command with --endpos
> perform a full scan of the contents in $stream_dir with the .gz files
> already in place.  The addition of LZ4 would be an extra conditional
> block similar to what's introduced in ZLIB, running before the last
> command without compression.

Poking at this problem, I partially take this statement back as this
requires an initial run of pg_receivewal --endpos to ensure the
creation of one .gz and one .gz.partial.  So I guess that this should
be structured as:
1) Keep the existing pg_receivewal --endpos.
2) Add the ZLIB block, with one pg_receivewal --endpos.
3) Add at the end one extra pg_receivewal --endpos, outside of the ZLIB
block, which should check the creation of a .partial, non-compressed
segment.  This should mention that we place this command at the end of
the test for the start streaming point computation.

LZ4 tests would be then between 2) and 3), or 1) and 2).
--
Michael


signature.asc
Description: PGP signature


Re: Introduce pg_receivewal gzip compression tests

2021-07-13 Thread Michael Paquier
On Tue, Jul 13, 2021 at 06:36:59AM +, gkokola...@pm.me wrote:
> I am sorry this was not so clear. It is indeed running twice the binary
> with different flags. However the goal is not to check the flags, but
> to make certain that the partial file has now been completed. That is
> why there was code asserting that the previous FILENAME.gz.partial file
> after the second invocation is converted to FILENAME.gz.

The first run you are adding checks the same thing thanks to
pg_switch_wal(), where pg_receivewal completes the generation of
00010002.gz and finishes with
00010003.gz.partial.

> Additionally the second invocation of pg_receivewal is extending the
> coverage of FindStreamingStart().

Hmm.  It looks like a waste in runtime once we mix LZ4 in that as that
would mean 5 runs of pg_receivewal, but we really need only three of
them with --endpos:
- One with ZLIB compression.
- One with LZ4 compression.
- One without compression.

Do you think that we could take advantage of what is now the only run
of pg_receivewal --endpos for that?  We could make the ZLIB checks run
first, conditionally, and then let the last command with --endpos
perform a full scan of the contents in $stream_dir with the .gz files
already in place.  The addition of LZ4 would be an extra conditional
block similar to what's introduced in ZLIB, running before the last
command without compression.
--
Michael


signature.asc
Description: PGP signature


Re: Introduce pg_receivewal gzip compression tests

2021-07-13 Thread gkokolatos



‐‐‐ Original Message ‐‐‐

On Tuesday, July 13th, 2021 at 03:53, Michael Paquier  
wrote:

> On Mon, Jul 12, 2021 at 04:46:29PM +, gkokola...@pm.me wrote:
>
> > On Monday, July 12th, 2021 at 17:07, gkokola...@pm.me wrote:
> >
> > > ‐‐‐ Original Message ‐‐‐
>
> Are you using outlook? The format of your messages gets blurry on the
>
> PG website, so does it for me.

I am using protonmail's web page. I was not aware of the issue. Thank you
for bringing it up to my attention. I shall try to address it.

>
> > > I am admittedly not so well versed on Windows systems. Thank you for
> > >
> > > informing me.
> > >
> > > Please find attached v3 of the patch where $ENV{GZIP_PROGRAM} is used
> > >
> > > instead. To the best of my knowledge one should avoid using $ENV{GZIP}
> > >
> > > because that would translate to the obsolete, yet used environment
> > >
> > > variable GZIP which holds a set of default options for gzip. In essence
> > >
> > > it would be equivalent to executing:
> > >
> > > GZIP=gzip gzip --test 
> > >
> > > which can result to errors similar to:
> > >
> > > gzip: gzip: non-option in GZIP environment variable
>
> -# make this available to TAP test scripts
>
> +# make these available to TAP test scripts
>
> export TAR
>
> +export GZIP_PROGRAM=$(GZIP)
>
> Wow. So this comes from the fact that the command gzip can feed on
>
> the environment variable from the same name. I was not aware of
>
> that, and a comment would be in place here. That means complicating a
>
> bit the test flow for people on Windows, but I am fine to live with
>
> that as long as this does not fail hard. One extra thing we could do
>
> is drop this part of the test, but I agree that this is useful to have
>
> around as a validity check.

Great.

>
> > After a bit more thinking, I went ahead and added on top of v3 a test
> >
> > verifying that the gzip program can actually be called.
>
> - system_log($gzip, '--version') != 0);
>
>
>
> Checking after that does not hurt, indeed. I am wondering if we
>
> should do that for TAR.

I do not think that this will be a necessity for TAR. TAR after all
is discovered by autoconf, which gzip is not.

>
> Another thing I find unnecessary is the number of the tests. This
>
> does two rounds of pg_receivewal just to test the long and short
>
> options of -Z/-compress, which brings only coverage to make sure that
>
> both option names are handled. That's a high cost for a low amound of
>
> extra coverage, so let's cut the runtime in half and just use the
>
> round with --compress.

I am sorry this was not so clear. It is indeed running twice the binary
with different flags. However the goal is not to check the flags, but
to make certain that the partial file has now been completed. That is
why there was code asserting that the previous FILENAME.gz.partial file
after the second invocation is converted to FILENAME.gz.

Additionally the second invocation of pg_receivewal is extending the
coverage of FindStreamingStart().

The different flags was an added bonus.

>
> There was also a bit of confusion with ZLIB and gzip in the variable
>
> names and the comments, the latter being only the command while the
>
> compression happens with zlib. With a round of indentation on top of
>
> all that, I ge tthe attached.
>
> What do you think?

Thank you very much for the patch. I would prefer to keep the parts that
tests that .gz.partial are completed on a subsequent run if you agree.

Cheers,
//Georgios

> --
>
> Michael




Re: Introduce pg_receivewal gzip compression tests

2021-07-12 Thread Michael Paquier
On Mon, Jul 12, 2021 at 04:46:29PM +, gkokola...@pm.me wrote:
> On Monday, July 12th, 2021 at 17:07,  wrote:
>> ‐‐‐ Original Message ‐‐‐

Are you using outlook?  The format of your messages gets blurry on the
PG website, so does it for me.

>> I am admittedly not so well versed on Windows systems. Thank you for
>> informing me.
>> Please find attached v3 of the patch where $ENV{GZIP_PROGRAM} is used
>> instead. To the best of my knowledge one should avoid using $ENV{GZIP}
>> because that would translate to the obsolete, yet used environment
>> variable GZIP which holds a set of default options for gzip. In essence
>> it would be equivalent to executing:
>> GZIP=gzip gzip --test 
>> which can result to errors similar to:
>> gzip: gzip: non-option in GZIP environment variable

-# make this available to TAP test scripts
+# make these available to TAP test scripts
 export TAR
+export GZIP_PROGRAM=$(GZIP)

Wow.  So this comes from the fact that the command gzip can feed on
the environment variable from the same name.  I was not aware of
that, and a comment would be in place here.  That means complicating a
bit the test flow for people on Windows, but I am fine to live with
that as long as this does not fail hard.  One extra thing we could do
is drop this part of the test, but I agree that this is useful to have
around as a validity check.

> After a bit more thinking, I went ahead and added on top of v3 a test
> verifying that the gzip program can actually be called.

+   system_log($gzip, '--version') != 0);
Checking after that does not hurt, indeed.  I am wondering if we
should do that for TAR.

Another thing I find unnecessary is the number of the tests.  This
does two rounds of pg_receivewal just to test the long and short
options of -Z/-compress, which brings only coverage to make sure that
both option names are handled.  That's a high cost for a low amound of
extra coverage, so let's cut the runtime in half and just use the
round with --compress.

There was also a bit of confusion with ZLIB and gzip in the variable
names and the comments, the latter being only the command while the
compression happens with zlib.  With a round of indentation on top of
all that, I ge tthe attached.

What do you think?
--
Michael
From 44d971f8d4187fa34dc7426b629eafeeb0af53c2 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Tue, 13 Jul 2021 10:51:20 +0900
Subject: [PATCH v5] Add some tests for pg_receivewal --compress

---
 src/bin/pg_basebackup/Makefile   |  6 ++-
 src/bin/pg_basebackup/t/020_pg_receivewal.pl | 46 +++-
 2 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
index 66e0070f1a..459d514183 100644
--- a/src/bin/pg_basebackup/Makefile
+++ b/src/bin/pg_basebackup/Makefile
@@ -18,8 +18,12 @@ subdir = src/bin/pg_basebackup
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
-# make this available to TAP test scripts
+# make these available to TAP test scripts
 export TAR
+# Note that GZIP cannot be used directly as this environment variable is
+# used by the command "gzip" to pass down options, so stick with a different
+# name.
+export GZIP_PROGRAM=$(GZIP)
 
 override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
 LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
diff --git a/src/bin/pg_basebackup/t/020_pg_receivewal.pl b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
index a547c97ef1..9fa1a3378b 100644
--- a/src/bin/pg_basebackup/t/020_pg_receivewal.pl
+++ b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
@@ -5,7 +5,7 @@ use strict;
 use warnings;
 use TestLib;
 use PostgresNode;
-use Test::More tests => 19;
+use Test::More tests => 23;
 
 program_help_ok('pg_receivewal');
 program_version_ok('pg_receivewal');
@@ -66,6 +66,50 @@ $primary->command_ok(
 	],
 	'streaming some WAL with --synchronous');
 
+# Check ZLIB compression if available.
+SKIP:
+{
+	skip "postgres was not build with ZLIB support", 4
+	  if (!check_pg_config("#define HAVE_LIBZ 1"));
+
+	# Generate more WAL.
+	$primary->psql('postgres', 'SELECT pg_switch_wal();');
+	$nextlsn =
+	  $primary->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn();');
+	chomp($nextlsn);
+	$primary->psql('postgres',
+		'INSERT INTO test_table VALUES (generate_series(100,200));');
+	$primary->psql('postgres', 'SELECT pg_switch_wal();');
+
+	$primary->command_ok(
+		[
+			'pg_receivewal', '-D', $stream_dir,  '--verbose',
+			'--endpos',  $nextlsn, '--compress', '1'
+		],
+		"streaming some WAL using ZLIB compression");
+
+	# Verify that the stored files are generated with the expected names.
+	my @zlib_wals = glob "$stream_dir/*.gz";
+	is(scalar(@zlib_wals), 1,
+		"one WAL segment compressed with ZLIB was created");
+	my @zlib_partial_wals = glob "$stream_dir/*.gz.partial";
+	is(scalar(@zlib_partial_wals),
+		1, "one partial WAL segment compressed with ZLIB was created");
+
+	# There are two files 

Re: Introduce pg_receivewal gzip compression tests

2021-07-12 Thread gkokolatos


‐‐‐ Original Message ‐‐‐

On Monday, July 12th, 2021 at 17:07,  wrote:

> ‐‐‐ Original Message ‐‐‐
>
> On Monday, July 12th, 2021 at 13:04, Michael Paquier mich...@paquier.xyz 
> wrote:
>
> > On Mon, Jul 12, 2021 at 09:42:32AM +, gkokola...@pm.me wrote:
> >
> > > This to my understanding means that gzip is expected to exist.
> > >
> > > If this is correct, then simply checking for the headers should
> > >
> > > suffice, since that is the only dependency for the files to be
> > >
> > > created.
> >
> > You cannot expect this to work on Windows when it comes to MSVC for
> >
> > example, as gzip may not be in the environment PATH so the test would
> >
> > fail hard. Let's just rely on $ENV{GZIP} instead, and skip the test
> >
> > if it is not defined.
>
> I am admittedly not so well versed on Windows systems. Thank you for
>
> informing me.
>
> Please find attached v3 of the patch where $ENV{GZIP_PROGRAM} is used
>
> instead. To the best of my knowledge one should avoid using $ENV{GZIP}
>
> because that would translate to the obsolete, yet used environment
>
> variable GZIP which holds a set of default options for gzip. In essence
>
> it would be equivalent to executing:
>
> GZIP=gzip gzip --test 
>
> which can result to errors similar to:
>
> gzip: gzip: non-option in GZIP environment variable
>

After a bit more thinking, I went ahead and added on top of v3 a test
verifying that the gzip program can actually be called.

Please find v4 attached.

Cheers,
//Georgios

>
> > Michael

v4-0001-Introduce-pg_receivewal-gzip-compression-tests.patch
Description: Binary data


Re: Introduce pg_receivewal gzip compression tests

2021-07-12 Thread gkokolatos


‐‐‐ Original Message ‐‐‐

On Monday, July 12th, 2021 at 13:04, Michael Paquier  
wrote:

> On Mon, Jul 12, 2021 at 09:42:32AM +, gkokola...@pm.me wrote:
>
> > This to my understanding means that gzip is expected to exist.
> >
> > If this is correct, then simply checking for the headers should
> >
> > suffice, since that is the only dependency for the files to be
> >
> > created.
>
> You cannot expect this to work on Windows when it comes to MSVC for
>
> example, as gzip may not be in the environment PATH so the test would
>
> fail hard. Let's just rely on $ENV{GZIP} instead, and skip the test
>
> if it is not defined.

I am admittedly not so well versed on Windows systems. Thank you for
informing me.

Please find attached v3 of the patch where $ENV{GZIP_PROGRAM} is used
instead. To the best of my knowledge one should avoid using $ENV{GZIP}
because that would translate to the obsolete, yet used environment
variable GZIP which holds a set of default options for gzip. In essence
it would be equivalent to executing:
   GZIP=gzip gzip --test 
which can result to errors similar to:
   gzip: gzip: non-option in GZIP environment variable

Cheers,
//Georgios

> 
>
> Michael

v3-0001-Introduce-pg_receivewal-gzip-compression-tests.patch
Description: Binary data


Re: Introduce pg_receivewal gzip compression tests

2021-07-12 Thread gkokolatos



‐‐‐ Original Message ‐‐‐

On Monday, July 12th, 2021 at 13:00, Gilles Darold  wrote:

> Le 12/07/2021 à 12:27, gkokola...@pm.me a écrit :
>
> > > > > Shouldn't this be coded as a loop going through @gzip_wals?
> > > > >
> > > > > I would hope that there is only one gz file created. There is a line
> > > >
> > > > further up that tests exactly that.
> > > >
> > > > -   is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");
> > > >
> > > > Let me amend that. The line should be instead:
> > >
> > > is (scalar(keys @gzip_wals), 1, "one gzip compressed WAL was created");
> > >
> > > To properly test that there is one entry.
> > >
> > > Let me provide with v2 to fix this.
>
> The following tests are not correct in Perl even if Perl returns the
>
> right value.
>
> +    is (scalar(keys @gzip_wals), 1, "one gzip compressed WAL was created");
>
> +    is (scalar(keys @gzip_partial_wals), 1,
>
> +        "one partial gzip compressed WAL was created");
>
> Function keys or values are used only with hashes but here you are using
>
> arrays. To obtain the length of the array you can just use the scalar
>
> function as Perl returns the length of the array when it is called in a
>
> scalar context. Please use the following instead:
>
> +    is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");
>
> +    is (scalar(@gzip_partial_wals), 1,
>
> +        "one partial gzip compressed WAL was created");

You are absolutely correct. I had used that in v1, yet since it got called out
I doubted myself, assumed I was wrong and the rest is history. I shall ament the
amendment for v3 of the patch.

Cheers,
//Georgios

>
>
> -
>
> Gilles Darold
>
> http://www.darold.net/




Re: Introduce pg_receivewal gzip compression tests

2021-07-12 Thread Michael Paquier
On Mon, Jul 12, 2021 at 09:42:32AM +, gkokola...@pm.me wrote:
> This to my understanding means that gzip is expected to exist.
> If this is correct, then simply checking for the headers should
> suffice, since that is the only dependency for the files to be
> created.

You cannot expect this to work on Windows when it comes to MSVC for
example, as gzip may not be in the environment PATH so the test would
fail hard.  Let's just rely on $ENV{GZIP} instead, and skip the test
if it is not defined.
--
Michael


signature.asc
Description: PGP signature


Re: Introduce pg_receivewal gzip compression tests

2021-07-12 Thread Gilles Darold
Le 12/07/2021 à 12:27, gkokola...@pm.me a écrit :

 Shouldn't this be coded as a loop going through @gzip_wals?
>>> I would hope that there is only one gz file created. There is a line
>>>
>>> further up that tests exactly that.
>>>
>>> -   is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");
>> Let me amend that. The line should be instead:
>>
>> is (scalar(keys @gzip_wals), 1, "one gzip compressed WAL was created");
>>
>> To properly test that there is one entry.
>>
>> Let me provide with v2 to fix this.


The following tests are not correct in Perl even if Perl returns the
right value.

+    is (scalar(keys @gzip_wals), 1, "one gzip compressed WAL was created");


+    is (scalar(keys @gzip_partial_wals), 1,
+        "one partial gzip compressed WAL was created");


Function keys or values are used only with hashes but here you are using
arrays. To obtain the length of the array you can just use the scalar
function as Perl returns the length of the array when it is called in a
scalar context. Please use the following instead:


+    is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");


+    is (scalar(@gzip_partial_wals), 1,
+        "one partial gzip compressed WAL was created");


-- 
Gilles Darold
http://www.darold.net/






Re: Introduce pg_receivewal gzip compression tests

2021-07-12 Thread gkokolatos


‐‐‐ Original Message ‐‐‐

On Monday, July 12th, 2021 at 11:56,  wrote:

> ‐‐‐ Original Message ‐‐‐
>
> On Monday, July 12th, 2021 at 11:42, gkokola...@pm.me wrote:
>
> > ‐‐‐ Original Message ‐‐‐
> >
> > On Monday, July 12th, 2021 at 08:42, Michael Paquier mich...@paquier.xyz 
> > wrote:
> >
> > > On Fri, Jul 09, 2021 at 11:26:58AM +, Georgios wrote:
> > >
> > > > As suggested on a different thread [1], pg_receivewal can increase it's 
> > > > test
> > > >
> > > > coverage. There exists a non trivial amount of code that handles gzip
> > > >
> > > > compression. The current patch introduces tests that cover creation of 
> > > > gzip
> > > >
> > > > compressed WAL files and the handling of gzip partial segments. Finally 
> > > > the
> > > >
> > > > integrity of the compressed files is verified.
> > >
> > > - # Verify compressed file's integrity
> > >
> > >
> > > - my $gzip_is_valid = system_log('gzip', '--test', 
> > > $gzip_wals[0]);
> > >
> > >
> > > - is($gzip_is_valid, 0, "program gzip verified file's 
> > > integrity");
> > >
> > >
> > >
> > > libz and gzip are usually split across different packages, hence there
> > >
> > > is no guarantee that this command is always available (same comment as
> > >
> > > for LZ4 from a couple of days ago).
> >
> > Of course. Though while going for it, I did find in Makefile.global.in:
> >
> > TAR = @TAR@
> >
> > XGETTEXT = @XGETTEXT@
> >
> > GZIP = gzip
> >
> > BZIP2 = bzip2
> >
> > DOWNLOAD = wget -O $@ --no-use-server-timestamps
> >
> > Which is also used by GNUmakefile.in
> >
> > distcheck: dist
> >
> > rm -rf $(dummy)
> >
> > mkdir $(dummy)
> >
> > $(GZIP) -d -c $(distdir).tar.gz | $(TAR) xf -
> >
> > install_prefix=`cd $(dummy) && pwd`; \
> >
> > This to my understanding means that gzip is expected to exist.
> >
> > If this is correct, then simply checking for the headers should
> >
> > suffice, since that is the only dependency for the files to be
> >
> > created.
> >
> > If this is wrong, then I will add the discovery code as in the
> >
> > other patch.
> >
> > > - [
> > >
> > >
> > > - 'pg_receivewal', '-D', $stream_dir, 
> > > '--verbose',
> > >
> > >
> > > - '--endpos',  $nextlsn, '-Z', '5'
> > >
> > >
> > > - ],
> > >
> > >
> > >
> > > I would keep the compression level to a minimum here, to limit CPU
> > >
> > > usage but still compress something faster.
> > >
> > > - # Verify compressed file's integrity
> > >
> > >
> > > - my $gzip_is_valid = system_log('gzip', '--test', 
> > > $gzip_wals[0]);
> > >
> > >
> > > - is($gzip_is_valid, 0, "program gzip verified file's 
> > > integrity");
> > >
> > >
> > >
> > > Shouldn't this be coded as a loop going through @gzip_wals?
> >
> > I would hope that there is only one gz file created. There is a line
> >
> > further up that tests exactly that.
> >
> > -   is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");
>
> Let me amend that. The line should be instead:
>
> is (scalar(keys @gzip_wals), 1, "one gzip compressed WAL was created");
>
> To properly test that there is one entry.
>
> Let me provide with v2 to fix this.


Please find v2 attached with the above.

Cheers,
//Georgios

>
> Cheers,
>
> //Georgios
>
> > Then there should also be a partial gz file which is tested further 
> > ahead.
> >
> > Cheers,
> >
> > //Georgios
> >
> >
> > > Michael

v2-0001-Introduce-pg_receivewal-gzip-compression-tests.patch
Description: Binary data


Re: Introduce pg_receivewal gzip compression tests

2021-07-12 Thread gkokolatos



‐‐‐ Original Message ‐‐‐

On Monday, July 12th, 2021 at 11:42,  wrote:

> ‐‐‐ Original Message ‐‐‐
>
> On Monday, July 12th, 2021 at 08:42, Michael Paquier mich...@paquier.xyz 
> wrote:
>
> > On Fri, Jul 09, 2021 at 11:26:58AM +, Georgios wrote:
> >
> > > As suggested on a different thread [1], pg_receivewal can increase it's 
> > > test
> > >
> > > coverage. There exists a non trivial amount of code that handles gzip
> > >
> > > compression. The current patch introduces tests that cover creation of 
> > > gzip
> > >
> > > compressed WAL files and the handling of gzip partial segments. Finally 
> > > the
> > >
> > > integrity of the compressed files is verified.
> >
> > -   # Verify compressed file's integrity
> >
> >
> > -   my $gzip_is_valid = system_log('gzip', '--test', $gzip_wals[0]);
> >
> >
> > -   is($gzip_is_valid, 0, "program gzip verified file's integrity");
> >
> >
> >
> > libz and gzip are usually split across different packages, hence there
> >
> > is no guarantee that this command is always available (same comment as
> >
> > for LZ4 from a couple of days ago).
>
> Of course. Though while going for it, I did find in Makefile.global.in:
>
> TAR = @TAR@
>
> XGETTEXT = @XGETTEXT@
>
> GZIP = gzip
>
> BZIP2 = bzip2
>
> DOWNLOAD = wget -O $@ --no-use-server-timestamps
>
> Which is also used by GNUmakefile.in
>
> distcheck: dist
>
> rm -rf $(dummy)
>
> mkdir $(dummy)
>
> $(GZIP) -d -c $(distdir).tar.gz | $(TAR) xf -
>
> install_prefix=`cd $(dummy) && pwd`; \
>
> This to my understanding means that gzip is expected to exist.
>
> If this is correct, then simply checking for the headers should
>
> suffice, since that is the only dependency for the files to be
>
> created.
>
> If this is wrong, then I will add the discovery code as in the
>
> other patch.
>
> > -   [
> >
> >
> > -   'pg_receivewal', '-D', $stream_dir, 
> > '--verbose',
> >
> >
> > -   '--endpos',  $nextlsn, '-Z', '5'
> >
> >
> > -   ],
> >
> >
> >
> > I would keep the compression level to a minimum here, to limit CPU
> >
> > usage but still compress something faster.
> >
> > -   # Verify compressed file's integrity
> >
> >
> > -   my $gzip_is_valid = system_log('gzip', '--test', $gzip_wals[0]);
> >
> >
> > -   is($gzip_is_valid, 0, "program gzip verified file's integrity");
> >
> >
> >
> > Shouldn't this be coded as a loop going through @gzip_wals?
>
> I would hope that there is only one gz file created. There is a line
>
> further up that tests exactly that.
>
> -   is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");


Let me amend that. The line should be instead:

  is (scalar(keys @gzip_wals), 1, "one gzip compressed WAL was created");

To properly test that there is one entry.

Let me provide with v2 to fix this.

Cheers,
//Georgios

>
> Then there should also be a partial gz file which is tested further ahead.
>
> Cheers,
>
> //Georgios
>
> > Michael




Re: Introduce pg_receivewal gzip compression tests

2021-07-12 Thread gkokolatos



‐‐‐ Original Message ‐‐‐

On Monday, July 12th, 2021 at 08:42, Michael Paquier  
wrote:

> On Fri, Jul 09, 2021 at 11:26:58AM +, Georgios wrote:
>
> > As suggested on a different thread [1], pg_receivewal can increase it's test
> >
> > coverage. There exists a non trivial amount of code that handles gzip
> >
> > compression. The current patch introduces tests that cover creation of gzip
> >
> > compressed WAL files and the handling of gzip partial segments. Finally the
> >
> > integrity of the compressed files is verified.
>
> - # Verify compressed file's integrity
>
>
> - my $gzip_is_valid = system_log('gzip', '--test', $gzip_wals[0]);
>
>
> - is($gzip_is_valid, 0, "program gzip verified file's integrity");
>
>
>
> libz and gzip are usually split across different packages, hence there
>
> is no guarantee that this command is always available (same comment as
>
> for LZ4 from a couple of days ago).


Of course. Though while going for it, I did find in Makefile.global.in:

  TAR = @TAR@
  XGETTEXT = @XGETTEXT@

  GZIP= gzip
  BZIP2   = bzip2

  DOWNLOAD = wget -O $@ --no-use-server-timestamps

Which is also used by GNUmakefile.in

  distcheck: dist
  rm -rf $(dummy)
  mkdir $(dummy)
  $(GZIP) -d -c $(distdir).tar.gz | $(TAR) xf -
  install_prefix=`cd $(dummy) && pwd`; \


This to my understanding means that gzip is expected to exist.
If this is correct, then simply checking for the headers should
suffice, since that is the only dependency for the files to be
created.

If this is wrong, then I will add the discovery code as in the
other patch.

>
> - [
>
>
> - 'pg_receivewal', '-D', $stream_dir, '--verbose',
>
>
> - '--endpos',  $nextlsn, '-Z', '5'
>
>
> - ],
>
>
>
> I would keep the compression level to a minimum here, to limit CPU
>
> usage but still compress something faster.
>
> - # Verify compressed file's integrity
>
>
> - my $gzip_is_valid = system_log('gzip', '--test', $gzip_wals[0]);
>
>
> - is($gzip_is_valid, 0, "program gzip verified file's integrity");
>
>
>
> Shouldn't this be coded as a loop going through @gzip_wals?

I would hope that there is only one gz file created. There is a line
further up that tests exactly that.

+   is (scalar(@gzip_wals), 1, "one gzip compressed WAL was created");


Then there should also be a partial gz file which is tested further ahead.

Cheers,
//Georgios

> ---
>
> Michael




Re: Introduce pg_receivewal gzip compression tests

2021-07-12 Thread Michael Paquier
On Fri, Jul 09, 2021 at 11:26:58AM +, Georgios wrote:
> As suggested on a different thread [1], pg_receivewal can increase it's test
> coverage. There exists a non trivial amount of code that handles gzip
> compression. The current patch introduces tests that cover creation of gzip
> compressed WAL files and the handling of gzip partial segments. Finally the
> integrity of the compressed files is verified.

+   # Verify compressed file's integrity
+   my $gzip_is_valid = system_log('gzip', '--test', $gzip_wals[0]);
+   is($gzip_is_valid, 0, "program gzip verified file's integrity");
libz and gzip are usually split across different packages, hence there
is no guarantee that this command is always available (same comment as
for LZ4 from a couple of days ago).

+   [
+   'pg_receivewal', '-D', $stream_dir, '--verbose',
+   '--endpos',  $nextlsn, '-Z', '5'
+   ],
I would keep the compression level to a minimum here, to limit CPU
usage but still compress something faster.

+   # Verify compressed file's integrity
+   my $gzip_is_valid = system_log('gzip', '--test', $gzip_wals[0]);
+   is($gzip_is_valid, 0, "program gzip verified file's integrity");
Shouldn't this be coded as a loop going through @gzip_wals?
--
Michael


signature.asc
Description: PGP signature


Introduce pg_receivewal gzip compression tests

2021-07-09 Thread Georgios
Hi,

As suggested on a different thread [1], pg_receivewal can increase it's test
coverage. There exists a non trivial amount of code that handles gzip
compression. The current patch introduces tests that cover creation of gzip
compressed WAL files and the handling of gzip partial segments. Finally the
integrity of the compressed files is verified.

I hope you find this useful.

Cheers,
//Georgios

[1] 
https://www.postgresql.org/message-id/flat/ZCm1J5vfyQ2E6dYvXz8si39HQ2gwxSZ3IpYaVgYa3lUwY88SLapx9EEnOf5uEwrddhx2twG7zYKjVeuP5MwZXCNPybtsGouDsAD1o2L_I5E%3D%40pm.me

v1-0001-Introduce-pg_receivewal-gzip-compression-tests.patch
Description: Binary data