On Mon, Apr 11, 2022 at 12:46:02PM +0000, gkokola...@pm.me wrote:
> On Monday, April 11th, 2022 at 8:52 AM, Michael Paquier <mich...@paquier.xyz>
> wrote:
>> - 0001 is a simple rename of backup_compression.{c,h} to
>> compression.{c,h}, removing anything related to base backups from
>> that. One extra reason behind this renaming is that I would like to
>> use this infra for pg_dump, but that's material for 16~.
> 
> I agree with the design. If you permit me a couple of nitpicks regarding 
> naming.
> 
> +typedef enum pg_compress_algorithm
> +{
> +   PG_COMPRESSION_NONE,
> +   PG_COMPRESSION_GZIP,
> +   PG_COMPRESSION_LZ4,
> +   PG_COMPRESSION_ZSTD
> +} pg_compress_algorithm;
> 
> Elsewhere in the codebase, (e.g. create_table.sgml, alter_table.sgml,
> brin_tuple.c, detoast.c, toast_compression.c, tupdesc.c, gist.c to mention a
> few) variations of of the nomenclature "compression method" are used, like
> 'VARDATA_COMPRESSED_GET_COMPRESS_METHOD' or 'InvalidCompressionMethod' etc. I
> feel that it would be nicer if we followed one naming rule for this and I
> recommend to substitute algorithm for method throughout.

Technically and as far as I know, both are correct and hold more or
less the same meaning.  pg_basebackup's code exposes algorithm in a
more extended way, so I have just stuck to it for the internal
variables and such.  Perhaps we could rename the whole, but I see no
strong reason to do that.

> Last, even though it is not needed now, it will be helpful to have a
> PG_COMPRESSION_INVALID in some scenarios. Though we can add it when we come to
> it.

Perhaps.  There is no need for it yet, though.  pg_dump would not need
that, as well.

>> - 0002 removes WalCompressionMethod, replacing it by
>> pg_compress_algorithm as these are the same enums. Robert complained
>> about the confusion that WalCompressionMethod could lead to as this
>> could be used for the compression of data, and not only WAL. I have
>> renamed some variables to be more consistent, while on it.
> 
> It looks good. If you choose to discard the comment regarding the use of
> 'method' over 'algorithm' from above, can you please use the full word in the
> variable, e.g. 'wal_compress_algorithm' instead of 'wal_compress_algo'. I can
> not really explain it, the later reads a bit rude. Then again that may be just
> me.

Thanks.  I have been able to do an extra pass on 0001 and 0002, fixing
those naming inconsistencies with "algo" vs "algorithm" that you and
Robert have reported, and applied them.  For 0003, I'll look at it
later.  Attached is a rebase with improvements about the variable
names.
--
Michael
From 94850ac604402371135e85de27a5d9074a947d4b Mon Sep 17 00:00:00 2001
From: Michael Paquier <mich...@paquier.xyz>
Date: Tue, 12 Apr 2022 18:14:28 +0900
Subject: [PATCH v2] Rework compression options of pg_receivewal

Since babbbb5 and the introduction of LZ4 in pg_receivewal, the
compression of archived WAL is controlled by two options:
- --compression-method with "gzip", "none" or "lz4" as possible value.
- --compress=N to specify a compression level, with a
backward-incompatible change where a value of zero leads to a failure
instead of no compression.

This commit takes advantage of the previous refactoring done to rework
the compression options of pg_receivewal:
- --compression-method is removed.
- --compress is extended to use the same grammar as pg_basebackup, as of
a METHOD:DETAIL specification, where a METHOD is "gzip", "none" or "lz4"
and a DETAIL is a comma-separated list of options, the only keyword
supported is now "level" to control the compression level.  If only an
integer is specified as value of this option, "none" is implied for 0
and "gzip" is implied.  So this brings back --compress to be
backward-compatible with ~14, while still supporting LZ4.

This has the advantage of centralizing the set of checks used by
pg_basebackup.
---
 src/bin/pg_basebackup/pg_receivewal.c        | 131 +++++++++++++------
 src/bin/pg_basebackup/t/020_pg_receivewal.pl |  16 +--
 doc/src/sgml/ref/pg_receivewal.sgml          |  47 ++++---
 3 files changed, 121 insertions(+), 73 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c
index ede1d4648d..f942883332 100644
--- a/src/bin/pg_basebackup/pg_receivewal.c
+++ b/src/bin/pg_basebackup/pg_receivewal.c
@@ -57,6 +57,8 @@ static XLogRecPtr endpos = InvalidXLogRecPtr;
 
 
 static void usage(void);
+static void parse_compress_options(char *option, char **algorithm,
+								   char **detail);
 static DIR *get_destination_dir(char *dest_folder);
 static void close_destination_dir(DIR *dest_dir, char *dest_folder);
 static XLogRecPtr FindStreamingStart(uint32 *tli);
@@ -90,9 +92,8 @@ usage(void)
 	printf(_("      --synchronous      flush write-ahead log immediately after writing\n"));
 	printf(_("  -v, --verbose          output verbose messages\n"));
 	printf(_("  -V, --version          output version information, then exit\n"));
-	printf(_("      --compression-method=METHOD\n"
-			 "                         method to compress logs\n"));
-	printf(_("  -Z, --compress=1-9     compress logs with given compression level\n"));
+	printf(_("  -Z, --compress=METHOD[:DETAIL]\n"
+			 "                         compress as specified\n"));
 	printf(_("  -?, --help             show this help, then exit\n"));
 	printf(_("\nConnection options:\n"));
 	printf(_("  -d, --dbname=CONNSTR   connection string\n"));
@@ -108,6 +109,66 @@ usage(void)
 	printf(_("%s home page: <%s>\n"), PACKAGE_NAME, PACKAGE_URL);
 }
 
+/*
+ * Basic parsing of a value specified for -Z/--compress
+ *
+ * The parsing consists of a METHOD:DETAIL string fed later on to a more
+ * advanced routine in charge of proper validation checks.  This only extracts
+ * METHOD and DETAIL.  If only an integer is found, the method is implied by
+ * the value specified.
+ */
+static void
+parse_compress_options(char *option, char **algorithm, char **detail)
+{
+	char	   *sep;
+	char	   *endp;
+	long		result;
+
+	/*
+	 * Check whether the compression specification consists of a bare integer.
+	 *
+	 * For backward-compatibility, assume "none" if the integer found is zero
+	 * and "gzip" otherwise.
+	 */
+	result = strtol(option, &endp, 10);
+	if (*endp == '\0')
+	{
+		if (result == 0)
+		{
+			*algorithm = pstrdup("none");
+			*detail = NULL;
+		}
+		else
+		{
+			*algorithm = pstrdup("gzip");
+			*detail = pstrdup(option);
+		}
+		return;
+	}
+
+	/*
+	 * Check whether there is a compression detail following the algorithm
+	 * name.
+	 */
+	sep = strchr(option, ':');
+	if (sep == NULL)
+	{
+		*algorithm = pstrdup(option);
+		*detail = NULL;
+	}
+	else
+	{
+		char   *alg;
+
+		alg = palloc((sep - option) + 1);
+		memcpy(alg, option, sep - option);
+		alg[sep - option] = '\0';
+
+		*algorithm = alg;
+		*detail = pstrdup(sep + 1);
+	}
+}
+
 /*
  * Check if the filename looks like a WAL file, letting caller know if this
  * WAL segment is partial and/or compressed.
@@ -651,7 +712,6 @@ main(int argc, char **argv)
 		{"if-not-exists", no_argument, NULL, 3},
 		{"synchronous", no_argument, NULL, 4},
 		{"no-sync", no_argument, NULL, 5},
-		{"compression-method", required_argument, NULL, 6},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -660,6 +720,10 @@ main(int argc, char **argv)
 	char	   *db_name;
 	uint32		hi,
 				lo;
+	pg_compress_specification compression_spec;
+	char	   *compression_detail = NULL;
+	char	   *compression_algorithm_str = "none";
+	char	   *error_detail = NULL;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -728,9 +792,8 @@ main(int argc, char **argv)
 				verbose++;
 				break;
 			case 'Z':
-				if (!option_parse_int(optarg, "-Z/--compress", 1, 9,
-									  &compresslevel))
-					exit(1);
+				parse_compress_options(optarg, &compression_algorithm_str,
+									   &compression_detail);
 				break;
 /* action */
 			case 1:
@@ -748,17 +811,6 @@ main(int argc, char **argv)
 			case 5:
 				do_sync = false;
 				break;
-			case 6:
-				if (pg_strcasecmp(optarg, "gzip") == 0)
-					compression_algorithm = PG_COMPRESSION_GZIP;
-				else if (pg_strcasecmp(optarg, "lz4") == 0)
-					compression_algorithm = PG_COMPRESSION_LZ4;
-				else if (pg_strcasecmp(optarg, "none") == 0)
-					compression_algorithm = PG_COMPRESSION_NONE;
-				else
-					pg_fatal("invalid value \"%s\" for option %s",
-							 optarg, "--compression-method");
-				break;
 			default:
 				/* getopt_long already emitted a complaint */
 				pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -810,24 +862,30 @@ main(int argc, char **argv)
 		exit(1);
 	}
 
-
 	/*
-	 * Compression-related options.
+	 * Compression options
 	 */
+	if (!parse_compress_algorithm(compression_algorithm_str,
+								  &compression_algorithm))
+		pg_fatal("unrecognized compression algorithm \"%s\"",
+				 compression_algorithm_str);
+
+	parse_compress_specification(compression_algorithm, compression_detail,
+								 &compression_spec);
+	error_detail = validate_compress_specification(&compression_spec);
+	if (error_detail != NULL)
+		pg_fatal("invalid compression specification: %s",
+				 error_detail);
+
+	/* Extract the compression level, if found in the specification */
+	if ((compression_spec.options & PG_COMPRESSION_OPTION_LEVEL) != 0)
+		compresslevel = compression_spec.level;
+
 	switch (compression_algorithm)
 	{
-		case PG_COMPRESSION_NONE:
-			if (compresslevel != 0)
-			{
-				pg_log_error("cannot use --compress with --compression-method=%s",
-							 "none");
-				pg_log_error_hint("Try \"%s --help\" for more information.", progname);
-				exit(1);
-			}
-			break;
 		case PG_COMPRESSION_GZIP:
 #ifdef HAVE_LIBZ
-			if (compresslevel == 0)
+			if ((compression_spec.options & PG_COMPRESSION_OPTION_LEVEL) == 0)
 			{
 				pg_log_info("no value specified for --compress, switching to default");
 				compresslevel = Z_DEFAULT_COMPRESSION;
@@ -837,16 +895,11 @@ main(int argc, char **argv)
 					 "gzip");
 #endif
 			break;
+		case PG_COMPRESSION_NONE:
+			/* nothing to do */
+			break;
 		case PG_COMPRESSION_LZ4:
-#ifdef USE_LZ4
-			if (compresslevel != 0)
-			{
-				pg_log_error("cannot use --compress with --compression-method=%s",
-							 "lz4");
-				pg_log_error_hint("Try \"%s --help\" for more information.", progname);
-				exit(1);
-			}
-#else
+#ifndef USE_LZ4
 			pg_fatal("this build does not support compression with %s",
 					 "LZ4");
 #endif
diff --git a/src/bin/pg_basebackup/t/020_pg_receivewal.pl b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
index 8c38816b22..57e274cd34 100644
--- a/src/bin/pg_basebackup/t/020_pg_receivewal.pl
+++ b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
@@ -35,11 +35,10 @@ $primary->command_fails(
 	'failure if --synchronous specified with --no-sync');
 $primary->command_fails_like(
 	[
-		'pg_receivewal', '-D', $stream_dir, '--compression-method', 'none',
-		'--compress',    '1'
+		'pg_receivewal', '-D', $stream_dir, '--compress', 'none:1',
 	],
-	qr/\Qpg_receivewal: error: cannot use --compress with --compression-method=none/,
-	'failure if --compress specified with --compression-method=none');
+	qr/\Qpg_receivewal: error: invalid compression specification: compression algorithm "none" does not accept a compression level/,
+	'failure if --compress none:N (where N > 0)');
 
 # Slot creation and drop
 my $slot_name = 'test';
@@ -93,15 +92,12 @@ SKIP:
 	chomp($nextlsn);
 	$primary->psql('postgres', 'INSERT INTO test_table VALUES (2);');
 
-	# Note the trailing whitespace after the value of --compress, that is
-	# a valid value.
 	$primary->command_ok(
 		[
 			'pg_receivewal',        '-D',
 			$stream_dir,            '--verbose',
 			'--endpos',             $nextlsn,
-			'--compression-method', 'gzip',
-			'--compress',           '1 ',
+			'--compress',           'gzip:1',
 			'--no-loop'
 		],
 		"streaming some WAL using ZLIB compression");
@@ -156,10 +152,10 @@ SKIP:
 			'pg_receivewal', '-D',
 			$stream_dir,     '--verbose',
 			'--endpos',      $nextlsn,
-			'--no-loop',     '--compression-method',
+			'--no-loop',     '--compress',
 			'lz4'
 		],
-		'streaming some WAL using --compression-method=lz4');
+		'streaming some WAL using --compress=lz4');
 
 	# Verify that the stored files are generated with their expected
 	# names.
diff --git a/doc/src/sgml/ref/pg_receivewal.sgml b/doc/src/sgml/ref/pg_receivewal.sgml
index b846213fb7..4fe9e1a874 100644
--- a/doc/src/sgml/ref/pg_receivewal.sgml
+++ b/doc/src/sgml/ref/pg_receivewal.sgml
@@ -263,15 +263,32 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
-      <term><option>--compression-method=<replaceable class="parameter">method</replaceable></option></term>
+      <term><option>-Z <replaceable class="parameter">level</replaceable></option></term>
+      <term><option>-Z <replaceable class="parameter">method</replaceable>[:<replaceable>detail</replaceable>]</option></term>
+      <term><option>--compress=<replaceable class="parameter">level</replaceable></option></term>
+      <term><option>--compress=<replaceable class="parameter">method</replaceable>[:<replaceable>detail</replaceable>]</option></term>
       <listitem>
        <para>
-        Enables compression of write-ahead logs using the specified method.
-        Supported values are <literal>gzip</literal>, <literal>lz4</literal>
-        (if <productname>PostgreSQL</productname> was compiled with
-        <option>--with-lz4</option>), and <literal>none</literal>.
+        Enables compression of write-ahead logs.
+       </para>
+       <para>
+        The compression method can be set to <literal>gzip</literal>,
+        <literal>lz4</literal> (if <productname>PostgreSQL</productname>
+        was compiled with <option>--with-lz4</option>) or
+        <literal>none</literal> for no compression.
+        A compression detail string can optionally be specified.  If the
+        detail string is an integer, it specifies the compression level.
+        Otherwise, it should be a comma-separated list of items, each of the
+        form <literal>keyword</literal> or <literal>keyword=value</literal>.
+        Currently, the only supported keyword is <literal>level</literal>.
+       </para>
+       <para>
+        If no compression level is specified, the default compression level
+        will be used. If only a level is specified without mentioning an
+        algorithm, <literal>gzip</literal> compression will be used if the
+        level is greater than 0, and no compression will be used if the level
+        is 0.
        </para>
-
        <para>
         The suffix <filename>.gz</filename> will automatically be added to
         all filenames when using <literal>gzip</literal>, and the suffix
@@ -279,24 +296,6 @@ PostgreSQL documentation
        </para>
       </listitem>
      </varlistentry>
-
-     <varlistentry>
-      <term><option>-Z <replaceable class="parameter">level</replaceable></option></term>
-      <term><option>--compress=<replaceable class="parameter">level</replaceable></option></term>
-      <listitem>
-       <para>
-        Specifies the compression level (<literal>1</literal> through
-        <literal>9</literal>, <literal>1</literal> being worst compression
-        and <literal>9</literal> being best compression) for WAL segments
-        compressed with <application>gzip</application>.
-       </para>
-
-       <para>
-        This option requires <option>--compression-method</option> to be
-        specified with <literal>gzip</literal>.
-       </para>
-      </listitem>
-     </varlistentry>
     </variablelist>
 
    <para>
-- 
2.35.1

Attachment: signature.asc
Description: PGP signature

Reply via email to