Re: Test to dump and restore objects left behind by regression

Ashutosh Bapat Thu, 11 Jul 2024 22:13:03 -0700

On Tue, Jul 9, 2024 at 1:07 PM Michael Paquier <[email protected]> wrote:
>
> On Mon, Jul 08, 2024 at 03:59:30PM +0530, Ashutosh Bapat wrote:
> > Before submitting the patch, I looked for all the places which mention
> > AdjustUpgrade or AdjustUpgrade.pm to find places where the new module needs
> > to be mentioned. But I didn't find any. AdjustUpgrade is not mentioned
> > in src/test/perl/Makefile or src/test/perl/meson.build. Do we want to also
> > add AdjustUpgrade.pm in those files?
>
> Good question.  This has not been mentioned on the thread that added
> the module:
> https://www.postgresql.org/message-id/891521.1673657296%40sss.pgh.pa.us
>
> And I could see it as being useful if installed.  The same applies to
> Kerberos.pm, actually.  I'll ping that on a new thread.


For now, it may be better to maintain status-quo. If we see a need to
use these modules in future by say extensions or tests outside core
tree, we will add them to meson and make files.

>
> > We could forget 0002. I am fine with that.  But I can change the code such
> > that formats other than "plain" are tested when PG_TEST_EXTRAS contains
> > "regress_dump_formats". Would that be acceptable?
>
> Interesting idea.  That may be acceptable, under the same arguments as
> the xid_wraparound one.

Done. Added a new entry in PG_TEST_EXTRA documentation.

I have merged the two patches now.


-- 
Best Wishes,
Ashutosh Bapat

From aff0939b8aa8566daecf1ac35b9c7fce9fa851ca Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Thu, 27 Jun 2024 10:03:53 +0530
Subject: [PATCH 2/2] Test pg_dump/restore of regression objects

002_pg_upgrade.pl tests pg_upgrade on the regression database left
behind by regression run. Modify it to test dump and restore of the
regression database as well.

Regression database created by regression run contains almost all the
database objects supported by PostgreSQL in various states. The test
thus covers wider dump and restore scenarios.

When PG_TEST_EXTRA has 'regress_dump_formats' in it, test dump and
restore in all supported formats. Otherwise test only "plain" format so
that the test finishes quickly.

Author: Ashutosh Bapat
Reviewed by: Michael Pacquire
Discussion: https://www.postgresql.org/message-id/CAExHW5uF5V=Cjecx3_Z=7xfh4rg2wf61pt+hfquzjbqourz...@mail.gmail.com
---
 doc/src/sgml/regress.sgml                   |  13 ++
 src/bin/pg_upgrade/t/002_pg_upgrade.pl      | 173 ++++++++++++++++++--
 src/test/perl/PostgreSQL/Test/AdjustDump.pm | 109 ++++++++++++
 3 files changed, 277 insertions(+), 18 deletions(-)
 create mode 100644 src/test/perl/PostgreSQL/Test/AdjustDump.pm

diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index d1042e0222..8c1a9ddc40 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -336,6 +336,19 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
       </para>
      </listitem>
     </varlistentry>
+
+    <varlistentry>
+     <term><literal>regress_dump_formats</literal></term>
+     <listitem>
+      <para>
+       When enabled,
+       <filename>src/bin/pg_upgrade/t/002_pg_upgrade.pl</filename> tests dump
+       and restore of regression database using all dump formats. Otherwise
+       tests only <literal>plain</literal> format. Not enabled by default
+       because it is resource intensive.
+      </para>
+     </listitem>
+    </varlistentry>
    </variablelist>
 
    Tests for features that are not supported by the current build
diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index 17af2ce61e..613512ffe7 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -13,6 +13,7 @@ use File::Path qw(rmtree);
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use PostgreSQL::Test::AdjustUpgrade;
+use PostgreSQL::Test::AdjustDump;
 use Test::More;
 
 # Can be changed to test the other modes.
@@ -36,9 +37,9 @@ sub generate_db
 		"created database with ASCII characters from $from_char to $to_char");
 }
 
-# Filter the contents of a dump before its use in a content comparison.
-# This returns the path to the filtered dump.
-sub filter_dump
+# Filter the contents of a dump before its use in a content comparison for
+# upgrade testing. This returns the path to the filtered dump.
+sub filter_dump_for_upgrade
 {
 	my ($is_old, $old_version, $dump_file) = @_;
 	my $dump_contents = slurp_file($dump_file);
@@ -61,6 +62,44 @@ sub filter_dump
 	return $dump_file_filtered;
 }
 
+# Test that the given two files match.  The files usually contain pg_dump
+# output in "plain" format. Output the difference if any.
+sub compare_dumps
+{
+	my ($dump1, $dump2, $testname) = @_;
+
+	my $compare_res = compare($dump1, $dump2);
+	is($compare_res, 0, $testname);
+
+	# Provide more context if the dumps do not match.
+	if ($compare_res != 0)
+	{
+		my ($stdout, $stderr) =
+		  run_command([ 'diff', '-u', $dump1, $dump2 ]);
+		print "=== diff of $dump1 and $dump2\n";
+		print "=== stdout ===\n";
+		print $stdout;
+		print "=== stderr ===\n";
+		print $stderr;
+		print "=== EOF ===\n";
+	}
+}
+
+# Adjust the contents of a dump before its use in a content comparison for dump
+# and restore testing. This returns the path to the adjusted dump.
+sub adjust_dump_for_restore
+{
+	my ($dump_file, $is_original) = @_;
+	my $dump_adjusted = "${dump_file}_adjusted";
+
+	open(my $dh, '>', $dump_adjusted)
+	  || die "opening $dump_adjusted ";
+	print $dh adjust_regress_dumpfile(slurp_file($dump_file), $is_original);
+	close($dh);
+
+	return $dump_adjusted;
+}
+
 # The test of pg_upgrade requires two clusters, an old one and a new one
 # that gets upgraded.  Before running the upgrade, a logical dump of the
 # old cluster is taken, and a second logical dump of the new one is taken
@@ -258,6 +297,12 @@ else
 		}
 	}
 	is($rc, 0, 'regression tests pass');
+
+	# Test dump/restore of the objects left behind by regression. Ideally it
+	# should be done in a separate test, but doing it here saves us one full
+	# regression run. Do this while the old cluster remains usable before
+	# upgrading it.
+	test_regression_dump_restore($oldnode, %node_params);
 }
 
 # Initialize a new node for the upgrade.
@@ -502,24 +547,116 @@ push(@dump_command, '--extra-float-digits', '0')
 $newnode->command_ok(\@dump_command, 'dump after running pg_upgrade');
 
 # Filter the contents of the dumps.
-my $dump1_filtered = filter_dump(1, $oldnode->pg_version, $dump1_file);
-my $dump2_filtered = filter_dump(0, $oldnode->pg_version, $dump2_file);
+my $dump1_filtered =
+  filter_dump_for_upgrade(1, $oldnode->pg_version, $dump1_file);
+my $dump2_filtered =
+  filter_dump_for_upgrade(0, $oldnode->pg_version, $dump2_file);
 
 # Compare the two dumps, there should be no differences.
-my $compare_res = compare($dump1_filtered, $dump2_filtered);
-is($compare_res, 0, 'old and new dumps match after pg_upgrade');
-
-# Provide more context if the dumps do not match.
-if ($compare_res != 0)
+compare_dumps($dump1_filtered, $dump2_filtered,
+	'old and new dumps match after pg_upgrade');
+
+# Test dump and restore of objects left behind regression run.
+#
+# It is expected that regression tests, which create `regression` database, are
+# run on `src_node`, which in turn is left in running state. The dump is
+# restored on a fresh node created using given `node_params`. Plain dumps from
+# both the nodes are compared to make sure that all the dumped objects are
+# restored faithfully.
+sub test_regression_dump_restore
 {
-	my ($stdout, $stderr) =
-	  run_command([ 'diff', '-u', $dump1_filtered, $dump2_filtered ]);
-	print "=== diff of $dump1_filtered and $dump2_filtered\n";
-	print "=== stdout ===\n";
-	print $stdout;
-	print "=== stderr ===\n";
-	print $stderr;
-	print "=== EOF ===\n";
+	my ($src_node, %node_params) = @_;
+	my $dst_node = PostgreSQL::Test::Cluster->new('dst_node');
+
+	# Dump the original database in "plain" format for comparison later. The
+	# order of columns in COPY statements dumped from the original database and
+	# that from the restored database differs. These differences are hard to
+	# adjust. Hence we compare only schema dumps for now.
+	my $src_dump_file = "$tempdir/src_dump.sql";
+	command_ok(
+		[
+			'pg_dump', '-s',
+			'--no-sync', '-d',
+			$src_node->connstr('regression'), '-f',
+			$src_dump_file
+		],
+		'pg_dump on original instance');
+	my $src_adjusted_dump = adjust_dump_for_restore($src_dump_file, 1);
+
+	# Setup destination database
+	$dst_node->init(%node_params);
+	$dst_node->start;
+
+	# Testing all dump formats takes longer. Do it only when explicitly
+	# requested.
+	my @formats;
+	if (   $ENV{PG_TEST_EXTRA}
+		&& $ENV{PG_TEST_EXTRA} =~ /\bregress_dump_formats\b/)
+	{
+		@formats = ('tar', 'directory', 'custom', 'plain');
+	}
+	else
+	{
+		@formats = ('plain');
+	}
+
+	for my $format (@formats)
+	{
+		my $dump_file = "$tempdir/regression_dump.$format";
+		my $format_spec = substr($format, 0, 1);
+		my $restored_db = 'regression_' . $format;
+
+		# Even though we compare only schema from the original and the restored
+		# database, we dump and restore data as well to catch any errors while
+		# doing so.
+		command_ok(
+			[
+				'pg_dump', "-F$format_spec", '--no-sync',
+				'-d', $src_node->connstr('regression'),
+				'-f', $dump_file
+			],
+			"pg_dump on source instance in '$format' format");
+
+		$dst_node->command_ok([ 'createdb', $restored_db ],
+			"created destination database '$restored_db'");
+
+		# Restore into destination database.
+		my @restore_command;
+		if ($format eq 'plain')
+		{
+			# Restore dump in "plain" format using `psql`.
+			@restore_command = [
+				'psql', '-d', $dst_node->connstr($restored_db),
+				'-f', $dump_file
+			];
+		}
+		else
+		{
+			@restore_command = [
+				'pg_restore', '-d',
+				$dst_node->connstr($restored_db), $dump_file
+			];
+		}
+		command_ok(@restore_command,
+			"pg_restore on destination instance in '$format' format");
+
+		# Dump restored database for comparison
+		my $dst_dump_file = "$tempdir/dest_dump.$format.sql";
+		command_ok(
+			[
+				'pg_dump', '-s',
+				'--no-sync', '-d',
+				$dst_node->connstr($restored_db), '-f',
+				$dst_dump_file
+			],
+			"pg_dump on instance restored with '$format' format");
+		my $dst_adjusted_dump = adjust_dump_for_restore($dst_dump_file, 0);
+
+		# Compare adjusted dumps, there should be no differences.
+		compare_dumps($src_adjusted_dump, $dst_adjusted_dump,
+			"dump outputs of original and restored regression database, using format '$format', match"
+		);
+	}
 }
 
 done_testing();
diff --git a/src/test/perl/PostgreSQL/Test/AdjustDump.pm b/src/test/perl/PostgreSQL/Test/AdjustDump.pm
new file mode 100644
index 0000000000..cd0516b58f
--- /dev/null
+++ b/src/test/perl/PostgreSQL/Test/AdjustDump.pm
@@ -0,0 +1,109 @@
+
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+=pod
+
+=head1 NAME
+
+PostgreSQL::Test::AdjustDump - helper module for dump and restore tests
+
+=head1 SYNOPSIS
+
+  use PostgreSQL::Test::AdjustDump;
+
+  # Adjust contents of dump output file so that dump output from original
+  # regression database and that from the restored regression database match
+  $dump = adjust_regress_dumpfile($dump, $original);
+
+=head1 DESCRIPTION
+
+C<PostgreSQL::Test::AdjustDump> encapsulates various hacks needed to
+compare the results of dump and retore tests
+
+=cut
+
+package PostgreSQL::Test::AdjustDump;
+
+use strict;
+use warnings FATAL => 'all';
+
+use Exporter 'import';
+
+our @EXPORT = qw(
+  adjust_regress_dumpfile
+);
+
+=pod
+
+=head1 ROUTINES
+
+=over
+
+=item $dump = adjust_regress_dumpfile($dump, $original)
+
+If we take dump of the regression database left behind after running regression
+tests, restore the dump, and take dump of the restored regression database, the
+outputs of both the dumps differ. Some regression tests purposefully create
+some child tables in such a way that their column orders differ from column
+orders of their respective parents. In the restored database, however, their
+column orders are same as that of their respective parents. Thus the column
+orders of these child tables in the original database and those in the restored
+database differ, causing difference in the dump outputs. See MergeAttributes()
+and dumpTableSchema() for details.
+
+This routine rearranges the column declarations in these C<CREATE TABLE ... INHERITS>
+statements in the dump file from original database to match that from the
+restored database.
+
+Additionally it adjusts blank and new lines to avoid noise.
+
+Arguments:
+
+=over
+
+=item C<dump>: Contents of dump file
+
+=item C<original>: 1 indicates that the given dump file is from the original
+database, else 0
+
+=back
+
+Returns the adjusted dump text.
+
+=cut
+
+sub adjust_regress_dumpfile
+{
+	my ($dump, $original) = @_;
+
+	# use Unix newlines
+	$dump =~ s/\r\n/\n/g;
+	# Suppress blank lines, as some places in pg_dump emit more or fewer.
+	$dump =~ s/\n\n+/\n/g;
+
+	# Adjust the CREATE TABLE ... INHERITS statements.
+	if ($original)
+	{
+		$dump =~ s/(^CREATE\sTABLE\spublic\.gtestxx_4\s\()
+				   (\n\s+b\sinteger),
+				   (\n\s+a\sinteger)/$1$3,$2/mgx;
+		$dump =~ s/(^CREATE\sTABLE\spublic\.test_type_diff2_c1\s\()
+				   (\n\s+int_four\sbigint),
+				   (\n\s+int_eight\sbigint),
+				   (\n\s+int_two\ssmallint)/$1$4,$2,$3/mgx;
+		$dump =~ s/(CREATE\sTABLE\spublic\.test_type_diff2_c2\s\()
+				   (\n\s+int_eight\sbigint),
+				   (\n\s+int_two\ssmallint),
+				   (\n\s+int_four\sbigint)/$1$3,$4,$2/mgx;
+	}
+
+	return $dump;
+}
+
+=pod
+
+=back
+
+=cut
+
+1;
-- 
2.34.1

Re: Test to dump and restore objects left behind by regression

Reply via email to