On Mon, Mar 24, 2025 at 5:44 PM Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote:
>
> On 2025-Mar-24, Ashutosh Bapat wrote:
>
> > One concern I have with directory format is the dumped database is not
> > readable. This might make investigating a but identified the test a
> > bit more complex.
>
> Oh, it's readable all right.  You just need to use `pg_restore -f-` to
> read it.  No big deal.
>
>
> So I ran this a few times:
> /usr/bin/time make -j8 -Otarget -C /pgsql/build/master check-world -s 
> PROVE_FLAGS="-c -j6" > /dev/null
>
> commenting out the call to test_regression_dump_restore() to test how
> much additional runtime does the new test incur.
>
> With test:
>
> 136.95user 116.56system 1:13.23elapsed 346%CPU (0avgtext+0avgdata 
> 250704maxresident)k
> 4928inputs+55333008outputs (114major+14784937minor)pagefaults 0swaps
>
> 138.11user 117.43system 1:15.54elapsed 338%CPU (0avgtext+0avgdata 
> 278592maxresident)k
> 48inputs+55333464outputs (80major+14794494minor)pagefaults 0swaps
>
> 137.05user 113.13system 1:08.19elapsed 366%CPU (0avgtext+0avgdata 
> 279272maxresident)k
> 48inputs+55330064outputs (83major+14758028minor)pagefaults 0swaps
>
> without the new test:
>
> 135.46user 114.55system 1:14.69elapsed 334%CPU (0avgtext+0avgdata 
> 145372maxresident)k
> 32inputs+55155256outputs (105major+14737549minor)pagefaults 0swaps
>
> 135.48user 114.57system 1:09.60elapsed 359%CPU (0avgtext+0avgdata 
> 148224maxresident)k
> 16inputs+55155432outputs (95major+14749502minor)pagefaults 0swaps
>
> 133.76user 113.26system 1:14.92elapsed 329%CPU (0avgtext+0avgdata 
> 148064maxresident)k
> 48inputs+55154952outputs (84major+14749531minor)pagefaults 0swaps
>
> 134.06user 113.83system 1:16.09elapsed 325%CPU (0avgtext+0avgdata 
> 145940maxresident)k
> 32inputs+55155032outputs (83major+14738602minor)pagefaults 0swaps
>
> The increase in duration here is less than a second.
>
>
> My conclusion with these numbers is that it's not worth hiding this test
> in PG_TEST_EXTRA.

Thanks for the conclusion.

On Mon, Mar 24, 2025 at 3:29 PM Daniel Gustafsson <dan...@yesql.se> wrote:
>
> > On 24 Mar 2025, at 10:54, Ashutosh Bapat <ashutosh.bapat....@gmail.com> 
> > wrote:
>
> > 0003 - same as 0002 in the previous patch set. It excludes statistics
> > from comparison, otherwise the test will fail because of bug reported
> > at [1]. Ideally we shouldn't commit this patch so as to test
> > statistics dump and restore, but in case we need the test to pass till
> > the bug is fixed, we should merge this patch to 0001 before
> > committing.
>
> If the reported bug isn't fixed before feature freeze I think we should commit
> this regardless as it has clearly shown value by finding bugs (though perhaps
> under PG_TEST_EXTRA or in some disconnected till the bug is fixed to limit the
> blast-radius in the buildfarm).

Combining Alvaro's and Daniel's recommendations, I think we should
squash all the three of my patches while committing the test if the
bug is not fixed by then. Otherwise we should squash first two patches
and commit it. Just attaching the patches again for reference.

> If we really wanted to save some total test runtime,
> it might be better to write a regress schedule file for
> 027_stream_regress.pl which only takes the test that emit useful WAL,
> rather than all tests.

That's out of scope for this patch, but it seems like an idea worth exploring.

-- 
Best Wishes,
Ashutosh Bapat
From f26f88364a196dc9589ca451cb54f5e514e3422e Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.bapat....@gmail.com>
Date: Mon, 24 Mar 2025 11:21:12 +0530
Subject: [PATCH 2/3] Use only one format and make the test run default

According to Alvaro (and I agree with him), the test should be run by
default. Otherwise we get to know about a bug only after buildfarm
animal where it's enabled reports a failure. Further testing only one
format may suffice; since all the formats have shown the same bugs till
now.

If we use --directory format we can use -j which reduces the time taken
by dump/restore test by about 12%.

This patch removes PG_TEST_EXTRA option as well as runs the test only in
directory format with parallelism enabled.

Note for committer: If we decide to accept this change, it should be
merged with the previous commit.
---
 doc/src/sgml/regress.sgml              | 12 ----
 src/bin/pg_upgrade/t/002_pg_upgrade.pl | 76 +++++++++-----------------
 2 files changed, 25 insertions(+), 63 deletions(-)

diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 237b974b3ab..0e5e8e8f309 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -357,18 +357,6 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
       </para>
      </listitem>
     </varlistentry>
-
-    <varlistentry>
-     <term><literal>regress_dump_test</literal></term>
-     <listitem>
-      <para>
-       When enabled, <filename>src/bin/pg_upgrade/t/002_pg_upgrade.pl</filename>
-       tests dump and restore of regression database left behind by the
-       regression run. Not enabled by default because it is time and resource
-       consuming.
-      </para>
-     </listitem>
-    </varlistentry>
    </variablelist>
 
    Tests for features that are not supported by the current build
diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index d08eea6693f..cbd9831bf9e 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -268,16 +268,9 @@ else
 	# should be done in a separate TAP test, but doing it here saves us one full
 	# regression run.
 	#
-	# This step takes several extra seconds and some extra disk space, so
-	# requires an opt-in with the PG_TEST_EXTRA environment variable.
-	#
 	# Do this while the old cluster is running before it is shut down by the
 	# upgrade test.
-	if (   $ENV{PG_TEST_EXTRA}
-		&& $ENV{PG_TEST_EXTRA} =~ /\bregress_dump_test\b/)
-	{
-		test_regression_dump_restore($oldnode, %node_params);
-	}
+	test_regression_dump_restore($oldnode, %node_params);
 }
 
 # Initialize a new node for the upgrade.
@@ -590,53 +583,34 @@ sub test_regression_dump_restore
 	$dst_node->append_conf('postgresql.conf', 'autovacuum = off');
 	$dst_node->start;
 
-	# Test all formats one by one.
-	for my $format ('plain', 'tar', 'directory', 'custom')
-	{
-		my $dump_file = "$tempdir/regression_dump.$format";
-		my $restored_db = 'regression_' . $format;
-
-		# Use --create in dump and restore commands so that the restored
-		# database has the same configurable variable settings as the original
-		# database and the plain dumps taken for comparsion do not differ
-		# because of locale changes. Additionally this provides test coverage
-		# for --create option.
-		$src_node->command_ok(
-			[
-				'pg_dump', "-F$format", '--no-sync',
-				'-d', $src_node->connstr('regression'),
-				'--create', '-f', $dump_file
-			],
-			"pg_dump on source instance in $format format");
+	my $dump_file = "$tempdir/regression.dump";
 
-		my @restore_command;
-		if ($format eq 'plain')
-		{
-			# Restore dump in "plain" format using `psql`.
-			@restore_command = [ 'psql', '-d', 'postgres', '-f', $dump_file ];
-		}
-		else
-		{
-			@restore_command = [
-				'pg_restore', '--create',
-				'-d', 'postgres', $dump_file
-			];
-		}
-		$dst_node->command_ok(@restore_command,
-			"restored dump taken in $format format on destination instance");
+	# Use --create in dump and restore commands so that the restored database
+	# has the same configurable variable settings as the original database so
+	# that the plain dumps taken from both the database taken for comparisong do
+	# not differ because of locale changes. Additionally this provides test
+	# coverage for --create option.
+	#
+	# We use directory format which allows dumping and restoring in parallel to
+	# reduce the test's run time.
+	$src_node->command_ok(
+		[
+			'pg_dump', '-Fd', '-j2', '--no-sync',
+			'-d', $src_node->connstr('regression'),
+			'--create', '-f', $dump_file
+		],
+		"pg_dump on source instance succeeded");
 
-		my $dst_dump =
-		  get_dump_for_comparison($dst_node, 'regression',
-			'dest_dump.' . $format, 0);
+	$dst_node->command_ok(
+		[ 'pg_restore', '--create', '-j2', '-d', 'postgres', $dump_file ],
+		"restored dump to destination instance");
 
-		compare_files($src_dump, $dst_dump,
-			"dump outputs from original and restored regression database (using $format format) match"
-		);
+	my $dst_dump = get_dump_for_comparison($dst_node, 'regression',
+		'dest_dump', 0);
 
-		# Rename the restored database so that it is available for debugging in
-		# case the test fails.
-		$dst_node->safe_psql('postgres', "ALTER DATABASE regression RENAME TO $restored_db");
-	}
+	compare_files($src_dump, $dst_dump,
+			"dump outputs from original and restored regression database match"
+		);
 }
 
 # Dump database `db` from the given `node` in plain format and adjust it for
-- 
2.34.1

From fcfd0d25ecd374d55970817b4d3ea2aecdd58251 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.ba...@enterprisedb.com>
Date: Thu, 27 Jun 2024 10:03:53 +0530
Subject: [PATCH 1/3] Test pg_dump/restore of regression objects

002_pg_upgrade.pl tests pg_upgrade of the regression database left
behind by regression run. Modify it to test dump and restore of the
regression database as well.

Regression database created by regression run contains almost all the
database objects supported by PostgreSQL in various states. Hence the
new testcase covers dump and restore scenarios not covered by individual
dump/restore cases. Till now 002_pg_upgrade only tested dump/restore
through pg_upgrade which only uses binary mode. Many regression tests
mention that they leave objects behind for dump/restore testing but they
are not tested in a non-binary mode. The new testcase closes that
gap.

Testing dump and restore of regression database makes this test run
longer for a relatively smaller benefit. Hence run it only when
explicitly requested by user by specifying "regress_dump_test" in
PG_TEST_EXTRA.

Note For the reviewers:
The new test has uncovered many bugs so far in one year.
1. Introduced by 14e87ffa5c54. Fixed in fd41ba93e4630921a72ed5127cd0d552a8f3f8fc.
2. Introduced by 0413a556990ba628a3de8a0b58be020fd9a14ed0. Reverted in 74563f6b90216180fc13649725179fc119dddeb5.
3. Fixed by d611f8b1587b8f30caa7c0da99ae5d28e914d54f
3. Being discussed on hackers at https://www.postgresql.org/message-id/caexhw5s47kmubpbbrjzsm-zfe0tj2o3gbagb7yaye8rq-v2...@mail.gmail.com

Author: Ashutosh Bapat
Reviewed by: Michael Pacquire, Daniel Gustafsson, Tom Lane, Alvaro Herrera
Discussion: https://www.postgresql.org/message-id/CAExHW5uF5V=Cjecx3_Z=7xfh4rg2wf61pt+hfquzjbqourz...@mail.gmail.com
---
 doc/src/sgml/regress.sgml                   |  12 ++
 src/bin/pg_upgrade/t/002_pg_upgrade.pl      | 144 ++++++++++++++++-
 src/test/perl/Makefile                      |   2 +
 src/test/perl/PostgreSQL/Test/AdjustDump.pm | 167 ++++++++++++++++++++
 src/test/perl/meson.build                   |   1 +
 5 files changed, 324 insertions(+), 2 deletions(-)
 create mode 100644 src/test/perl/PostgreSQL/Test/AdjustDump.pm

diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 0e5e8e8f309..237b974b3ab 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -357,6 +357,18 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
       </para>
      </listitem>
     </varlistentry>
+
+    <varlistentry>
+     <term><literal>regress_dump_test</literal></term>
+     <listitem>
+      <para>
+       When enabled, <filename>src/bin/pg_upgrade/t/002_pg_upgrade.pl</filename>
+       tests dump and restore of regression database left behind by the
+       regression run. Not enabled by default because it is time and resource
+       consuming.
+      </para>
+     </listitem>
+    </varlistentry>
    </variablelist>
 
    Tests for features that are not supported by the current build
diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index 00051b85035..d08eea6693f 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -12,6 +12,7 @@ use File::Path     qw(rmtree);
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use PostgreSQL::Test::AdjustUpgrade;
+use PostgreSQL::Test::AdjustDump;
 use Test::More;
 
 # Can be changed to test the other modes.
@@ -35,8 +36,8 @@ sub generate_db
 		"created database with ASCII characters from $from_char to $to_char");
 }
 
-# Filter the contents of a dump before its use in a content comparison.
-# This returns the path to the filtered dump.
+# Filter the contents of a dump before its use in a content comparison for
+# upgrade testing. This returns the path to the filtered dump.
 sub filter_dump
 {
 	my ($is_old, $old_version, $dump_file) = @_;
@@ -262,6 +263,21 @@ else
 		}
 	}
 	is($rc, 0, 'regression tests pass');
+
+	# Test dump/restore of the objects left behind by regression. Ideally it
+	# should be done in a separate TAP test, but doing it here saves us one full
+	# regression run.
+	#
+	# This step takes several extra seconds and some extra disk space, so
+	# requires an opt-in with the PG_TEST_EXTRA environment variable.
+	#
+	# Do this while the old cluster is running before it is shut down by the
+	# upgrade test.
+	if (   $ENV{PG_TEST_EXTRA}
+		&& $ENV{PG_TEST_EXTRA} =~ /\bregress_dump_test\b/)
+	{
+		test_regression_dump_restore($oldnode, %node_params);
+	}
 }
 
 # Initialize a new node for the upgrade.
@@ -539,4 +555,128 @@ my $dump2_filtered = filter_dump(0, $oldnode->pg_version, $dump2_file);
 compare_files($dump1_filtered, $dump2_filtered,
 	'old and new dumps match after pg_upgrade');
 
+# Test dump and restore of objects left behind by the regression run.
+#
+# It is expected that regression tests, which create `regression` database, are
+# run on `src_node`, which in turn, is left in running state. A fresh node is
+# created using given `node_params`, which are expected to be the same ones used
+# to create `src_node`, so as to avoid any differences in the databases.
+#
+# Plain dumps from both the nodes are compared to make sure that all the dumped
+# objects are restored faithfully.
+sub test_regression_dump_restore
+{
+	my ($src_node, %node_params) = @_;
+	my $dst_node = PostgreSQL::Test::Cluster->new('dst_node');
+
+	# Make sure that the source and destination nodes have the same version and
+	# do not use custom install paths. In both the cases, the dump files may
+	# require additional adjustments unknown to code here. Do not run this test
+	# in such a case to avoid utilizing the time and resources unnecessarily.
+	if ($src_node->pg_version != $dst_node->pg_version
+		or defined $src_node->{_install_path})
+	{
+		fail("same version dump and restore test using default installation");
+		return;
+	}
+
+	# Dump the original database for comparison later.
+	my $src_dump =
+	  get_dump_for_comparison($src_node, 'regression', 'src_dump', 1);
+
+	# Setup destination database cluster
+	$dst_node->init(%node_params);
+	# Stabilize stats for comparison.
+	$dst_node->append_conf('postgresql.conf', 'autovacuum = off');
+	$dst_node->start;
+
+	# Test all formats one by one.
+	for my $format ('plain', 'tar', 'directory', 'custom')
+	{
+		my $dump_file = "$tempdir/regression_dump.$format";
+		my $restored_db = 'regression_' . $format;
+
+		# Use --create in dump and restore commands so that the restored
+		# database has the same configurable variable settings as the original
+		# database and the plain dumps taken for comparsion do not differ
+		# because of locale changes. Additionally this provides test coverage
+		# for --create option.
+		$src_node->command_ok(
+			[
+				'pg_dump', "-F$format", '--no-sync',
+				'-d', $src_node->connstr('regression'),
+				'--create', '-f', $dump_file
+			],
+			"pg_dump on source instance in $format format");
+
+		my @restore_command;
+		if ($format eq 'plain')
+		{
+			# Restore dump in "plain" format using `psql`.
+			@restore_command = [ 'psql', '-d', 'postgres', '-f', $dump_file ];
+		}
+		else
+		{
+			@restore_command = [
+				'pg_restore', '--create',
+				'-d', 'postgres', $dump_file
+			];
+		}
+		$dst_node->command_ok(@restore_command,
+			"restored dump taken in $format format on destination instance");
+
+		my $dst_dump =
+		  get_dump_for_comparison($dst_node, 'regression',
+			'dest_dump.' . $format, 0);
+
+		compare_files($src_dump, $dst_dump,
+			"dump outputs from original and restored regression database (using $format format) match"
+		);
+
+		# Rename the restored database so that it is available for debugging in
+		# case the test fails.
+		$dst_node->safe_psql('postgres', "ALTER DATABASE regression RENAME TO $restored_db");
+	}
+}
+
+# Dump database `db` from the given `node` in plain format and adjust it for
+# comparing dumps from the original and the restored database.
+#
+# `file_prefix` is used to create unique names for all dump files so that they
+# remain available for debugging in case the test fails.
+#
+# `adjust_child_columns` is passed to adjust_regress_dumpfile() which actually
+# adjusts the dump output.
+#
+# The name of the file containting adjusted dump is returned.
+sub get_dump_for_comparison
+{
+	my ($node, $db, $file_prefix, $adjust_child_columns) = @_;
+
+	my $dumpfile = $tempdir . '/' . $file_prefix . '.sql';
+	my $dump_adjusted = "${dumpfile}_adjusted";
+
+	# Usually we avoid comparing statistics in our tests since it is flaky by
+	# nature. However, if statistics is dumped and restored it is expected to be
+	# restored as it is i.e. the statistics from the original database and that
+	# from the restored database should match. We turn off autovacuum on the
+	# source and the target database to avoid any statistics update during
+	# restore operation. Hence we do not exclude statistics from dump.
+	$node->command_ok(
+		[
+			'pg_dump', '--no-sync', '-d', $node->connstr($db), '-f',
+			$dumpfile
+		],
+		'dump for comparison succeeded');
+
+	open(my $dh, '>', $dump_adjusted)
+	  || die
+	  "could not open $dump_adjusted for writing the adjusted dump: $!";
+	print $dh adjust_regress_dumpfile(slurp_file($dumpfile),
+		$adjust_child_columns);
+	close($dh);
+
+	return $dump_adjusted;
+}
+
 done_testing();
diff --git a/src/test/perl/Makefile b/src/test/perl/Makefile
index d82fb67540e..def89650ead 100644
--- a/src/test/perl/Makefile
+++ b/src/test/perl/Makefile
@@ -26,6 +26,7 @@ install: all installdirs
 	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/Cluster.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/Cluster.pm'
 	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/BackgroundPsql.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/BackgroundPsql.pm'
 	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/AdjustUpgrade.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/AdjustUpgrade.pm'
+	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/AdjustDump.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/AdjustDump.pm'
 	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Version.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Version.pm'
 
 uninstall:
@@ -36,6 +37,7 @@ uninstall:
 	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/Cluster.pm'
 	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/BackgroundPsql.pm'
 	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/AdjustUpgrade.pm'
+	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/AdjustDump.pm'
 	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Version.pm'
 
 endif
diff --git a/src/test/perl/PostgreSQL/Test/AdjustDump.pm b/src/test/perl/PostgreSQL/Test/AdjustDump.pm
new file mode 100644
index 00000000000..74b9a60cf34
--- /dev/null
+++ b/src/test/perl/PostgreSQL/Test/AdjustDump.pm
@@ -0,0 +1,167 @@
+
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+=pod
+
+=head1 NAME
+
+PostgreSQL::Test::AdjustDump - helper module for dump and restore tests
+
+=head1 SYNOPSIS
+
+  use PostgreSQL::Test::AdjustDump;
+
+  # Adjust contents of dump output file so that dump output from original
+  # regression database and that from the restored regression database match
+  $dump = adjust_regress_dumpfile($dump, $adjust_child_columns);
+
+=head1 DESCRIPTION
+
+C<PostgreSQL::Test::AdjustDump> encapsulates various hacks needed to
+compare the results of dump and restore tests
+
+=cut
+
+package PostgreSQL::Test::AdjustDump;
+
+use strict;
+use warnings FATAL => 'all';
+
+use Exporter 'import';
+use Test::More;
+
+our @EXPORT = qw(
+  adjust_regress_dumpfile
+);
+
+=pod
+
+=head1 ROUTINES
+
+=over
+
+=item $dump = adjust_regress_dumpfile($dump, $adjust_child_columns)
+
+If we take dump of the regression database left behind after running regression
+tests, restore the dump, and take dump of the restored regression database, the
+outputs of both the dumps differ in the following cases. This routine adjusts
+the given dump so that dump outputs from the original and restored database,
+respectively, match.
+
+Case 1: Some regression tests purposefully create child tables in such a way
+that the order of their inherited columns differ from column orders of their
+respective parents. In the restored database, however, the order of their
+inherited columns are same as that of their respective parents. Thus the column
+orders of these child tables in the original database and those in the restored
+database differ, causing difference in the dump outputs. See MergeAttributes()
+and dumpTableSchema() for details.  This routine rearranges the column
+declarations in the relevant C<CREATE TABLE... INHERITS> statements in the dump
+file from original database to match those from the restored database. We could,
+instead, adjust the statements in the dump from the restored database to match
+those from original database or adjust both to a canonical order. But we have
+chosen to adjust the statements in the dump from original database for no
+particular reason.
+
+Case 2: When dumping COPY statements the columns are ordered by their attribute
+number by fmtCopyColumnList(). If a column is added to a parent table after a
+child has inherited the parent and the child has its own columns, the attribute
+number of the column changes after restoring the child table. This is because
+when executing the dumped C<CREATE TABLE... INHERITS> statement all the parent
+attributes are created before any child attributes. Thus the order of columns in
+COPY statements dumped from the original and the restored databases,
+respectively, differs. Such tables in regression tests are listed below. It is
+hard to adjust the column order in the COPY statement along with the data. Hence
+we just remove such COPY statements from the dump output.
+
+Additionally the routine adjusts blank and new lines to avoid noise.
+
+Note: Usually we avoid comparing statistics in our tests since it is flaky by
+nature. However, if statistics is dumped and restored it is expected to be
+restored as it is i.e. the statistics from the original database and that from
+the restored database should match. Hence we do not filter statistics from dump,
+if it's dumped.
+
+Arguments:
+
+=over
+
+=item C<dump>: Contents of dump file
+
+=item C<adjust_child_columns>: 1 indicates that the given dump file requires
+adjusting columns in the child tables; usually when the dump is from original
+database. 0 indicates no such adjustment is needed; usually when the dump is
+from restored database.
+
+=back
+
+Returns the adjusted dump text.
+
+=cut
+
+sub adjust_regress_dumpfile
+{
+	my ($dump, $adjust_child_columns) = @_;
+
+	# use Unix newlines
+	$dump =~ s/\r\n/\n/g;
+
+	# Adjust the CREATE TABLE ... INHERITS statements.
+	if ($adjust_child_columns)
+	{
+		my $saved_dump = $dump;
+
+		$dump =~ s/(^CREATE\sTABLE\sgenerated_stored_tests\.gtestxx_4\s\()
+				   (\n\s+b\sinteger),
+				   (\n\s+a\sinteger\sNOT\sNULL)/$1$3,$2/mgx;
+		ok($saved_dump ne $dump,
+			'applied generated_stored_tests.gtestxx_4 adjustments');
+
+		$saved_dump = $dump;
+		$dump =~ s/(^CREATE\sTABLE\sgenerated_virtual_tests\.gtestxx_4\s\()
+				   (\n\s+b\sinteger),
+				   (\n\s+a\sinteger\sNOT\sNULL)/$1$3,$2/mgx;
+		ok($saved_dump ne $dump,
+			'applied generated_virtual_tests.gtestxx_4 adjustments');
+
+		$saved_dump = $dump;
+		$dump =~ s/(^CREATE\sTABLE\spublic\.test_type_diff2_c1\s\()
+				   (\n\s+int_four\sbigint),
+				   (\n\s+int_eight\sbigint),
+				   (\n\s+int_two\ssmallint)/$1$4,$2,$3/mgx;
+		ok($saved_dump ne $dump,
+			'applied public.test_type_diff2_c1 adjustments');
+
+		$saved_dump = $dump;
+		$dump =~ s/(^CREATE\sTABLE\spublic\.test_type_diff2_c2\s\()
+				   (\n\s+int_eight\sbigint),
+				   (\n\s+int_two\ssmallint),
+				   (\n\s+int_four\sbigint)/$1$3,$4,$2/mgx;
+		ok($saved_dump ne $dump,
+			'applied public.test_type_diff2_c2 adjustments');
+	}
+
+	# Remove COPY statements with differing column order
+	for my $table (
+		'public\.b_star', 'public\.c_star',
+		'public\.cc2', 'public\.d_star',
+		'public\.e_star', 'public\.f_star',
+		'public\.renamecolumnanother', 'public\.renamecolumnchild',
+		'public\.test_type_diff2_c1', 'public\.test_type_diff2_c2',
+		'public\.test_type_diff_c')
+	{
+		$dump =~ s/^COPY\s$table\s\(.+?^\\\.$//sm;
+	}
+
+	# Suppress blank lines, as some places in pg_dump emit more or fewer.
+	$dump =~ s/\n\n+/\n/g;
+
+	return $dump;
+}
+
+=pod
+
+=back
+
+=cut
+
+1;
diff --git a/src/test/perl/meson.build b/src/test/perl/meson.build
index 58e30f15f9d..492ca571ff8 100644
--- a/src/test/perl/meson.build
+++ b/src/test/perl/meson.build
@@ -14,4 +14,5 @@ install_data(
   'PostgreSQL/Test/Cluster.pm',
   'PostgreSQL/Test/BackgroundPsql.pm',
   'PostgreSQL/Test/AdjustUpgrade.pm',
+  'PostgreSQL/Test/AdjustDump.pm',
   install_dir: dir_pgxs / 'src/test/perl/PostgreSQL/Test')

base-commit: 73eba5004a06a744b6b8570e42432b9e9f75997b
-- 
2.34.1

From 435c659489b34a803675abb65144fab6f0550432 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.bapat....@gmail.com>
Date: Tue, 25 Feb 2025 11:42:51 +0530
Subject: [PATCH 3/3] Do not dump statistics in the file dumped for comparison

The dumped and restored statistics of a materialized view may differ as
reported in [1].  Hence do not dump the statistics to avoid differences
in the dump output from the original and restored database.

[1] https://www.postgresql.org/message-id/caexhw5s47kmubpbbrjzsm-zfe0tj2o3gbagb7yaye8rq-v2...@mail.gmail.com

Ashutosh Bapat
---
 src/bin/pg_upgrade/t/002_pg_upgrade.pl | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index cbd9831bf9e..abe93a49258 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -630,15 +630,15 @@ sub get_dump_for_comparison
 	my $dumpfile = $tempdir . '/' . $file_prefix . '.sql';
 	my $dump_adjusted = "${dumpfile}_adjusted";
 
-	# Usually we avoid comparing statistics in our tests since it is flaky by
-	# nature. However, if statistics is dumped and restored it is expected to be
-	# restored as it is i.e. the statistics from the original database and that
-	# from the restored database should match. We turn off autovacuum on the
-	# source and the target database to avoid any statistics update during
-	# restore operation. Hence we do not exclude statistics from dump.
+	# If statistics is dumped and restored it is expected to be restored as it
+	# is i.e. the statistics from the original database and that from the
+	# restored database should match. We turn off autovacuum on the source and
+	# the target database to avoid any statistics update during restore
+	# operation. But as of now, there are cases when statistics is not being
+	# restored faithfully. Hence for now do not dump statistics.
 	$node->command_ok(
 		[
-			'pg_dump', '--no-sync', '-d', $node->connstr($db), '-f',
+			'pg_dump', '--no-sync', '--no-statistics', '-d', $node->connstr($db), '-f',
 			$dumpfile
 		],
 		'dump for comparison succeeded');
-- 
2.34.1

Reply via email to