On Wed, Apr 2, 2025 at 3:36 PM Ashutosh Bapat
<ashutosh.bapat....@gmail.com> wrote:
> >
> > No commitfest entry please.  Better to add an open item on the wiki
> > page.
> > https://wiki.postgresql.org/wiki/Open_Items
>
> Posted it on the thread where I have reported the bug. Hopefully, we
> will commit both the bug fix and test change to enable stats together.

Looks like the problem is in the test itself as pointed out by Jeff in
[1]. PFA patch fixing the test and enabling statistics back.

The test file is arranged as follows
1. Setup old cluster (this step also runs regression if needed)
2. create new cluster for upgrade by modifying some configuration from
the old cluster.
3. disable autovacuum on old cluster
4. Run dump/restore roundtrip test which creates a destination cluster
with the same configuration as the old cluster

A note about variable name changes and introduction of new variables.
We run step 2 between 1 and 3 so that autovacuum gets a chance to run
on the old cluster and update statistics. Autovacuum run is not
necessary but useful here. Before these changes all the cluster
initializations were using the same variables @initdb_params and
%node_params. However with these changes, we modify the variable in
step 2 and then again require original values in step 4. So I have
used two sets of variables prefixed with old_ and new_ for clusters
created in 1st step and 2nd step respectively. 4th step uses the
variables with prefix old_. I think this change eliminates confusion
caused by using same variables with different values.

[1] 
https://www.postgresql.org/message-id/5f3703fd7f27da62a8f3615218f937507f522347.camel%40j-davis.com

I will watch CF CI run to see if we see difference in statistics even
after this change.

-- 
Best Wishes,
Ashutosh Bapat
From d3ba3f4259a4594dbb43e43e540c977a2346c523 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.bapat....@gmail.com>
Date: Wed, 2 Apr 2025 15:04:16 +0530
Subject: [PATCH] Fix differences in dumped statistics

Autovacuum is turned off after running dump/restore roundtrip test. This
test takes executes following steps sequentially.

1. takes a dump, with statistcs, from source database for comparison
2. initializes the destination database cluster
3. takes a dump to be restored on destination cluster
4. restores the dump
5. takes a dump from destination cluster for comparison

If autovacuum is triggered between 1st and 3rd step the statistics in
the dumps taken for comparison may differ, causing the test to fail. In
the testfile, autovacuum is turned off on the original cluster before
taking dump for upgrade test. Move the dump/restore roundtrip test after
that step so that statistics remain stable on the original cluster for
the entire duration of the test.

With this change we expect that there will be no difference in the
statistics dumped from the original and the restored databases. Hence
enable dumping statistics for comparison.

Author: Ashutosh Bapat (ashutosh.bapat....@gmail.com)
Analysis-by: Jeff Davis (pg...@j-davis.com)
---
 src/bin/pg_upgrade/t/002_pg_upgrade.pl | 148 ++++++++++++-------------
 1 file changed, 74 insertions(+), 74 deletions(-)

diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index 7494614ee64..49ee6ae3003 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -86,7 +86,7 @@ sub get_dump_for_comparison
 	# Don't dump statistics, because there are still some bugs.
 	$node->run_log(
 		[
-			'pg_dump', '--no-sync', '--no-statistics',
+			'pg_dump', '--no-sync',
 			'-d' => $node->connstr($db),
 			'-f' => $dumpfile
 		]);
@@ -128,7 +128,7 @@ my $oldnode =
   PostgreSQL::Test::Cluster->new('old_node',
 	install_path => $ENV{oldinstall});
 
-my %node_params = ();
+my %old_node_params = ();
 
 # To increase coverage of non-standard segment size and group access without
 # increasing test runtime, run these tests with a custom setting.
@@ -194,34 +194,34 @@ else
 my %encodings = ('UTF-8' => 6, 'SQL_ASCII' => 0);
 my $original_encoding = $encodings{$original_enc_name};
 
-my @initdb_params = @custom_opts;
+my @old_initdb_params = @custom_opts;
 
-push @initdb_params, ('--encoding', $original_enc_name);
-push @initdb_params, ('--lc-collate', $original_datcollate);
-push @initdb_params, ('--lc-ctype', $original_datctype);
+push @old_initdb_params, ('--encoding', $original_enc_name);
+push @old_initdb_params, ('--lc-collate', $original_datcollate);
+push @old_initdb_params, ('--lc-ctype', $original_datctype);
 
 # add --locale-provider, if supported
 my %provider_name = ('b' => 'builtin', 'i' => 'icu', 'c' => 'libc');
 if ($oldnode->pg_version >= 15)
 {
-	push @initdb_params,
+	push @old_initdb_params,
 	  ('--locale-provider', $provider_name{$original_provider});
 	if ($original_provider eq 'b')
 	{
-		push @initdb_params, ('--builtin-locale', $original_datlocale);
+		push @old_initdb_params, ('--builtin-locale', $original_datlocale);
 	}
 	elsif ($original_provider eq 'i')
 	{
-		push @initdb_params, ('--icu-locale', $original_datlocale);
+		push @old_initdb_params, ('--icu-locale', $original_datlocale);
 	}
 }
 
 # Since checksums are now enabled by default, and weren't before 18,
 # pass '-k' to initdb on old versions so that upgrades work.
-push @initdb_params, '-k' if $oldnode->pg_version < 18;
+push @old_initdb_params, '-k' if $oldnode->pg_version < 18;
 
-$node_params{extra} = \@initdb_params;
-$oldnode->init(%node_params);
+$old_node_params{extra} = \@old_initdb_params;
+$oldnode->init(%old_node_params);
 $oldnode->start;
 
 my $result;
@@ -301,74 +301,19 @@ else
 	is($rc, 0, 'regression tests pass');
 }
 
-# Test that dump/restore of the regression database roundtrips cleanly.  This
-# doesn't work well when the nodes are different versions, so skip it in that
-# case.  Note that this isn't a pg_restore test, but it's convenient to do it
-# here because we've gone to the trouble of creating the regression database.
-#
-# Do this while the old cluster is running before it is shut down by the
-# upgrade test.
-SKIP:
-{
-	my $dstnode = PostgreSQL::Test::Cluster->new('dst_node');
-
-	skip "different Postgres versions"
-	  if ($oldnode->pg_version != $dstnode->pg_version);
-	skip "source node not using default install"
-	  if (defined $oldnode->install_path);
-
-	# Dump the original database for comparison later.
-	my $src_dump =
-	  get_dump_for_comparison($oldnode, 'regression', 'src_dump', 1);
-
-	# Setup destination database cluster
-	$dstnode->init(%node_params);
-	# Stabilize stats for comparison.
-	$dstnode->append_conf('postgresql.conf', 'autovacuum = off');
-	$dstnode->start;
-
-	my $dump_file = "$tempdir/regression.dump";
-
-	# Use --create in dump and restore commands so that the restored
-	# database has the same configurable variable settings as the original
-	# database so that the dumps taken from both databases taken do not
-	# differ because of locale changes. Additionally this provides test
-	# coverage for --create option.
-	#
-	# Use directory format so that we can use parallel dump/restore.
-	$oldnode->command_ok(
-		[
-			'pg_dump', '-Fd', '-j2', '--no-sync',
-			'-d' => $oldnode->connstr('regression'),
-			'--create', '-f' => $dump_file
-		],
-		'pg_dump on source instance');
-
-	$dstnode->command_ok(
-		[ 'pg_restore', '--create', '-j2', '-d' => 'postgres', $dump_file ],
-		'pg_restore to destination instance');
-
-	my $dst_dump =
-	  get_dump_for_comparison($dstnode, 'regression', 'dest_dump', 0);
-
-	compare_files($src_dump, $dst_dump,
-		'dump outputs from original and restored regression databases match');
-}
-
 # Initialize a new node for the upgrade.
 my $newnode = PostgreSQL::Test::Cluster->new('new_node');
 
-# Reset to original parameters.
-@initdb_params = @custom_opts;
 
 # The new cluster will be initialized with different locale settings,
 # but these settings will be overwritten with those of the original
 # cluster.
-push @initdb_params, ('--encoding', 'SQL_ASCII');
-push @initdb_params, ('--locale-provider', 'libc');
-
-$node_params{extra} = \@initdb_params;
-$newnode->init(%node_params);
+my %new_node_params = %old_node_params;
+my @new_initdb_params = @custom_opts;
+push @new_initdb_params, ('--encoding', 'SQL_ASCII');
+push @new_initdb_params, ('--locale-provider', 'libc');
+$new_node_params{extra} = \@new_initdb_params;
+$newnode->init(%new_node_params);
 
 # Stabilize stats for comparison.
 $newnode->append_conf('postgresql.conf', 'autovacuum = off');
@@ -410,10 +355,65 @@ if (defined($ENV{oldinstall}))
 	}
 }
 
-# Stabilize stats before pg_dumpall.
+# Stabilize stats before pg_dumpall. Doing it after initializing the new node
+# gives enough time for autovacuum to update statistics on the old node.
 $oldnode->append_conf('postgresql.conf', 'autovacuum = off');
 $oldnode->restart;
 
+# Test that dump/restore of the regression database roundtrips cleanly.  This
+# doesn't work well when the nodes are different versions, so skip it in that
+# case.  Note that this isn't a pg_restore test, but it's convenient to do it
+# here because we've gone to the trouble of creating the regression database.
+#
+# Do this while the old cluster is running before it is shut down by the
+# upgrade test but after turning its autovacuum off for stable statistics.
+SKIP:
+{
+	my $dstnode = PostgreSQL::Test::Cluster->new('dst_node');
+
+	skip "different Postgres versions"
+	  if ($oldnode->pg_version != $dstnode->pg_version);
+	skip "source node not using default install"
+	  if (defined $oldnode->install_path);
+
+	# Setup destination database cluster with the same configuration as the
+	# source cluster to avoid any differences between dumps taken from both the
+	# clusters caused by differences in their configurations.
+	$dstnode->init(%old_node_params);
+	# Stabilize stats for comparison.
+	$dstnode->append_conf('postgresql.conf', 'autovacuum = off');
+	$dstnode->start;
+
+	# Use --create in dump and restore commands so that the restored
+	# database has the same configurable variable settings as the original
+	# database so that the dumps taken from both databases taken do not
+	# differ because of locale changes. Additionally this provides test
+	# coverage for --create option.
+	#
+	# Use directory format so that we can use parallel dump/restore.
+	my $dump_file = "$tempdir/regression.dump";
+	$oldnode->command_ok(
+		[
+			'pg_dump', '-Fd', '-j2', '--no-sync',
+			'-d' => $oldnode->connstr('regression'),
+			'--create', '-f' => $dump_file
+		],
+		'pg_dump on source instance');
+
+	$dstnode->command_ok(
+		[ 'pg_restore', '--create', '-j2', '-d' => 'postgres', $dump_file ],
+		'pg_restore to destination instance');
+
+	# Dump original and restored database for comparison.
+	my $src_dump =
+	  get_dump_for_comparison($oldnode, 'regression', 'src_dump', 1);
+	my $dst_dump =
+	  get_dump_for_comparison($dstnode, 'regression', 'dest_dump', 0);
+
+	compare_files($src_dump, $dst_dump,
+		'dump outputs from original and restored regression databases match');
+}
+
 # Take a dump before performing the upgrade as a base comparison. Note
 # that we need to use pg_dumpall from the new node here.
 my @dump_command = (

base-commit: a7187c3723b41057522038c5e5db329d84f41ac4
-- 
2.34.1

Reply via email to