On Thu, Mar 13, 2025 at 6:10 PM Ashutosh Bapat
<ashutosh.bapat....@gmail.com> wrote:
> >
> > I think the fix is to explicitly pass --lc-monetary to the old cluster
> > and the restored cluster. 003 patch in the attached patch set does
> > that. Please check if it fixes the issue for you.
> >
> > Additionally we should check that it gets copied to the new cluster as
> > well. But I haven't figured out how to get those settings yet. This
> > treatment is similar to how --lc-collate and --lc-ctype are treated. I
> > am wondering whether we should explicitly pass --lc-messages,
> > --lc-time and --lc-numeric as well.
> >
> > 2d819a08a1cbc11364e36f816b02e33e8dcc030b introduced builtin locale
> > provider and added overrides to LC_COLLATE and LC_CTYPE. But it did not
> > override other LC_, which I think it should have. In pure upgrade
> > test, the upgraded node inherits the locale settings of the original
> > cluster, so this wasn't apparent. But with pg_dump testing, the
> > original and restored databases are independent. Hence I think we have
> > to override all LC_* settings by explicitly mentioning --lc-* options
> > to initdb. Please let me know what you think about this?
> >

I investigated this further. The problem is that the pg_regress run
creates the regression database with specific properties, but the
restored database does not have those properties. That led me to a
better solution, one that is also local to the new test: use --create
when dumping and restoring the regression database. This way the
database properties, or "configuration variable settings" (as the
pg_dump documentation calls them), are copied to the restored database
as well. Those properties include LC_MONETARY. Additionally, the test
now covers the --create option too.

PFA patches.

-- 
Best Wishes,
Ashutosh Bapat
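
For readers who want to see the --create approach in isolation, here is a
rough sketch of the command lines involved for each archive format. It only
builds and prints the argument lists; it does not run anything. The helper
name dump_restore_commands, the connection string 'dbname=regression' and the
/tmp paths are made up for illustration and are not part of the attached
patches.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the patch): build the dump and restore
# command lines for one archive format, always passing --create so that
# database-level settings travel with the dump.
sub dump_restore_commands
{
	my ($format, $connstr, $dump_file) = @_;

	my @dump_cmd = (
		'pg_dump', "-F$format", '--no-sync',
		'-d', $connstr, '--create', '-f', $dump_file);

	# A plain-format dump is an SQL script, so it is replayed with psql;
	# the CREATE DATABASE statement is already part of the script.
	# Archive formats go through pg_restore, which needs --create itself.
	my @restore_cmd =
	  $format eq 'plain'
	  ? ('psql', '-d', 'postgres', '-f', $dump_file)
	  : ('pg_restore', '--create', '-d', 'postgres', $dump_file);

	return (\@dump_cmd, \@restore_cmd);
}

# Assumed connection string and file locations, for illustration only.
for my $format ('plain', 'tar', 'directory', 'custom')
{
	my ($dump, $restore) = dump_restore_commands($format,
		'dbname=regression', "/tmp/regression_dump.$format");
	print join(' ', @$dump), "\n";
	print join(' ', @$restore), "\n";
}
```

In both restore variants the connection goes to the postgres database,
because the target database is created by the dump itself.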

From 886e241e304a23bb31b5e59f12149741dfff2b14 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.bapat....@gmail.com>
Date: Tue, 25 Feb 2025 11:42:51 +0530
Subject: [PATCH 2/2] Do not dump statistics in the file dumped for comparison

The dumped and restored statistics of a materialized view may differ as
reported in [1]. Hence do not dump the statistics to avoid differences
in the dump output from the original and restored database.

[1] https://www.postgresql.org/message-id/caexhw5s47kmubpbbrjzsm-zfe0tj2o3gbagb7yaye8rq-v2...@mail.gmail.com

Ashutosh Bapat
---
 src/bin/pg_upgrade/t/002_pg_upgrade.pl | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index d08eea6693f..f931fef2307 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -656,15 +656,15 @@ sub get_dump_for_comparison
 	my $dumpfile = $tempdir . '/' . $file_prefix . '.sql';
 	my $dump_adjusted = "${dumpfile}_adjusted";
 
-	# Usually we avoid comparing statistics in our tests since it is flaky by
-	# nature. However, if statistics is dumped and restored it is expected to be
-	# restored as it is i.e. the statistics from the original database and that
-	# from the restored database should match. We turn off autovacuum on the
-	# source and the target database to avoid any statistics update during
-	# restore operation. Hence we do not exclude statistics from dump.
+	# If statistics is dumped and restored it is expected to be restored as it
+	# is i.e. the statistics from the original database and that from the
+	# restored database should match. We turn off autovacuum on the source and
+	# the target database to avoid any statistics update during restore
+	# operation. But as of now, there are cases when statistics is not being
+	# restored faithfully. Hence for now do not dump statistics.
 	$node->command_ok(
 		[
-			'pg_dump', '--no-sync', '-d', $node->connstr($db), '-f',
+			'pg_dump', '--no-sync', '--no-statistics', '-d', $node->connstr($db), '-f',
 			$dumpfile
 		],
 		'dump for comparison succeeded');
-- 
2.34.1

From 1723050dadb89f3187fef19c994d8c866ee5a788 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.ba...@enterprisedb.com>
Date: Thu, 27 Jun 2024 10:03:53 +0530
Subject: [PATCH 1/2] Test pg_dump/restore of regression objects

002_pg_upgrade.pl tests pg_upgrade of the regression database left
behind by regression run. Modify it to test dump and restore of the
regression database as well.

Regression database created by regression run contains almost all the
database objects supported by PostgreSQL in various states. Hence the
new testcase covers dump and restore scenarios not covered by individual
dump/restore cases. Till now 002_pg_upgrade only tested dump/restore
through pg_upgrade which only uses binary mode. Many regression tests
mention that they leave objects behind for dump/restore testing but they
are not tested in a non-binary mode. The new testcase closes that
gap.

Testing dump and restore of regression database makes this test run
longer for a relatively smaller benefit. Hence run it only when
explicitly requested by user by specifying "regress_dump_test" in
PG_TEST_EXTRA.

Note For the reviewers:
The new test has uncovered many bugs so far in one year.
1. Introduced by 14e87ffa5c54. Fixed in fd41ba93e4630921a72ed5127cd0d552a8f3f8fc.
2. Introduced by 0413a556990ba628a3de8a0b58be020fd9a14ed0. Reverted in 74563f6b90216180fc13649725179fc119dddeb5.
3. Fixed by d611f8b1587b8f30caa7c0da99ae5d28e914d54f
4. Being discussed on hackers at https://www.postgresql.org/message-id/caexhw5s47kmubpbbrjzsm-zfe0tj2o3gbagb7yaye8rq-v2...@mail.gmail.com

Author: Ashutosh Bapat
Reviewed by: Michael Paquier, Daniel Gustafsson, Tom Lane, Alvaro Herrera
Discussion: https://www.postgresql.org/message-id/CAExHW5uF5V=Cjecx3_Z=7xfh4rg2wf61pt+hfquzjbqourz...@mail.gmail.com
---
 doc/src/sgml/regress.sgml                   |  12 ++
 src/bin/pg_upgrade/t/002_pg_upgrade.pl      | 144 ++++++++++++++++-
 src/test/perl/Makefile                      |   2 +
 src/test/perl/PostgreSQL/Test/AdjustDump.pm | 167 ++++++++++++++++++++
 src/test/perl/meson.build                   |   1 +
 5 files changed, 324 insertions(+), 2 deletions(-)
 create mode 100644 src/test/perl/PostgreSQL/Test/AdjustDump.pm

diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 0e5e8e8f309..237b974b3ab 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -357,6 +357,18 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
       </para>
      </listitem>
     </varlistentry>
+
+    <varlistentry>
+     <term><literal>regress_dump_test</literal></term>
+     <listitem>
+      <para>
+       When enabled, <filename>src/bin/pg_upgrade/t/002_pg_upgrade.pl</filename>
+       tests dump and restore of regression database left behind by the
+       regression run. Not enabled by default because it is time and resource
+       consuming.
+      </para>
+     </listitem>
+    </varlistentry>
    </variablelist>
 
    Tests for features that are not supported by the current build
diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index 00051b85035..d08eea6693f 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -12,6 +12,7 @@ use File::Path qw(rmtree);
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use PostgreSQL::Test::AdjustUpgrade;
+use PostgreSQL::Test::AdjustDump;
 use Test::More;
 
 # Can be changed to test the other modes.
@@ -35,8 +36,8 @@ sub generate_db
 		"created database with ASCII characters from $from_char to $to_char");
 }
 
-# Filter the contents of a dump before its use in a content comparison.
-# This returns the path to the filtered dump.
+# Filter the contents of a dump before its use in a content comparison for
+# upgrade testing. This returns the path to the filtered dump.
 sub filter_dump
 {
 	my ($is_old, $old_version, $dump_file) = @_;
@@ -262,6 +263,21 @@ else
 		}
 	}
 	is($rc, 0, 'regression tests pass');
+
+	# Test dump/restore of the objects left behind by regression. Ideally it
+	# should be done in a separate TAP test, but doing it here saves us one full
+	# regression run.
+	#
+	# This step takes several extra seconds and some extra disk space, so
+	# requires an opt-in with the PG_TEST_EXTRA environment variable.
+	#
+	# Do this while the old cluster is running before it is shut down by the
+	# upgrade test.
+	if (   $ENV{PG_TEST_EXTRA}
+		&& $ENV{PG_TEST_EXTRA} =~ /\bregress_dump_test\b/)
+	{
+		test_regression_dump_restore($oldnode, %node_params);
+	}
 }
 
 # Initialize a new node for the upgrade.
@@ -539,4 +555,128 @@ my $dump2_filtered = filter_dump(0, $oldnode->pg_version, $dump2_file);
 compare_files($dump1_filtered, $dump2_filtered,
 	'old and new dumps match after pg_upgrade');
 
+# Test dump and restore of objects left behind by the regression run.
+#
+# It is expected that regression tests, which create `regression` database, are
+# run on `src_node`, which in turn, is left in running state. A fresh node is
+# created using given `node_params`, which are expected to be the same ones used
+# to create `src_node`, so as to avoid any differences in the databases.
+#
+# Plain dumps from both the nodes are compared to make sure that all the dumped
+# objects are restored faithfully.
+sub test_regression_dump_restore
+{
+	my ($src_node, %node_params) = @_;
+	my $dst_node = PostgreSQL::Test::Cluster->new('dst_node');
+
+	# Make sure that the source and destination nodes have the same version and
+	# do not use custom install paths. In both the cases, the dump files may
+	# require additional adjustments unknown to code here. Do not run this test
+	# in such a case to avoid utilizing the time and resources unnecessarily.
+	if ($src_node->pg_version != $dst_node->pg_version
+		or defined $src_node->{_install_path})
+	{
+		fail("same version dump and restore test using default installation");
+		return;
+	}
+
+	# Dump the original database for comparison later.
+	my $src_dump =
+	  get_dump_for_comparison($src_node, 'regression', 'src_dump', 1);
+
+	# Setup destination database cluster
+	$dst_node->init(%node_params);
+	# Stabilize stats for comparison.
+	$dst_node->append_conf('postgresql.conf', 'autovacuum = off');
+	$dst_node->start;
+
+	# Test all formats one by one.
+	for my $format ('plain', 'tar', 'directory', 'custom')
+	{
+		my $dump_file = "$tempdir/regression_dump.$format";
+		my $restored_db = 'regression_' . $format;
+
+		# Use --create in dump and restore commands so that the restored
+		# database has the same configurable variable settings as the original
+		# database and the plain dumps taken for comparison do not differ
+		# because of locale changes. Additionally this provides test coverage
+		# for --create option.
+		$src_node->command_ok(
+			[
+				'pg_dump', "-F$format", '--no-sync',
+				'-d', $src_node->connstr('regression'),
+				'--create', '-f', $dump_file
+			],
+			"pg_dump on source instance in $format format");
+
+		my @restore_command;
+		if ($format eq 'plain')
+		{
+			# Restore dump in "plain" format using `psql`.
+			@restore_command = [ 'psql', '-d', 'postgres', '-f', $dump_file ];
+		}
+		else
+		{
+			@restore_command = [
+				'pg_restore', '--create',
+				'-d', 'postgres', $dump_file
+			];
+		}
+		$dst_node->command_ok(@restore_command,
+			"restored dump taken in $format format on destination instance");
+
+		my $dst_dump =
+		  get_dump_for_comparison($dst_node, 'regression',
+			'dest_dump.' . $format, 0);
+
+		compare_files($src_dump, $dst_dump,
+			"dump outputs from original and restored regression database (using $format format) match"
+		);
+
+		# Rename the restored database so that it is available for debugging in
+		# case the test fails.
+		$dst_node->safe_psql('postgres', "ALTER DATABASE regression RENAME TO $restored_db");
+	}
+}
+
+# Dump database `db` from the given `node` in plain format and adjust it for
+# comparing dumps from the original and the restored database.
+#
+# `file_prefix` is used to create unique names for all dump files so that they
+# remain available for debugging in case the test fails.
+#
+# `adjust_child_columns` is passed to adjust_regress_dumpfile() which actually
+# adjusts the dump output.
+#
+# The name of the file containing adjusted dump is returned.
+sub get_dump_for_comparison
+{
+	my ($node, $db, $file_prefix, $adjust_child_columns) = @_;
+
+	my $dumpfile = $tempdir . '/' . $file_prefix . '.sql';
+	my $dump_adjusted = "${dumpfile}_adjusted";
+
+	# Usually we avoid comparing statistics in our tests since it is flaky by
+	# nature. However, if statistics is dumped and restored it is expected to be
+	# restored as it is i.e. the statistics from the original database and that
+	# from the restored database should match. We turn off autovacuum on the
+	# source and the target database to avoid any statistics update during
+	# restore operation. Hence we do not exclude statistics from dump.
+	$node->command_ok(
+		[
+			'pg_dump', '--no-sync', '-d', $node->connstr($db), '-f',
+			$dumpfile
+		],
+		'dump for comparison succeeded');
+
+	open(my $dh, '>', $dump_adjusted)
+	  || die
+	  "could not open $dump_adjusted for writing the adjusted dump: $!";
+	print $dh adjust_regress_dumpfile(slurp_file($dumpfile),
+		$adjust_child_columns);
+	close($dh);
+
+	return $dump_adjusted;
+}
+
 done_testing();
diff --git a/src/test/perl/Makefile b/src/test/perl/Makefile
index d82fb67540e..def89650ead 100644
--- a/src/test/perl/Makefile
+++ b/src/test/perl/Makefile
@@ -26,6 +26,7 @@ install: all installdirs
 	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/Cluster.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/Cluster.pm'
 	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/BackgroundPsql.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/BackgroundPsql.pm'
 	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/AdjustUpgrade.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/AdjustUpgrade.pm'
+	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/AdjustDump.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/AdjustDump.pm'
 	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Version.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Version.pm'
 
 uninstall:
@@ -36,6 +37,7 @@ uninstall:
 	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/Cluster.pm'
 	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/BackgroundPsql.pm'
 	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/AdjustUpgrade.pm'
+	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/AdjustDump.pm'
 	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Version.pm'
 
 endif
diff --git a/src/test/perl/PostgreSQL/Test/AdjustDump.pm b/src/test/perl/PostgreSQL/Test/AdjustDump.pm
new file mode 100644
index 00000000000..74b9a60cf34
--- /dev/null
+++ b/src/test/perl/PostgreSQL/Test/AdjustDump.pm
@@ -0,0 +1,167 @@
+
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+=pod
+
+=head1 NAME
+
+PostgreSQL::Test::AdjustDump - helper module for dump and restore tests
+
+=head1 SYNOPSIS
+
+  use PostgreSQL::Test::AdjustDump;
+
+  # Adjust contents of dump output file so that dump output from original
+  # regression database and that from the restored regression database match
+  $dump = adjust_regress_dumpfile($dump, $adjust_child_columns);
+
+=head1 DESCRIPTION
+
+C<PostgreSQL::Test::AdjustDump> encapsulates various hacks needed to
+compare the results of dump and restore tests
+
+=cut
+
+package PostgreSQL::Test::AdjustDump;
+
+use strict;
+use warnings FATAL => 'all';
+
+use Exporter 'import';
+use Test::More;
+
+our @EXPORT = qw(
+  adjust_regress_dumpfile
+);
+
+=pod
+
+=head1 ROUTINES
+
+=over
+
+=item $dump = adjust_regress_dumpfile($dump, $adjust_child_columns)
+
+If we take dump of the regression database left behind after running regression
+tests, restore the dump, and take dump of the restored regression database, the
+outputs of both the dumps differ in the following cases. This routine adjusts
+the given dump so that dump outputs from the original and restored database,
+respectively, match.
+
+Case 1: Some regression tests purposefully create child tables in such a way
+that the order of their inherited columns differ from column orders of their
+respective parents. In the restored database, however, the order of their
+inherited columns are same as that of their respective parents. Thus the column
+orders of these child tables in the original database and those in the restored
+database differ, causing difference in the dump outputs. See MergeAttributes()
+and dumpTableSchema() for details. This routine rearranges the column
+declarations in the relevant C<CREATE TABLE... INHERITS> statements in the dump
+file from original database to match those from the restored database. We could,
+instead, adjust the statements in the dump from the restored database to match
+those from original database or adjust both to a canonical order. But we have
+chosen to adjust the statements in the dump from original database for no
+particular reason.
+
+Case 2: When dumping COPY statements the columns are ordered by their attribute
+number by fmtCopyColumnList(). If a column is added to a parent table after a
+child has inherited the parent and the child has its own columns, the attribute
+number of the column changes after restoring the child table. This is because
+when executing the dumped C<CREATE TABLE... INHERITS> statement all the parent
+attributes are created before any child attributes. Thus the order of columns in
+COPY statements dumped from the original and the restored databases,
+respectively, differs. Such tables in regression tests are listed below. It is
+hard to adjust the column order in the COPY statement along with the data. Hence
+we just remove such COPY statements from the dump output.
+
+Additionally the routine adjusts blank and new lines to avoid noise.
+
+Note: Usually we avoid comparing statistics in our tests since it is flaky by
+nature. However, if statistics is dumped and restored it is expected to be
+restored as it is i.e. the statistics from the original database and that from
+the restored database should match. Hence we do not filter statistics from dump,
+if it's dumped.
+
+Arguments:
+
+=over
+
+=item C<dump>: Contents of dump file
+
+=item C<adjust_child_columns>: 1 indicates that the given dump file requires
+adjusting columns in the child tables; usually when the dump is from original
+database. 0 indicates no such adjustment is needed; usually when the dump is
+from restored database.
+
+=back
+
+Returns the adjusted dump text.
+
+=cut
+
+sub adjust_regress_dumpfile
+{
+	my ($dump, $adjust_child_columns) = @_;
+
+	# use Unix newlines
+	$dump =~ s/\r\n/\n/g;
+
+	# Adjust the CREATE TABLE ... INHERITS statements.
+	if ($adjust_child_columns)
+	{
+		my $saved_dump = $dump;
+
+		$dump =~ s/(^CREATE\sTABLE\sgenerated_stored_tests\.gtestxx_4\s\()
+		(\n\s+b\sinteger),
+		(\n\s+a\sinteger\sNOT\sNULL)/$1$3,$2/mgx;
+		ok($saved_dump ne $dump,
+			'applied generated_stored_tests.gtestxx_4 adjustments');
+
+		$saved_dump = $dump;
+		$dump =~ s/(^CREATE\sTABLE\sgenerated_virtual_tests\.gtestxx_4\s\()
+		(\n\s+b\sinteger),
+		(\n\s+a\sinteger\sNOT\sNULL)/$1$3,$2/mgx;
+		ok($saved_dump ne $dump,
+			'applied generated_virtual_tests.gtestxx_4 adjustments');
+
+		$saved_dump = $dump;
+		$dump =~ s/(^CREATE\sTABLE\spublic\.test_type_diff2_c1\s\()
+		(\n\s+int_four\sbigint),
+		(\n\s+int_eight\sbigint),
+		(\n\s+int_two\ssmallint)/$1$4,$2,$3/mgx;
+		ok($saved_dump ne $dump,
+			'applied public.test_type_diff2_c1 adjustments');
+
+		$saved_dump = $dump;
+		$dump =~ s/(^CREATE\sTABLE\spublic\.test_type_diff2_c2\s\()
+		(\n\s+int_eight\sbigint),
+		(\n\s+int_two\ssmallint),
+		(\n\s+int_four\sbigint)/$1$3,$4,$2/mgx;
+		ok($saved_dump ne $dump,
+			'applied public.test_type_diff2_c2 adjustments');
+	}
+
+	# Remove COPY statements with differing column order
+	for my $table (
+		'public\.b_star', 'public\.c_star',
+		'public\.cc2', 'public\.d_star',
+		'public\.e_star', 'public\.f_star',
+		'public\.renamecolumnanother', 'public\.renamecolumnchild',
+		'public\.test_type_diff2_c1', 'public\.test_type_diff2_c2',
+		'public\.test_type_diff_c')
+	{
+		$dump =~ s/^COPY\s$table\s\(.+?^\\\.$//sm;
+	}
+
+	# Suppress blank lines, as some places in pg_dump emit more or fewer.
+	$dump =~ s/\n\n+/\n/g;
+
+	return $dump;
+}
+
+=pod
+
+=back
+
+=cut
+
+1;
diff --git a/src/test/perl/meson.build b/src/test/perl/meson.build
index 58e30f15f9d..492ca571ff8 100644
--- a/src/test/perl/meson.build
+++ b/src/test/perl/meson.build
@@ -14,4 +14,5 @@ install_data(
   'PostgreSQL/Test/Cluster.pm',
   'PostgreSQL/Test/BackgroundPsql.pm',
   'PostgreSQL/Test/AdjustUpgrade.pm',
+  'PostgreSQL/Test/AdjustDump.pm',
   install_dir: dir_pgxs / 'src/test/perl/PostgreSQL/Test')

base-commit: 190dc27998d5b7b4c36e12bebe62f7176f4b4507
-- 
2.34.1
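
As a standalone illustration of the "Case 2" adjustment described in the
AdjustDump.pm documentation above, the following sketch applies the same
COPY-removal substitution to a tiny, made-up dump fragment. The sample text
and the wrapper name remove_copy_for_table are hypothetical; only the regular
expressions are taken from the patch.

```perl
use strict;
use warnings;

# Made-up dump fragment with two COPY blocks; only the first one belongs
# to a table whose column order differs after restore.
my $dump = <<'EOF';
COPY public.b_star (class, aa, bb, a) FROM stdin;
b	3	mumble	\N
\.
COPY public.a_star (class, aa, a) FROM stdin;
a	1	\N
\.
EOF

# Hypothetical wrapper around the substitution used in
# adjust_regress_dumpfile(). $table is a regex fragment, so the dot in the
# qualified name must be escaped by the caller.
sub remove_copy_for_table
{
	my ($text, $table) = @_;

	# /s lets .+? run across lines, /m lets ^ and $ match at line
	# boundaries; the non-greedy match stops at the first "\." terminator.
	$text =~ s/^COPY\s$table\s\(.+?^\\\.$//sm;
	return $text;
}

my $adjusted = remove_copy_for_table($dump, 'public\.b_star');

# Collapse runs of blank lines, as the patch does after all adjustments.
$adjusted =~ s/\n\n+/\n/g;

print $adjusted;
```

The non-greedy match matters here: a greedy one would run to the last "\."
in the dump and remove the data of every table in between.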