On 2/29/24 16:55, Michael Paquier wrote:
On Thu, Feb 29, 2024 at 10:30:52AM +1300, David Steele wrote:
On 11/12/23 08:21, David Steele wrote:
As promised in [1], attached are some basic tests for the low-level
backup method.

Added to the 2024-03 CF.

There is already 040_standby_failover_slots_sync.pl in recovery/ that
uses the number of your test script.  You may want to bump it, that's
a nit.

Renamed to 042_low_level_backup.pl.

+unlink("$backup_dir/postmaster.pid")
+       or BAIL_OUT("unable to unlink $backup_dir/postmaster.pid");
+unlink("$backup_dir/postmaster.opts")
+       or BAIL_OUT("unable to unlink $backup_dir/postmaster.opts");
+unlink("$backup_dir/global/pg_control")
+       or BAIL_OUT("unable to unlink $backup_dir/global/pg_control");

RELCACHE_INIT_FILENAME as well?

I'm not trying to implement the full exclusion list here, just enough to get the test working. Since exclusions are optional according to the docs I don't think we need them for a valid tests.

+# Rather than writing out backup_label, try to recover the backup without
+# backup_label to demonstrate that recovery will not work correctly without it,
+# i.e. the canary table will be missing and the cluster will be corrupt. 
Provide
+# only the WAL segment that recovery will think it needs.

Okay, why not.  No objections to this addition.  I am a bit surprised
that this is not scanning some of the logs produced by the startup
process for particular patterns.

Not sure what to look for here. There are no distinct messages for crash recovery. Perhaps there should be?

+# Save backup_label into the backup directory and recover using the primary's
+# archive. This time recovery will succeed and the canary table will be
+# present.

Here are well, I think that we should add some log_contains() with
pre-defined patterns to show that recovery has completed the way we
want it with a backup_label up to the end-of-backup record.

Sure, I added a check for the new log message when recovering with a backup_label.

Regards,
-David
From 076429a578ece3c213525220a4961cb5b56c190a Mon Sep 17 00:00:00 2001
From: David Steele <da...@pgmasters.net>
Date: Wed, 13 Mar 2024 00:01:55 +0000
Subject: Add basic tests for the low-level backup method.

There are currently no tests for the low-level backup method. pg_backup_start()
and pg_backup_stop() are called but not exercised in any real fashion.

There is a lot more that can be done, but this at least provides some basic
tests and provides a place for future improvement.
---
 src/test/recovery/t/042_low_level_backup.pl | 108 ++++++++++++++++++++
 1 file changed, 108 insertions(+)
 create mode 100644 src/test/recovery/t/042_low_level_backup.pl

diff --git a/src/test/recovery/t/042_low_level_backup.pl 
b/src/test/recovery/t/042_low_level_backup.pl
new file mode 100644
index 0000000000..22668bdc0d
--- /dev/null
+++ b/src/test/recovery/t/042_low_level_backup.pl
@@ -0,0 +1,108 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test low-level backup method by using pg_backup_start() and pg_backup_stop()
+# to create backups.
+
+use strict;
+use warnings;
+
+use File::Copy qw(copy);
+use File::Path qw(remove_tree);
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Start primary node with archiving
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(has_archiving => 1, allows_streaming => 1);
+$node_primary->start;
+
+# Start backup
+my $backup_name = 'backup1';
+my $psql = $node_primary->background_psql('postgres');
+
+$psql->query_safe("SET client_min_messages TO WARNING");
+$psql->set_query_timer_restart();
+$psql->query_safe("select pg_backup_start('test label')");
+
+# Copy files
+my $backup_dir = $node_primary->backup_dir() . '/' . $backup_name;
+
+PostgreSQL::Test::RecursiveCopy::copypath(
+       $node_primary->data_dir(), $backup_dir);
+
+# Cleanup some files/paths that should not be in the backup. There is no
+# attempt to all the exclusions done by pg_basebackup here, in part because
+# they are not required, but also to keep the test simple.
+#
+# Also remove pg_control because it needs to be copied later and it will be
+# obvious if the copy fails.
+unlink("$backup_dir/postmaster.pid")
+       or BAIL_OUT("unable to unlink $backup_dir/postmaster.pid");
+unlink("$backup_dir/postmaster.opts")
+       or BAIL_OUT("unable to unlink $backup_dir/postmaster.opts");
+unlink("$backup_dir/global/pg_control")
+       or BAIL_OUT("unable to unlink $backup_dir/global/pg_control");
+
+remove_tree("$backup_dir/pg_wal", {keep_root => 1})
+       or BAIL_OUT("unable to unlink contents of $backup_dir/pg_wal");
+
+# Create a table that will be used to verify that recovery started at the
+# correct location, rather than the location recorded in pg_control
+$psql->query_safe("create table canary (id int)");
+
+# Advance the checkpoint location in pg_control past the location where the
+# backup started. Switch the WAL to make it really obvious that the location
+# is different and to put the checkpoint in a new WAL segment.
+$psql->query_safe("select pg_switch_wal()");
+$psql->query_safe("checkpoint");
+
+# Copy pg_control last so it contains the new checkpoint
+copy($node_primary->data_dir() . '/global/pg_control',
+               "$backup_dir/global/pg_control")
+       or BAIL_OUT("unable to copy global/pg_control");
+
+# Stop backup and get backup_label
+my $backup_label = $psql->query_safe("select labelfile from pg_backup_stop()");
+
+# Rather than writing out backup_label, try to recover the backup without
+# backup_label to demonstrate that recovery will not work correctly without it,
+# i.e. the canary table will be missing and the cluster will be corrupt. 
Provide
+# only the WAL segment that recovery will think it needs.
+#
+# The point of this test is to explicitly demonstrate that backup_label is
+# being used in a later test to get the correct recovery info.
+my $node_replica = PostgreSQL::Test::Cluster->new('replica_fail');
+$node_replica->init_from_backup($node_primary, $backup_name);
+my $canary_query = "select count(*) from pg_class where relname = 'canary'";
+
+copy($node_primary->archive_dir() . '/000000010000000000000003',
+               $node_replica->data_dir() . '/pg_wal/000000010000000000000003')
+       or BAIL_OUT("unable to copy 000000010000000000000003");
+
+$node_replica->start;
+
+ok($node_replica->safe_psql('postgres', $canary_query) == 0,
+       'canary is missing');
+
+$node_replica->teardown_node();
+$node_replica->clean_node();
+
+# Save backup_label into the backup directory and recover using the primary's
+# archive. This time recovery will succeed and the canary table will be
+# present.
+append_to_file("$backup_dir/backup_label", $backup_label);
+
+$node_replica = PostgreSQL::Test::Cluster->new('replica_success');
+$node_replica->init_from_backup($node_primary, $backup_name,
+       has_restoring => 1);
+$node_replica->start;
+
+ok($node_replica->log_contains('completed backup recovery with redo LSN'),
+       'verify backup recovery performed with backup_label');
+
+ok($node_replica->safe_psql('postgres', $canary_query) == 1,
+       'canary is present');
+
+done_testing();
-- 
2.34.1

Reply via email to