On Wed, Jan 17, 2024 at 8:40 PM Robert Haas <robertmh...@gmail.com> wrote:

> On Wed, Jan 17, 2024 at 6:31 AM Amul Sul <sula...@gmail.com> wrote:
> > With the attached patch, the backup manifest will have a new key item as
> > "System-Identifier" 64-bit integer whose value is derived from
> pg_control while
> > generating it, and the manifest version bumps to 2.
> >
> > This helps to identify the correct database server and/or backup for the
> > subsequent backup operations.  pg_verifybackup validates the manifest
> system
> > identifier against the backup control file and fails if they don’t match.
> > Similarly, pg_basebackup increment backup will fail if the manifest
> system
> > identifier does not match with the server system identifier.  The
> > pg_combinebackup is already a bit smarter -- checks the system
> identifier from
> > the pg_control of all the backups, with this patch the manifest system
> > identifier also validated.
>
> Thanks for working on this. Without this, I think what happens is that
> you can potentially take an incremental backup from the "wrong"
> server, if the states of the systems are such that all of the other
> sanity checks pass. When you run pg_combinebackup, it'll discover the
> problem and tell you, but you ideally want to discover such errors at
> backup time rather than at restore time. This addresses that. And,
> overall, I think it's a pretty good patch. But I nonetheless have a
> bunch of comments.
>

Thank you for the review.


>
> -      The associated value is always the integer 1.
> +      The associated value is the integer, either 1 or 2.
>
> is an integer. Beginning in <productname>PostgreSQL</productname> 17,
> it is 2; in older versions, it is 1.
>

Ok,


> + context.identity_cb = manifest_process_identity;
>
> I'm not really on board with calling the system identifier "identity"
> throughout the patch. I think it should just say system_identifier. If
> we were going to abbreviate, I'd prefer something like "sysident" that
> looks like it's derived from "system identifier" rather than
> "identity" which is a different word altogether. But I don't think we
> should abbreviate unless doing so creates *ridiculously* long
> identifier names.
>

Ok, used "system identifier" at all the places.


> +static void
> +manifest_process_identity(JsonManifestParseContext *context,
> +   int manifest_version,
> +   uint64 manifest_system_identifier)
> +{
> + uint64 system_identifier;
> +
> + /* Manifest system identifier available in version 2 or later */
> + if (manifest_version == 1)
> + return;
>
> I think you've got the wrong idea here. I think this function would
> only get called if System-Identifier is present in the manifest, so if
> it's a v1 manifest, this would never get called, so this if-statement
> would not ever do anything useful. I think what you should do is (1)
> if the client supplies a v1 manifest, reject it, because surely that's
> from an older server version that doesn't support incremental backup;
> but do that when the version is parsed rather than here; and (2) also
> detect and reject the case when it's supposedly a v2 manifest but this
> is absent.
>
> (1) should really be done when the version number is parsed, so I
> suspect you may need to add manifest_version_cb.
>
> +static void
> +combinebackup_identity_cb(JsonManifestParseContext *context,
> +   int manifest_version,
> +   uint64 manifest_system_identifier)
> +{
> + parser_context *private_context = context->private_data;
> + uint64 system_identifier = private_context->system_identifier;
> +
> + /* Manifest system identifier available in version 2 or later */
> + if (manifest_version == 1)
> + return;
>
> Very similar to the above case. Just reject a version 1 manifest as
> soon as we see the version number. In this function, set a flag
> indicating we saw the system identifier; if at the end of parsing that
> flag is not set, kaboom.
>

Ok, I added a version_cb callback. Using this pg_combinebackup &
pg_basebackup
will report an error for manifest version 1, whereas pg_verifybackup
doesn't (not needed IIUC).


>
> - parse_manifest_file(manifest_path, &context.ht, &first_wal_range);
> + parse_manifest_file(manifest_path, &context.ht, &first_wal_range,
> + context.backup_directory);
>
> Don't do this! parse_manifest_file() should just record everything
> found in the manifest in the context object. Syntax validation should
> happen while parsing the manifest (e.g. "CAT/DOG" is not a valid LSN
> and we should reject that at this stage) but semantic validation
> should happen later (e.g. "0/0" can't be a the correct backup end LSN
> but we don't figure that out while parsing, but rather later). I think
> you should actually move validation of the system identifier to the
> point where the directory walk encounters the control file (and update
> the docs and tests to match that decision). Imagine if you wanted to
> validate a tar-format backup; then you wouldn't have random access to
> the directory. You'd see the manifest file first, and then all the
> files in a random order, with one chance to look at each one.
>
>
Agree.  I have moved the system identifier validation after manifest
parsing.
But, not in the directory walkthrough since in pg_combinebackup, we don't
really needed to open the pg_control file to get the system identifier,
which we
have from the check_control_files().


> (This is, in fact, a feature I think we should implement.)
>
> - if (strcmp(token, "1") != 0)
> + parse->manifest_version = atoi(token);
> + if (parse->manifest_version != 1 && parse->manifest_version != 2)
>   json_manifest_parse_failure(parse->context,
>   "unexpected manifest version");
>
> Please either (a) don't do a string-to-integer conversion and just
> strcmp() twice or (b) use strtol so that you can check that it
> succeeded. I don't want to accept manifest version 1a as 1.
>

Understood, corrected in the attached version.


> +/*
> + * Validate manifest system identifier against the database server system
> + * identifier.
> + */
>
> This comment assumes you know what the callback is going to do, but
> you don't. This should be more like the comment for
> json_manifest_finalize_file or json_manifest_finalize_wal_range.
>

Ok.

Updated version is attached.

Regards,
Amul
From 567b498dab623b60279c4f5df909bcb564b4f81a Mon Sep 17 00:00:00 2001
From: Amul Sul <amul.sul@enterprisedb.com>
Date: Fri, 19 Jan 2024 22:21:12 +0530
Subject: [PATCH v2] Add system identifier to backup manifest

The backup manifest will be having a new item as "System-Identifier"
and its value is from "ControlFileData.system_identifier" when
manifest generated.  This help identify the correct database server
and/or backup while taking subsequent backup.
---
 doc/src/sgml/backup-manifest.sgml             | 15 +++-
 doc/src/sgml/ref/pg_verifybackup.sgml         |  3 +-
 src/backend/backup/backup_manifest.c          |  9 +-
 src/backend/backup/basebackup.c               |  3 +-
 src/backend/backup/basebackup_incremental.c   | 39 ++++++++
 src/bin/pg_basebackup/t/010_pg_basebackup.pl  | 20 +++++
 src/bin/pg_combinebackup/load_manifest.c      | 62 +++++++++++--
 src/bin/pg_combinebackup/load_manifest.h      |  1 +
 src/bin/pg_combinebackup/pg_combinebackup.c   | 33 +++++--
 src/bin/pg_combinebackup/t/005_integrity.pl   | 14 +++
 src/bin/pg_combinebackup/write_manifest.c     |  8 +-
 src/bin/pg_combinebackup/write_manifest.h     |  3 +-
 src/bin/pg_verifybackup/pg_verifybackup.c     | 90 ++++++++++++++++++-
 src/bin/pg_verifybackup/t/003_corruption.pl   | 31 ++++++-
 src/bin/pg_verifybackup/t/004_options.pl      |  2 +-
 src/bin/pg_verifybackup/t/005_bad_manifest.pl | 11 +++
 src/common/parse_manifest.c                   | 83 ++++++++++++++++-
 src/include/backup/backup_manifest.h          |  3 +-
 src/include/common/parse_manifest.h           |  6 ++
 19 files changed, 404 insertions(+), 32 deletions(-)

diff --git a/doc/src/sgml/backup-manifest.sgml b/doc/src/sgml/backup-manifest.sgml
index 771be1310a..a67462e3eb 100644
--- a/doc/src/sgml/backup-manifest.sgml
+++ b/doc/src/sgml/backup-manifest.sgml
@@ -37,7 +37,20 @@
     <term><literal>PostgreSQL-Backup-Manifest-Version</literal></term>
     <listitem>
      <para>
-      The associated value is always the integer 1.
+      The associated value is an integer. Beginning in
+      <productname>PostgreSQL</productname> 17, it is 2; in older versions,
+      it is 1.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><literal>System-Identifier</literal></term>
+    <listitem>
+     <para>
+      The associated integer value is an unique system identifier to ensure
+      file match up with the installation that produced them. Available only
+      when <literal>PostgreSQL-Backup-Manifest-Version</literal> is 2 and later.
      </para>
     </listitem>
    </varlistentry>
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 36335e0a18..3051517d92 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -53,7 +53,8 @@ PostgreSQL documentation
    Backup verification proceeds in four stages. First,
    <literal>pg_verifybackup</literal> reads the
    <literal>backup_manifest</literal> file. If that file
-   does not exist, cannot be read, is malformed, or fails verification
+   does not exist, cannot be read, is malformed, fails to match the system
+   identifier with pg_control of the backup directory or fails verification
    against its own internal checksum, <literal>pg_verifybackup</literal>
    will terminate with a fatal error.
   </para>
diff --git a/src/backend/backup/backup_manifest.c b/src/backend/backup/backup_manifest.c
index 2c34e59752..612ff3add2 100644
--- a/src/backend/backup/backup_manifest.c
+++ b/src/backend/backup/backup_manifest.c
@@ -56,7 +56,8 @@ IsManifestEnabled(backup_manifest_info *manifest)
 void
 InitializeBackupManifest(backup_manifest_info *manifest,
 						 backup_manifest_option want_manifest,
-						 pg_checksum_type manifest_checksum_type)
+						 pg_checksum_type manifest_checksum_type,
+						 uint64 system_identifier)
 {
 	memset(manifest, 0, sizeof(backup_manifest_info));
 	manifest->checksum_type = manifest_checksum_type;
@@ -79,8 +80,10 @@ InitializeBackupManifest(backup_manifest_info *manifest,
 
 	if (want_manifest != MANIFEST_OPTION_NO)
 		AppendToManifest(manifest,
-						 "{ \"PostgreSQL-Backup-Manifest-Version\": 1,\n"
-						 "\"Files\": [");
+						 "{ \"PostgreSQL-Backup-Manifest-Version\": 2,\n"
+						 "\"System-Identifier\": " UINT64_FORMAT ",\n"
+						 "\"Files\": [",
+						 system_identifier);
 }
 
 /*
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index d5b8ca21b7..315efc7536 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -256,7 +256,8 @@ perform_base_backup(basebackup_options *opt, bbsink *sink,
 	backup_started_in_recovery = RecoveryInProgress();
 
 	InitializeBackupManifest(&manifest, opt->manifest,
-							 opt->manifest_checksum_type);
+							 opt->manifest_checksum_type,
+							 GetSystemIdentifier());
 
 	total_checksum_failures = 0;
 
diff --git a/src/backend/backup/basebackup_incremental.c b/src/backend/backup/basebackup_incremental.c
index 0504c465db..c5e3a017f8 100644
--- a/src/backend/backup/basebackup_incremental.c
+++ b/src/backend/backup/basebackup_incremental.c
@@ -112,6 +112,10 @@ struct IncrementalBackupInfo
 	BlockRefTable *brtab;
 };
 
+static void manifest_process_version(JsonManifestParseContext *context,
+									 int manifest_version);
+static void manifest_process_system_identifier(JsonManifestParseContext *context,
+											   uint64 manifest_system_identifier);
 static void manifest_process_file(JsonManifestParseContext *context,
 								  char *pathname,
 								  size_t size,
@@ -198,6 +202,8 @@ FinalizeIncrementalManifest(IncrementalBackupInfo *ib)
 
 	/* Parse the manifest. */
 	context.private_data = ib;
+	context.version_cb = manifest_process_version;
+	context.system_identifier_cb = manifest_process_system_identifier;
 	context.per_file_cb = manifest_process_file;
 	context.per_wal_range_cb = manifest_process_wal_range;
 	context.error_cb = manifest_report_error;
@@ -910,6 +916,39 @@ hash_string_pointer(const char *s)
 	return hash_bytes(ss, strlen(s));
 }
 
+/*
+ * This callback to validate the manifest version for incremental backup.
+ */
+static void
+manifest_process_version(JsonManifestParseContext *context,
+						 int manifest_version)
+{
+	/* Incremental backups supported on manifest version 2 or later */
+	if (manifest_version == 1)
+		context->error_cb(context,
+						  "incremental backups cannot be taken for this backup");
+}
+
+/*
+ * This callback to validate the manifest system identifier against the current
+ * database server.
+ */
+static void
+manifest_process_system_identifier(JsonManifestParseContext *context,
+								   uint64 manifest_system_identifier)
+{
+	uint64		system_identifier;
+
+	/* Get system identifier of current system */
+	system_identifier = GetSystemIdentifier();
+
+	if (manifest_system_identifier != system_identifier)
+		context->error_cb(context,
+						  "manifest is from different database system: manifest database system identifier is %llu, pg_control database system identifier is %llu",
+						  (unsigned long long) manifest_system_identifier,
+						  (unsigned long long) system_identifier);
+}
+
 /*
  * This callback is invoked for each file mentioned in the backup manifest.
  *
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 42a09d0da3..ec663dfcd4 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -949,4 +949,24 @@ $backupdir = $node->backup_dir . '/backup3';
 my @dst_tblspc = glob "$backupdir/pg_tblspc/$tblspc_oid/PG_*";
 is(@dst_tblspc, 1, 'tblspc directory copied');
 
+# Can't take backup with referring manifest of different cluster
+#
+# Set up another new database instance. We don't want to use the cached
+# INITDB_TEMPLATE for this, because we want it to be a separate cluster
+# with a different system ID.
+my $node2;
+{
+	local $ENV{'INITDB_TEMPLATE'} = undef;
+
+	$node2 = PostgreSQL::Test::Cluster->new('node2');
+	$node2->init(has_archiving => 1, allows_streaming => 1);
+	$node2->append_conf('postgresql.conf', 'summarize_wal = on');
+	$node2->start;
+}
+$node2->command_fails_like(
+	[ @pg_basebackup_defs, '-D', "$tempdir" . '/diff_sysid',
+		'--incremental', "$backupdir" . '/backup_manifest' ],
+	qr/manifest is from different database system/,
+	"pg_basebackup fails with different database system manifest");
+
 done_testing();
diff --git a/src/bin/pg_combinebackup/load_manifest.c b/src/bin/pg_combinebackup/load_manifest.c
index 2b8e74fcf3..4756707fe1 100644
--- a/src/bin/pg_combinebackup/load_manifest.c
+++ b/src/bin/pg_combinebackup/load_manifest.c
@@ -50,6 +50,19 @@ static uint32 hash_string_pointer(char *s);
 #define SH_DEFINE
 #include "lib/simplehash.h"
 
+/*
+ * Details we need in callbacks that occur while parsing a backup manifest.
+ */
+typedef struct parser_context
+{
+	manifest_data *manifest;
+	char	   *backup_directory;
+} parser_context;
+
+static void combinebackup_version_cb(JsonManifestParseContext *context,
+									 int manifest_version);
+static void combinebackup_system_identifier_cb(JsonManifestParseContext *context,
+											   uint64 manifest_system_identifier);
 static void combinebackup_per_file_cb(JsonManifestParseContext *context,
 									  char *pathname, size_t size,
 									  pg_checksum_type checksum_type,
@@ -104,7 +117,7 @@ load_backup_manifest(char *backup_directory)
 	char	   *buffer;
 	int			rc;
 	JsonManifestParseContext context;
-	manifest_data *result;
+	parser_context private_context;
 
 	/* Open the manifest file. */
 	snprintf(pathname, MAXPGPATH, "%s/backup_manifest", backup_directory);
@@ -150,9 +163,12 @@ load_backup_manifest(char *backup_directory)
 	close(fd);
 
 	/* Parse the manifest. */
-	result = pg_malloc0(sizeof(manifest_data));
-	result->files = ht;
-	context.private_data = result;
+	private_context.manifest = pg_malloc0(sizeof(manifest_data));
+	private_context.manifest->files = ht;
+	private_context.backup_directory = backup_directory;
+	context.private_data = &private_context;
+	context.version_cb = combinebackup_version_cb;
+	context.system_identifier_cb = combinebackup_system_identifier_cb;
 	context.per_file_cb = combinebackup_per_file_cb;
 	context.per_wal_range_cb = combinebackup_per_wal_range_cb;
 	context.error_cb = report_manifest_error;
@@ -160,7 +176,7 @@ load_backup_manifest(char *backup_directory)
 
 	/* All done. */
 	pfree(buffer);
-	return result;
+	return private_context.manifest;
 }
 
 /*
@@ -181,6 +197,36 @@ report_manifest_error(JsonManifestParseContext *context, const char *fmt,...)
 	exit(1);
 }
 
+/*
+ * This callback to validate the manifest version number for incremental backup.
+ */
+static void
+combinebackup_version_cb(JsonManifestParseContext *context,
+						 int manifest_version)
+{
+	parser_context *private_context = context->private_data;
+
+	/* Incremental backups supported on manifest version 2 or later */
+	if (manifest_version == 1)
+		context->error_cb(context,
+						  "%s: backup manifest doesn't support incremental changes",
+						  private_context->backup_directory);
+}
+
+/*
+ * Record system identifier extracted from the backup manifest.
+ */
+static void
+combinebackup_system_identifier_cb(JsonManifestParseContext *context,
+								   uint64 manifest_system_identifier)
+{
+	parser_context *private_context = context->private_data;
+	manifest_data *manifest = private_context->manifest;
+
+	/* Validation will be at the later stage */
+	manifest->system_identifier = manifest_system_identifier;
+}
+
 /*
  * Record details extracted from the backup manifest for one file.
  */
@@ -190,7 +236,8 @@ combinebackup_per_file_cb(JsonManifestParseContext *context,
 						  pg_checksum_type checksum_type,
 						  int checksum_length, uint8 *checksum_payload)
 {
-	manifest_data *manifest = context->private_data;
+	parser_context *private_context = context->private_data;
+	manifest_data *manifest = private_context->manifest;
 	manifest_file *m;
 	bool		found;
 
@@ -214,7 +261,8 @@ combinebackup_per_wal_range_cb(JsonManifestParseContext *context,
 							   TimeLineID tli,
 							   XLogRecPtr start_lsn, XLogRecPtr end_lsn)
 {
-	manifest_data *manifest = context->private_data;
+	parser_context *private_context = context->private_data;
+	manifest_data *manifest = private_context->manifest;
 	manifest_wal_range *range;
 
 	/* Allocate and initialize a struct describing this WAL range. */
diff --git a/src/bin/pg_combinebackup/load_manifest.h b/src/bin/pg_combinebackup/load_manifest.h
index 9163e071af..8a5a70e447 100644
--- a/src/bin/pg_combinebackup/load_manifest.h
+++ b/src/bin/pg_combinebackup/load_manifest.h
@@ -55,6 +55,7 @@ typedef struct manifest_wal_range
  */
 typedef struct manifest_data
 {
+	uint64		system_identifier;
 	manifest_files_hash *files;
 	manifest_wal_range *first_wal_range;
 	manifest_wal_range *last_wal_range;
diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c
index 31ead7f405..62633ff445 100644
--- a/src/bin/pg_combinebackup/pg_combinebackup.c
+++ b/src/bin/pg_combinebackup/pg_combinebackup.c
@@ -92,7 +92,7 @@ cb_cleanup_dir *cleanup_dir_list = NULL;
 
 static void add_tablespace_mapping(cb_options *opt, char *arg);
 static StringInfo check_backup_label_files(int n_backups, char **backup_dirs);
-static void check_control_files(int n_backups, char **backup_dirs);
+static uint64 check_control_files(int n_backups, char **backup_dirs);
 static void check_input_dir_permissions(char *dir);
 static void cleanup_directories_atexit(void);
 static void create_output_directory(char *dirname, cb_options *opt);
@@ -134,11 +134,13 @@ main(int argc, char *argv[])
 
 	const char *progname;
 	char	   *last_input_dir;
+	int			i;
 	int			optindex;
 	int			c;
 	int			n_backups;
 	int			n_prior_backups;
 	int			version;
+	uint64		system_identifier;
 	char	  **prior_backup_dirs;
 	cb_options	opt;
 	cb_tablespace *tablespaces;
@@ -216,7 +218,7 @@ main(int argc, char *argv[])
 
 	/* Sanity-check control files. */
 	n_backups = argc - optind;
-	check_control_files(n_backups, argv + optind);
+	system_identifier = check_control_files(n_backups, argv + optind);
 
 	/* Sanity-check backup_label files, and get the contents of the last one. */
 	last_backup_label = check_backup_label_files(n_backups, argv + optind);
@@ -231,6 +233,25 @@ main(int argc, char *argv[])
 	/* Load backup manifests. */
 	manifests = load_backup_manifests(n_backups, prior_backup_dirs);
 
+	/*
+	 * Validate the manifest system identifier against the backup system
+	 * identifier.
+	 */
+	for (i = 0; i < n_backups; i++)
+	{
+		if (manifests[i]->system_identifier != system_identifier)
+		{
+			char	   *controlpath;
+
+			controlpath = psprintf("%s/%s", prior_backup_dirs[i], "global/pg_control");
+
+			pg_fatal("manifest is from different database system: manifest database system identifier is %llu, %s system identifier is %llu",
+					 (unsigned long long) manifests[i]->system_identifier,
+					 controlpath,
+					 (unsigned long long) system_identifier);
+		}
+	}
+
 	/* Figure out which tablespaces are going to be included in the output. */
 	last_input_dir = argv[argc - 1];
 	check_input_dir_permissions(last_input_dir);
@@ -256,7 +277,7 @@ main(int argc, char *argv[])
 	/* If we need to write a backup_manifest, prepare to do so. */
 	if (!opt.dry_run && !opt.no_manifest)
 	{
-		mwriter = create_manifest_writer(opt.output);
+		mwriter = create_manifest_writer(opt.output, system_identifier);
 
 		/*
 		 * Verify that we have a backup manifest for the final backup; else we
@@ -517,9 +538,9 @@ check_backup_label_files(int n_backups, char **backup_dirs)
 }
 
 /*
- * Sanity check control files.
+ * Sanity check control files and return system_identifier.
  */
-static void
+static uint64
 check_control_files(int n_backups, char **backup_dirs)
 {
 	int			i;
@@ -564,6 +585,8 @@ check_control_files(int n_backups, char **backup_dirs)
 	 */
 	pg_log_debug("system identifier is %llu",
 				 (unsigned long long) system_identifier);
+
+	return system_identifier;
 }
 
 /*
diff --git a/src/bin/pg_combinebackup/t/005_integrity.pl b/src/bin/pg_combinebackup/t/005_integrity.pl
index 3b445d0e30..875ffbf525 100644
--- a/src/bin/pg_combinebackup/t/005_integrity.pl
+++ b/src/bin/pg_combinebackup/t/005_integrity.pl
@@ -8,6 +8,7 @@ use strict;
 use warnings FATAL => 'all';
 use File::Compare;
 use File::Path qw(rmtree);
+use File::Copy;
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
@@ -85,6 +86,19 @@ $node1->command_fails_like(
 	qr/expected system identifier.*but found/,
 	"can't combine backups from different nodes");
 
+# Can't combine when different manifest system identifier
+rename("$backup2path/backup_manifest", "$backup2path/backup_manifest.orig")
+  or BAIL_OUT "could not move $backup2path/backup_manifest";
+copy("$backupother2path/backup_manifest", "$backup2path/backup_manifest")
+  or BAIL_OUT "could not copy $backupother2path/backup_manifest";
+$node1->command_fails_like(
+	[ 'pg_combinebackup', $backup1path, $backup2path, $backup3path, '-o', $resultpath ],
+	qr/manifest is from different database system/,
+	"can't combine backups with different manifest system identifier ");
+# Restore the backup state
+move("$backup2path/backup_manifest.orig", "$backup2path/backup_manifest")
+  or BAIL_OUT "could not move $backup2path/backup_manifest";
+
 # Can't omit a required backup.
 $node1->command_fails_like(
 	[ 'pg_combinebackup', $backup1path, $backup3path, '-o', $resultpath ],
diff --git a/src/bin/pg_combinebackup/write_manifest.c b/src/bin/pg_combinebackup/write_manifest.c
index 01deb82cc9..7a2065e1db 100644
--- a/src/bin/pg_combinebackup/write_manifest.c
+++ b/src/bin/pg_combinebackup/write_manifest.c
@@ -45,7 +45,7 @@ static size_t hex_encode(const uint8 *src, size_t len, char *dst);
  * in the specified directory.
  */
 manifest_writer *
-create_manifest_writer(char *directory)
+create_manifest_writer(char *directory, uint64 system_identifier)
 {
 	manifest_writer *mwriter = pg_malloc(sizeof(manifest_writer));
 
@@ -57,8 +57,10 @@ create_manifest_writer(char *directory)
 	pg_checksum_init(&mwriter->manifest_ctx, CHECKSUM_TYPE_SHA256);
 
 	appendStringInfo(&mwriter->buf,
-					 "{ \"PostgreSQL-Backup-Manifest-Version\": 1,\n"
-					 "\"Files\": [");
+					 "{ \"PostgreSQL-Backup-Manifest-Version\": 2,\n"
+					 "\"System-Identifier\": " UINT64_FORMAT ",\n"
+					 "\"Files\": [",
+					 system_identifier);
 
 	return mwriter;
 }
diff --git a/src/bin/pg_combinebackup/write_manifest.h b/src/bin/pg_combinebackup/write_manifest.h
index de0f742779..ebc4f9441a 100644
--- a/src/bin/pg_combinebackup/write_manifest.h
+++ b/src/bin/pg_combinebackup/write_manifest.h
@@ -19,7 +19,8 @@ struct manifest_wal_range;
 struct manifest_writer;
 typedef struct manifest_writer manifest_writer;
 
-extern manifest_writer *create_manifest_writer(char *directory);
+extern manifest_writer *create_manifest_writer(char *directory,
+											   uint64 system_identifier);
 extern void add_file_to_manifest(manifest_writer *mwriter,
 								 const char *manifest_path,
 								 size_t size, time_t mtime,
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index ae8c18f373..29c1cec38e 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -18,6 +18,7 @@
 #include <sys/stat.h>
 #include <time.h>
 
+#include "common/controldata_utils.h"
 #include "common/hashfn.h"
 #include "common/logging.h"
 #include "common/parse_manifest.h"
@@ -98,9 +99,11 @@ typedef struct manifest_wal_range
  */
 typedef struct parser_context
 {
+	uint64		system_identifier;
 	manifest_files_hash *ht;
 	manifest_wal_range *first_wal_range;
 	manifest_wal_range *last_wal_range;
+	char		*backup_directory;
 } parser_context;
 
 /*
@@ -117,8 +120,12 @@ typedef struct verifier_context
 
 static void parse_manifest_file(char *manifest_path,
 								manifest_files_hash **ht_p,
-								manifest_wal_range **first_wal_range_p);
-
+								manifest_wal_range **first_wal_range_p,
+								uint64 *manifest_system_identifier);
+static void verifybackup_version_cb(JsonManifestParseContext *context,
+									int manifest_version);
+static void verifybackup_system_identifier(JsonManifestParseContext *context,
+										   uint64 manifest_system_identifier);
 static void verifybackup_per_file_cb(JsonManifestParseContext *context,
 									 char *pathname, size_t size,
 									 pg_checksum_type checksum_type,
@@ -132,6 +139,8 @@ static void report_manifest_error(JsonManifestParseContext *context,
 								  const char *fmt,...)
 			pg_attribute_printf(2, 3) pg_attribute_noreturn();
 
+static void verify_system_identifier(char *backup_directory,
+									 uint64 manifest_system_identifier);
 static void verify_backup_directory(verifier_context *context,
 									char *relpath, char *fullpath);
 static void verify_backup_file(verifier_context *context,
@@ -191,6 +200,7 @@ main(int argc, char **argv)
 	bool		quiet = false;
 	char	   *wal_directory = NULL;
 	char	   *pg_waldump_path = NULL;
+	uint64		manifest_system_identifier;
 
 	pg_logging_init(argv[0]);
 	set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_verifybackup"));
@@ -338,7 +348,12 @@ main(int argc, char **argv)
 	 * the manifest as fatal; there doesn't seem to be much point in trying to
 	 * verify the backup directory against a corrupted manifest.
 	 */
-	parse_manifest_file(manifest_path, &context.ht, &first_wal_range);
+	parse_manifest_file(manifest_path, &context.ht, &first_wal_range,
+						&manifest_system_identifier);
+
+	/* Validate the manifest system identifier */
+	verify_system_identifier(context.backup_directory,
+							 manifest_system_identifier);
 
 	/*
 	 * Now scan the files in the backup directory. At this stage, we verify
@@ -387,7 +402,8 @@ main(int argc, char **argv)
  */
 static void
 parse_manifest_file(char *manifest_path, manifest_files_hash **ht_p,
-					manifest_wal_range **first_wal_range_p)
+					manifest_wal_range **first_wal_range_p,
+					uint64 *manifest_system_identifier)
 {
 	int			fd;
 	struct stat statbuf;
@@ -439,7 +455,10 @@ parse_manifest_file(char *manifest_path, manifest_files_hash **ht_p,
 	private_context.ht = ht;
 	private_context.first_wal_range = NULL;
 	private_context.last_wal_range = NULL;
+	private_context.system_identifier = 0;
 	context.private_data = &private_context;
+	context.version_cb = verifybackup_version_cb;
+	context.system_identifier_cb = verifybackup_system_identifier;
 	context.per_file_cb = verifybackup_per_file_cb;
 	context.per_wal_range_cb = verifybackup_per_wal_range_cb;
 	context.error_cb = report_manifest_error;
@@ -451,6 +470,7 @@ parse_manifest_file(char *manifest_path, manifest_files_hash **ht_p,
 	/* Return the file hash table and WAL range list we constructed. */
 	*ht_p = ht;
 	*first_wal_range_p = private_context.first_wal_range;
+	*manifest_system_identifier = private_context.system_identifier;
 }
 
 /*
@@ -471,6 +491,31 @@ report_manifest_error(JsonManifestParseContext *context, const char *fmt,...)
 	exit(1);
 }
 
+/*
+ * Callback for the backup manifest version.
+ */
+static void
+verifybackup_version_cb(JsonManifestParseContext *context,
+						int manifest_version)
+{
+	/*
+	 * pg_verifybackup doesn't have any specific action for the version number
+	 */
+}
+
+/*
+ * Record details extracted from the backup manifest.
+ */
+static void
+verifybackup_system_identifier(JsonManifestParseContext *context,
+						 uint64 manifest_system_identifier)
+{
+	parser_context *parseContext = (parser_context *) context->private_data;
+
+	/* Validation will be at the later stage */
+	parseContext->system_identifier = manifest_system_identifier;
+}
+
 /*
  * Record details extracted from the backup manifest for one file.
  */
@@ -527,6 +572,43 @@ verifybackup_per_wal_range_cb(JsonManifestParseContext *context,
 	pcxt->last_wal_range = range;
 }
 
+/*
+ * Sanity check control file of the backup and validate system identifier
+ * against manifest system identifier.
+ */
+static void
+verify_system_identifier(char *backup_directory,
+						 uint64 manifest_system_identifier)
+{
+	ControlFileData *control_file;
+	bool		crc_ok;
+	char	   *controlpath;
+
+	controlpath = psprintf("%s/%s", backup_directory, "global/pg_control");
+	pg_log_debug("reading \"%s\"", controlpath);
+	control_file = get_controlfile(backup_directory, &crc_ok);
+
+	/* Control file contents not meaningful if CRC is bad. */
+	if (!crc_ok)
+		report_fatal_error("%s: CRC is incorrect", controlpath);
+
+	/* Can't interpret control file if not current version. */
+	if (control_file->pg_control_version != PG_CONTROL_VERSION)
+		report_fatal_error("%s: unexpected control file version",
+						   controlpath);
+
+	/* System identifiers should match. */
+	if (manifest_system_identifier != control_file->system_identifier)
+		report_fatal_error("manifest is from different database system: manifest database system identifier is %llu, %s system identifier is %llu",
+						   (unsigned long long) manifest_system_identifier,
+						   controlpath,
+						   (unsigned long long) control_file->system_identifier);
+
+	/* Release memory. */
+	pfree(control_file);
+	pfree(controlpath);
+}
+
 /*
  * Verify one directory.
  *
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index 11bd577081..90b6966e38 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -6,6 +6,7 @@
 use strict;
 use warnings FATAL => 'all';
 use File::Path qw(rmtree);
+use File::Copy;
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
@@ -68,6 +69,11 @@ my @scenario = (
 		'mutilate' => \&mutilate_replace_file,
 		'fails_like' => qr/checksum mismatch for file/
 	},
+	{
+		'name' => 'system_identifier',
+		'mutilate' => \&mutilate_system_identifier,
+		'fails_like' => qr/manifest is from different database system/
+	},
 	{
 		'name' => 'bad_manifest',
 		'mutilate' => \&mutilate_bad_manifest,
@@ -216,7 +222,7 @@ sub mutilate_append_to_file
 sub mutilate_truncate_file
 {
 	my ($backup_path) = @_;
-	my $pathname = "$backup_path/global/pg_control";
+	my $pathname = "$backup_path/pg_hba.conf";
 	open(my $fh, '>', $pathname) || die "open $pathname: $!";
 	close($fh);
 	return;
@@ -236,6 +242,29 @@ sub mutilate_replace_file
 	return;
 }
 
+# Copy manifest of other backups to demonstrate the case where the wrong
+# manifest is referred
+sub mutilate_system_identifier
+{
+	my ($backup_path) = @_;
+
+	# Set up another new database instance with different system identifier and
+	# make backup
+	my $node;
+	{
+		local $ENV{'INITDB_TEMPLATE'} = undef;
+
+		$node = PostgreSQL::Test::Cluster->new('node');
+		$node->init(allows_streaming => 1);
+		$node->start;
+	}
+	$node->backup('backup2');
+	move($node->backup_dir.'/backup2/backup_manifest', $backup_path.'/backup_manifest')
+		or BAIL_OUT "could not copy manifest to $backup_path";
+	$node->teardown_node(fail_ok => 1);
+	return;
+}
+
 # Corrupt the backup manifest.
 sub mutilate_bad_manifest
 {
diff --git a/src/bin/pg_verifybackup/t/004_options.pl b/src/bin/pg_verifybackup/t/004_options.pl
index 8ed2214408..970b7dcd56 100644
--- a/src/bin/pg_verifybackup/t/004_options.pl
+++ b/src/bin/pg_verifybackup/t/004_options.pl
@@ -111,7 +111,7 @@ command_fails_like(
 		'pg_verifybackup', '-m',
 		"$backup_path/backup_manifest", "$backup_path/fake"
 	],
-	qr/could not open directory/,
+	qr/could not open file/,
 	'nonexistent backup directory');
 
 done_testing();
diff --git a/src/bin/pg_verifybackup/t/005_bad_manifest.pl b/src/bin/pg_verifybackup/t/005_bad_manifest.pl
index e278ccea5a..7fac2fa836 100644
--- a/src/bin/pg_verifybackup/t/005_bad_manifest.pl
+++ b/src/bin/pg_verifybackup/t/005_bad_manifest.pl
@@ -156,6 +156,17 @@ test_parse_error('expected at least 2 lines', <<EOM);
 {"PostgreSQL-Backup-Manifest-Version": 1, "Files": [], "Manifest-Checksum": null}
 EOM
 
+test_parse_error('unrecognized top-level field', <<EOM);
+{"PostgreSQL-Backup-Manifest-Version": 1,
+ "System-Identifier": "7320280665284567118"
+ "Manifest-Checksum": "eabdc9caf8a" }
+EOM
+
+test_parse_error('expected system identifier', <<EOM);
+{"PostgreSQL-Backup-Manifest-Version": 2,
+ "Manifest-Checksum": "eabdc9caf8a" }
+EOM
+
 my $manifest_without_newline = <<EOM;
 {"PostgreSQL-Backup-Manifest-Version": 1,
  "Files": [],
diff --git a/src/common/parse_manifest.c b/src/common/parse_manifest.c
index 92a97714f3..6eb6ade560 100644
--- a/src/common/parse_manifest.c
+++ b/src/common/parse_manifest.c
@@ -25,6 +25,7 @@ typedef enum
 	JM_EXPECT_TOPLEVEL_END,
 	JM_EXPECT_TOPLEVEL_FIELD,
 	JM_EXPECT_VERSION_VALUE,
+	JM_EXPECT_SYSTEM_IDENTIFIER_VALUE,
 	JM_EXPECT_FILES_START,
 	JM_EXPECT_FILES_NEXT,
 	JM_EXPECT_THIS_FILE_FIELD,
@@ -85,6 +86,9 @@ typedef struct
 
 	/* Miscellaneous other stuff. */
 	bool		saw_version_field;
+	bool		saw_system_identifier_field;
+	char	   *manifest_version;
+	char	   *manifest_system_identifier;
 	char	   *manifest_checksum;
 } JsonManifestParseState;
 
@@ -96,6 +100,8 @@ static JsonParseErrorType json_manifest_object_field_start(void *state, char *fn
 														   bool isnull);
 static JsonParseErrorType json_manifest_scalar(void *state, char *token,
 											   JsonTokenType tokentype);
+static void json_manifest_finalize_version(JsonManifestParseState *parse);
+static void json_manifest_finalize_system_identifier(JsonManifestParseState *parse);
 static void json_manifest_finalize_file(JsonManifestParseState *parse);
 static void json_manifest_finalize_wal_range(JsonManifestParseState *parse);
 static void verify_manifest_checksum(JsonManifestParseState *parse,
@@ -128,6 +134,7 @@ json_parse_manifest(JsonManifestParseContext *context, char *buffer,
 	parse.context = context;
 	parse.state = JM_EXPECT_TOPLEVEL_START;
 	parse.saw_version_field = false;
+	parse.saw_system_identifier_field = false;
 
 	/* Create a JSON lexing context. */
 	lex = makeJsonLexContextCstringLen(NULL, buffer, size, PG_UTF8, true);
@@ -311,6 +318,16 @@ json_manifest_object_field_start(void *state, char *fname, bool isnull)
 				parse->saw_version_field = true;
 				break;
 			}
+			else if (!parse->saw_system_identifier_field &&
+					 strcmp(parse->manifest_version, "1") != 0)
+			{
+				if (strcmp(fname, "System-Identifier") != 0)
+					json_manifest_parse_failure(parse->context,
+												"expected system identifier");
+				parse->state = JM_EXPECT_SYSTEM_IDENTIFIER_VALUE;
+				parse->saw_system_identifier_field = true;
+				break;
+			}
 
 			/* Is this the list of files? */
 			if (strcmp(fname, "Files") == 0)
@@ -404,9 +421,14 @@ json_manifest_scalar(void *state, char *token, JsonTokenType tokentype)
 	switch (parse->state)
 	{
 		case JM_EXPECT_VERSION_VALUE:
-			if (strcmp(token, "1") != 0)
-				json_manifest_parse_failure(parse->context,
-											"unexpected manifest version");
+			parse->manifest_version = token;
+			json_manifest_finalize_version(parse);
+			parse->state = JM_EXPECT_TOPLEVEL_FIELD;
+			break;
+
+		case JM_EXPECT_SYSTEM_IDENTIFIER_VALUE:
+			parse->manifest_system_identifier = token;
+			json_manifest_finalize_system_identifier(parse);
 			parse->state = JM_EXPECT_TOPLEVEL_FIELD;
 			break;
 
@@ -464,6 +486,61 @@ json_manifest_scalar(void *state, char *token, JsonTokenType tokentype)
 	return JSON_SUCCESS;
 }
 
+/*
+ * Do additional parsing and sanity-checking of the manifest version, and invoke
+ * the callback so that the caller can gets that detail and take actions
+ * accordingly.  This happens for each manifest when the corresponding JSON
+ * object is completely parsed.
+ */
+static void
+json_manifest_finalize_version(JsonManifestParseState *parse)
+{
+	JsonManifestParseContext *context = parse->context;
+	int			version;
+	char	   *ep;
+
+	Assert(parse->saw_version_field);
+
+	if (strcmp(parse->manifest_version, "1") != 0 &&
+		strcmp(parse->manifest_version, "2"))
+		json_manifest_parse_failure(parse->context,
+									"unexpected manifest version");
+
+	/* Parse version. */
+	version = strtoi64(parse->manifest_version, &ep, 10);
+	if (*ep)
+		json_manifest_parse_failure(parse->context,
+									"manifest version not an integer");
+
+	/* Invoke the callback for version */
+	context->version_cb(context, version);
+}
+
+/*
+ * Do additional parsing and sanity-checking of the system identifier, and
+ * invoke the callback so that the caller can gets that detail and take actions
+ * accordingly.  This happens for each manifest when the corresponding JSON
+ * object is completely parsed.
+ */
+static void
+json_manifest_finalize_system_identifier(JsonManifestParseState *parse)
+{
+	JsonManifestParseContext *context = parse->context;
+	uint64		system_identifier;
+	char	   *ep;
+
+	Assert(parse->saw_system_identifier_field);
+
+	/* Parse system identifier. */
+	system_identifier = strtou64(parse->manifest_system_identifier, &ep, 10);
+	if (*ep)
+		json_manifest_parse_failure(parse->context,
+									"manifest system identifier not an integer");
+
+	/* Invoke the callback for system identifier */
+	context->system_identifier_cb(context, system_identifier);
+}
+
 /*
  * Do additional parsing and sanity-checking of the details gathered for one
  * file, and invoke the per-file callback so that the caller gets those
diff --git a/src/include/backup/backup_manifest.h b/src/include/backup/backup_manifest.h
index 3853d2ca90..4a21102506 100644
--- a/src/include/backup/backup_manifest.h
+++ b/src/include/backup/backup_manifest.h
@@ -37,7 +37,8 @@ typedef struct backup_manifest_info
 
 extern void InitializeBackupManifest(backup_manifest_info *manifest,
 									 backup_manifest_option want_manifest,
-									 pg_checksum_type manifest_checksum_type);
+									 pg_checksum_type manifest_checksum_type,
+									 uint64 system_identifier);
 extern void AddFileToBackupManifest(backup_manifest_info *manifest,
 									Oid spcoid,
 									const char *pathname, size_t size,
diff --git a/src/include/common/parse_manifest.h b/src/include/common/parse_manifest.h
index f74be0db35..78b052c045 100644
--- a/src/include/common/parse_manifest.h
+++ b/src/include/common/parse_manifest.h
@@ -21,6 +21,10 @@
 struct JsonManifestParseContext;
 typedef struct JsonManifestParseContext JsonManifestParseContext;
 
+typedef void (*json_manifest_version_callback) (JsonManifestParseContext *,
+												int manifest_version);
+typedef void (*json_manifest_system_identifier_callback) (JsonManifestParseContext *,
+														  uint64 manifest_system_identifier);
 typedef void (*json_manifest_per_file_callback) (JsonManifestParseContext *,
 												 char *pathname,
 												 size_t size, pg_checksum_type checksum_type,
@@ -35,6 +39,8 @@ typedef void (*json_manifest_error_callback) (JsonManifestParseContext *,
 struct JsonManifestParseContext
 {
 	void	   *private_data;
+	json_manifest_version_callback version_cb;
+	json_manifest_system_identifier_callback system_identifier_cb;
 	json_manifest_per_file_callback per_file_cb;
 	json_manifest_per_wal_range_callback per_wal_range_cb;
 	json_manifest_error_callback error_cb;
-- 
2.18.0

Reply via email to