A customer's upgrade failed, and it took me a while to
figure out that the problem was that they had set
"vacuum_defer_cleanup_age=10000" on the new cluster.

The consequence was that the "vacuumdb --freeze" that
takes place before copying commit log files failed to
freeze "pg_database".
That caused later updates to the table to fail with
"Could not open file "pg_xact/0000": No such file or directory."

I think it would increase the robustness of pg_upgrade to
force "vacuum_defer_cleanup_age" to 0 on the new cluster.

Suggested patch attached.

Yours,
Laurenz Albe
From e8f84eaadb132b8647406bc28028862d6d4d8717 Mon Sep 17 00:00:00 2001
From: Laurenz Albe <laurenz.a...@cybertec.at>
Date: Wed, 10 Jun 2020 15:57:36 +0200
Subject: [PATCH] Force vacuum_defer_cleanup_age=0 during pg_upgrade

This is important on the new cluster, because tuples must
be frozen before the commit logs can be safely copied from
the old server.

Author: Laurenz Albe
Backpatch-through: 9.5
---
 src/bin/pg_upgrade/server.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index 7d17280ecb..79ec3f04c0 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -239,6 +239,9 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
 	 * we only modify the new cluster, so only use it there.  If there is a
 	 * crash, the new cluster has to be recreated anyway.  fsync=off is a big
 	 * win on ext4.
+	 *
+	 * Force vacuum_defer_cleanup_age to 0 on the new cluster, so that
+	 * vacuumdb --freeze actually freezes the tuples.
 	 */
 	snprintf(cmd, sizeof(cmd),
 			 "\"%s/pg_ctl\" -w -l \"%s\" -D \"%s\" -o \"-p %d%s%s %s%s\" start",
@@ -247,7 +250,7 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
 			  BINARY_UPGRADE_SERVER_FLAG_CAT_VER) ? " -b" :
 			 " -c autovacuum=off -c autovacuum_freeze_max_age=2000000000",
 			 (cluster == &new_cluster) ?
-			 " -c synchronous_commit=off -c fsync=off -c full_page_writes=off" : "",
+			 " -c synchronous_commit=off -c fsync=off -c full_page_writes=off -c vacuum_defer_cleanup_age=0" : "",
 			 cluster->pgopts ? cluster->pgopts : "", socket_string);
 
 	/*
-- 
2.21.3

Reply via email to