Hello,

We found an issue in pg_upgrade on a cluster with a third-party background worker. The upgrade goes fine, but the new cluster is then in an inconsistent state. The background worker comes from the PoWA extension, but the issue does not appear to be related to this particular code.

Here is a shell script to reproduce the issue (error at the end):

OLDBINDIR=/usr/lib/postgresql/11/bin
NEWBINDIR=/usr/lib/postgresql/13/bin

OLDDATADIR=$(mktemp -d)
NEWDATADIR=$(mktemp -d)

$OLDBINDIR/initdb -D $OLDDATADIR
echo "unix_socket_directories = '/tmp'" >> $OLDDATADIR/postgresql.auto.conf
echo "shared_preload_libraries = 'pg_stat_statements, powa'" >> $OLDDATADIR/postgresql.auto.conf
$OLDBINDIR/pg_ctl -D $OLDDATADIR -l $OLDDATADIR/pgsql.log start
$OLDBINDIR/createdb -h /tmp powa
$OLDBINDIR/psql -h /tmp -d powa -c "CREATE EXTENSION powa CASCADE"
$OLDBINDIR/pg_ctl -D $OLDDATADIR -m fast stop

$NEWBINDIR/initdb -D $NEWDATADIR
cp $OLDDATADIR/postgresql.auto.conf $NEWDATADIR/postgresql.auto.conf

$NEWBINDIR/pg_upgrade --old-datadir $OLDDATADIR --new-datadir $NEWDATADIR --old-bindir $OLDBINDIR

$NEWBINDIR/pg_ctl -D $NEWDATADIR -l $NEWDATADIR/pgsql.log start
$NEWBINDIR/psql -h /tmp -d powa -c "SELECT 1 FROM powa_snapshot_metas"
# ERROR:  MultiXactId 1 has not been created yet -- apparent wraparound

(This needs PoWA to be installed; packages are available on pgdg repositories as postgresql-<pgversion>-powa on Debian or powa_<pgversion> on RedHat, or directly from source code at https://github.com/powa-team/powa-archivist).

As far as I currently understand, the data to be migrated ends up somewhat inconsistent (from the perspective of pg_upgrade) because the old cluster and its background workers get started by pg_upgrade during the "checks" step. (The old cluster itself remains sane.)

As a solution, it seems that, for reasons similar to those for which we restrict socket access to prevent accidental connections (commit f763b77193), we should also prevent background workers from starting at this step.

Please find attached a patch implementing this.

Thanks for considering,
Denis
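
Until a fix is available, the reasoning above suggests a possible workaround: keep shared_preload_libraries unset in the configuration files while pg_upgrade runs. Below is a minimal, untested sketch under that assumption. The helper name disable_preload and the .bak suffix are made up for illustration (they are not part of PostgreSQL or pg_upgrade), and the sketch only handles plain "name = value" lines in a single file.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not a PostgreSQL tool): comment out any
# shared_preload_libraries setting in a configuration file, keeping a backup
# copy next to it as <file>.bak.
disable_preload() {
    conf=$1
    cp "$conf" "$conf.bak" || return 1
    # Prefix matching lines with '#'; all other lines pass through unchanged.
    sed 's/^[[:space:]]*shared_preload_libraries[[:space:]]*=/#&/' \
        "$conf.bak" > "$conf"
}

# Example on a throwaway file, mirroring the settings used in the script above.
demo=$(mktemp)
echo "unix_socket_directories = '/tmp'" > "$demo"
echo "shared_preload_libraries = 'pg_stat_statements, powa'" >> "$demo"
disable_preload "$demo"
cat "$demo"
```

The idea would be to call it on the postgresql.auto.conf (and postgresql.conf, if relevant) of both data directories before running pg_upgrade, then restore the .bak copies afterwards. Whether this is sufficient in every setup has not been verified.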

From 31b1f31cd3a822d23ccd5883120a013891ade0f3 Mon Sep 17 00:00:00 2001
From: Denis Laxalde <denis.laxa...@dalibo.com>
Date: Wed, 20 Jan 2021 17:25:58 +0100
Subject: [PATCH] Disable background workers during servers start in pg_upgrade

We disable shared_preload_libraries to prevent background workers to
initialize and start during server start in pg_upgrade.

In essence, this is for a similar reason that we use a restricted socket
access from f763b77193b04eba03a1f4ce46df34dc0348419e because background
workers may produce undesired activities during the upgrade.

Author: Denis Laxalde <denis.laxa...@dalibo.com>
Co-authored-by: Jehan-Guillaume de Rorthais <j...@dalibo.com>
---
 src/bin/pg_upgrade/server.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index 31b1425202..fab95a2d24 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -240,11 +240,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
 	 * crash, the new cluster has to be recreated anyway. fsync=off is a big
 	 * win on ext4.
 	 *
+	 * Turn off background workers by emptying shared_preload_libraries.
+	 *
 	 * Force vacuum_defer_cleanup_age to 0 on the new cluster, so that
 	 * vacuumdb --freeze actually freezes the tuples.
 	 */
 	snprintf(cmd, sizeof(cmd),
-			 "\"%s/pg_ctl\" -w -l \"%s\" -D \"%s\" -o \"-p %d%s%s %s%s\" start",
+			 "\"%s/pg_ctl\" -w -l \"%s\" -D \"%s\" -o \"-p %d%s%s %s%s -c shared_preload_libraries=''\" start",
 			 cluster->bindir, SERVER_LOG_FILE, cluster->pgconfig, cluster->port,
 			 (cluster->controldata.cat_ver >=
 			  BINARY_UPGRADE_SERVER_FLAG_CAT_VER) ? " -b" :
-- 
2.20.1