[PATCH] Disable bgworkers during servers start in pg_upgrade

Denis Laxalde Thu, 21 Jan 2021 07:24:34 -0800

Hello,

We found an issue in pg_upgrade on a cluster with a third-party
background worker. The upgrade goes fine, but the new cluster is then in
an inconsistent state. The background worker comes from the PoWA
extension, but the issue does not appear to be related to this particular
code.


Here is a shell script to reproduce the issue (error at the end):

  OLDBINDIR=/usr/lib/postgresql/11/bin
  NEWBINDIR=/usr/lib/postgresql/13/bin
  
  OLDDATADIR=$(mktemp -d)
  NEWDATADIR=$(mktemp -d)
  
  $OLDBINDIR/initdb -D $OLDDATADIR
  echo "unix_socket_directories = '/tmp'" >> $OLDDATADIR/postgresql.auto.conf
  echo "shared_preload_libraries = 'pg_stat_statements, powa'" >> $OLDDATADIR/postgresql.auto.conf
  $OLDBINDIR/pg_ctl -D $OLDDATADIR -l $OLDDATADIR/pgsql.log start
  $OLDBINDIR/createdb -h /tmp powa
  $OLDBINDIR/psql -h /tmp -d powa -c "CREATE EXTENSION powa CASCADE"
  $OLDBINDIR/pg_ctl -D $OLDDATADIR -m fast stop
  
  $NEWBINDIR/initdb -D $NEWDATADIR
  cp $OLDDATADIR/postgresql.auto.conf $NEWDATADIR/postgresql.auto.conf
  
  $NEWBINDIR/pg_upgrade --old-datadir $OLDDATADIR --new-datadir $NEWDATADIR \
    --old-bindir $OLDBINDIR
  
  $NEWBINDIR/pg_ctl -D $NEWDATADIR -l $NEWDATADIR/pgsql.log start
  $NEWBINDIR/psql -h /tmp -d powa -c "SELECT 1 FROM powa_snapshot_metas"
  # ERROR:  MultiXactId 1 has not been created yet -- apparent wraparound

(This needs PoWA to be installed; packages are available from the pgdg
repositories as postgresql-<pgversion>-powa on Debian or
powa_<pgversion> on RedHat, or it can be built from source:
https://github.com/powa-team/powa-archivist).

As far as I currently understand, this happens because the old cluster
and its background workers get started by pg_upgrade during the "checks"
step, and the workers' activity leaves the data to be migrated somewhat
inconsistent from the perspective of pg_upgrade. (The old cluster itself
remains sane.)

As a solution, it seems that, for reasons similar to those for which we
restrict socket access to prevent accidental connections (commit
f763b77193), we should also prevent background workers from starting at
this step.
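
To illustrate the idea outside of pg_upgrade, here is a minimal sketch
of a shell helper that prints a pg_ctl command line starting a cluster
with shared_preload_libraries emptied, in the same way the attached patch
does. The function name bgworker_free_start_cmd and the port number used
in the usage note are hypothetical and only serve the illustration; the
real command built by pg_upgrade carries more options.

```shell
# Hypothetical helper (not part of pg_upgrade): print a pg_ctl command
# that starts a cluster with no preloaded libraries, so that background
# workers registered by those libraries never start.
bgworker_free_start_cmd() {
    bindir=$1    # directory containing pg_ctl
    datadir=$2   # data directory of the cluster to start
    port=$3      # port the server should listen on

    # pg_ctl -o passes options through to the postgres executable;
    # -c name=value set there takes precedence over the configuration
    # files, postgresql.auto.conf included.
    printf '"%s/pg_ctl" -w -D "%s" -o "-p %s -c shared_preload_libraries=%s" start\n' \
        "$bindir" "$datadir" "$port" "''"
}
```

For instance, bgworker_free_start_cmd "$OLDBINDIR" "$OLDDATADIR" 50432
prints the command to run; it can then be inspected or passed to eval.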

Please find attached a patch implementing this.

Thanks for considering,
Denis

From 31b1f31cd3a822d23ccd5883120a013891ade0f3 Mon Sep 17 00:00:00 2001
From: Denis Laxalde <denis.laxa...@dalibo.com>
Date: Wed, 20 Jan 2021 17:25:58 +0100
Subject: [PATCH] Disable background workers during servers start in pg_upgrade

We disable shared_preload_libraries to prevent background workers from
initializing and starting during server start in pg_upgrade.

In essence, this is for a similar reason that we use a restricted socket
access from f763b77193b04eba03a1f4ce46df34dc0348419e because background
workers may produce undesired activities during the upgrade.

Author: Denis Laxalde <denis.laxa...@dalibo.com>
Co-authored-by: Jehan-Guillaume de Rorthais <j...@dalibo.com>
---
 src/bin/pg_upgrade/server.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_upgrade/server.c b/src/bin/pg_upgrade/server.c
index 31b1425202..fab95a2d24 100644
--- a/src/bin/pg_upgrade/server.c
+++ b/src/bin/pg_upgrade/server.c
@@ -240,11 +240,13 @@ start_postmaster(ClusterInfo *cluster, bool report_and_exit_on_error)
 	 * crash, the new cluster has to be recreated anyway.  fsync=off is a big
 	 * win on ext4.
 	 *
+	 * Turn off background workers by emptying shared_preload_libraries.
+	 *
 	 * Force vacuum_defer_cleanup_age to 0 on the new cluster, so that
 	 * vacuumdb --freeze actually freezes the tuples.
 	 */
 	snprintf(cmd, sizeof(cmd),
-			 "\"%s/pg_ctl\" -w -l \"%s\" -D \"%s\" -o \"-p %d%s%s %s%s\" start",
+			 "\"%s/pg_ctl\" -w -l \"%s\" -D \"%s\" -o \"-p %d%s%s %s%s -c shared_preload_libraries=''\" start",
 			 cluster->bindir, SERVER_LOG_FILE, cluster->pgconfig, cluster->port,
 			 (cluster->controldata.cat_ver >=
 			  BINARY_UPGRADE_SERVER_FLAG_CAT_VER) ? " -b" :
-- 
2.20.1
