On Fri, Jun 15, 2018 at 03:01:37PM -0700, Vimalraj A wrote:
> Hi,
> 
> After pg_upgrade, the data in Postgres is corrupted. 
> 
> 1. Postgres 9.4, stop db with "immediate" mode
> 2. Run pg_upgrade, to upgrade to Postgres 10.4
> 3. Start Postgres 10 and run vacuum full, got a bunch of "concurrent insert in
> progress". Looks like the data is corrupted. (Loading the old data back on
> Postgres 9.4 works just fine)
> 
> But if I stop the 9.4 DB with "smart" mode, the data after pg_upgrade looks
> fine. 
> 
> pg_upgrade doesn't stop or throw warnings if the upgraded db gets into
> corrupted state. 
> 
> 
> I would like to understand why it happens so. 
> 1. What transient state corrupts the db?
> 2. Is it a known issue with pg_upgrade? 
> 3. Is there a way to get the data from pg_upgrade after "immediate" mode stop
> of previous version?

Well, that's interesting.  We document to shut down the old and new
sever with pg_ctl stop, but don't specify to avoid immediate.  

The reason you are having problems is that pg_upgrade does not copy the
WAL from the old cluster to the new one, so there is no way to replay
the needed WAL during startup of the new server, which leads to
corruption.  Did you find this out in testing or in actual use?

What is also interesting is how pg_upgrade tries to avoid problems with
_crash_ shutdowns --- if it sees a postmaster lock file, it tries to
start the server, and if that works, it then stops it, causing the WAL
to be replayed and cleanly shutdown.  What it _doesn't_ handle is pg_ctl
-m immediate, which removes the lock file, but does leave WAL in need of
replay.  Oops!

Ideally we could detect this before we check pg_controldata and then do
the start/stop trick to fix the WAL, but the ordering of the code makes
that hard.  Instead, I have developed the attached patch which does a
check for the cluster state at the same time we are checking
pg_controldata, and reports an error if there is not a clean shutdown. 
Based on how rare this is, this is probably the cleanest solution, and I
think can be backpatched.

Patch attached.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
new file mode 100644
index 0fe98a5..bba3b1b
*** a/src/bin/pg_upgrade/controldata.c
--- b/src/bin/pg_upgrade/controldata.c
*************** get_control_data(ClusterInfo *cluster, b
*** 58,63 ****
--- 58,64 ----
  	bool		got_large_object = false;
  	bool		got_date_is_int = false;
  	bool		got_data_checksum_version = false;
+ 	bool		got_cluster_state = false;
  	char	   *lc_collate = NULL;
  	char	   *lc_ctype = NULL;
  	char	   *lc_monetary = NULL;
*************** get_control_data(ClusterInfo *cluster, b
*** 423,428 ****
--- 424,487 ----
  	pclose(output);
  
  	/*
+ 	 * Check for clean shutdown
+ 	 */
+ 
+ 	/* only pg_controldata outputs the cluster state */
+ 	snprintf(cmd, sizeof(cmd), "\"%s/pg_controldata\" \"%s\"",
+ 			 cluster->bindir, cluster->pgdata);
+ 	fflush(stdout);
+ 	fflush(stderr);
+ 
+ 	if ((output = popen(cmd, "r")) == NULL)
+ 		pg_fatal("could not get control data using %s: %s\n",
+ 				 cmd, strerror(errno));
+ 
+ 	/* we have the result of cmd in "output". so parse it line by line now */
+ 	while (fgets(bufin, sizeof(bufin), output))
+ 	{
+ 		if ((!live_check || cluster == &new_cluster) &&
+ 			(p = strstr(bufin, "Database cluster state:")) != NULL)
+ 		{
+ 			p = strchr(p, ':');
+ 
+ 			if (p == NULL || strlen(p) <= 1)
+ 				pg_fatal("%d: database cluster state problem\n", __LINE__);
+ 
+ 			p++;				/* remove ':' char */
+ 
+ 			/*
+ 			 * We checked earlier for a postmaster lock file, and if we found
+ 			 * one, we tried to start/stop the server to replay the WAL.  However,
+ 			 * pg_ctl -m immediate doesn't leave a lock file, but does require
+ 			 * WAL replay, so we check here that the server was shut down cleanly,
+ 			 * from the controldata perspective.
+ 			 */
+ 			/* remove leading spaces */
+ 			while (*p == ' ')
+ 				p++;
+ 			if (strcmp(p, "shut down\n") != 0)
+ 			{
+ 				if (cluster == &old_cluster)
+ 					pg_fatal("The source cluster was not shut down cleanly.\n");
+ 				else
+ 					pg_fatal("The target cluster was not shut down cleanly.\n");
+ 			}
+ 			got_cluster_state = true;
+ 		}
+ 	}
+ 
+ 	pclose(output);
+ 
+ 	if (!got_cluster_state)
+ 	{
+ 		if (cluster == &old_cluster)
+ 			pg_fatal("The source cluster lacks cluster state information:\n");
+ 		else
+ 			pg_fatal("The target cluster lacks cluster state information:\n");
+ 	}
+ 
+ 	/*
  	 * Restore environment variables
  	 */
  	pg_putenv("LC_COLLATE", lc_collate);

Reply via email to