oonish...@nttdata.co.jp wrote:

> The below error messages were shown in standby server log:
> 
> FATAL:  could not access status of transaction 9009
> DETAIL:  Could not read from file "pg_commit_ts/0000" at offset 90112: 
> Success.
> CONTEXT:  xlog redo Transaction/COMMIT: 2015-09-30 15:52:41.924141+09
> LOG:  startup process (PID 23199) exited with exit code 1
> LOG:  terminating any other active server processes
> 
> Before this FATAL, there were some INFO but annoying messages:
> 
> LOG:  file "pg_commit_ts/0000" doesn't exist, reading as zeroes

Here's a patch.

I went over the commit_ts.c code a few more times.  I eventually
realized that we were trying to update the value of the GUC, which is a
rather unreliable thing to do; this was made worse by the fact that we
were updating it in one process only.

I thought it was better to have a separate boolean flag, affecting the
recovery process only, which is set at startup time or when the
XLOG_PARAMETER_CHANGE message is received.  The module is enabled if
either the GUC is set or we see that the master has the module enabled.
This only enables it as far as replaying xlog records though: if you use
the SQL interface, it will still raise an error that you cannot read
values unless the GUC is enabled.  This seems fine to me.

A curious but benign effect of this patch is that if you have the module
disabled in the master but enabled in the standby, you can actually
query the commit times in the standby, and they will correspond to
whatever the master used in the commit xlog record.

Other small changes:

- Moved some code out of xlog_redo into a new public commit_ts.c
routine; made ActivateCommitTs and Deactivate* statics.

- In the previous commit I added an assert that we're not writing xlog
and replaying xlog at the same time.  This is pointless because xlog.c
already complains about that, so this commit takes it out again.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 78090c5..9613783 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -93,6 +93,14 @@ CommitTimestampShared *commitTsShared;
 /* GUC variable */
 bool		track_commit_timestamp;
 
+/*
+ * When this is set, commit_ts is force-enabled during recovery.  This is so
+ * that a standby can replay WAL records coming from a master with the setting
+ * enabled.  (Note that this doesn't enable SQL access to the data; it's
+ * effectively write-only until the GUC itself is enabled.)
+ */
+static bool		enable_during_recovery;
+
 static void SetXidCommitTsInPage(TransactionId xid, int nsubxids,
 					 TransactionId *subxids, TimestampTz ts,
 					 RepOriginId nodeid, int pageno);
@@ -100,6 +108,8 @@ static void TransactionIdSetCommitTs(TransactionId xid, TimestampTz ts,
 						 RepOriginId nodeid, int slotno);
 static int	ZeroCommitTsPage(int pageno, bool writeXlog);
 static bool CommitTsPagePrecedes(int page1, int page2);
+static void ActivateCommitTs(void);
+static void DeactivateCommitTs(bool do_wal);
 static void WriteZeroPageXlogRec(int pageno);
 static void WriteTruncateXlogRec(int pageno);
 static void WriteSetTimestampXlogRec(TransactionId mainxid, int nsubxids,
@@ -122,10 +132,6 @@ static void WriteSetTimestampXlogRec(TransactionId mainxid, int nsubxids,
  * subtrans implementation changes in the future, we might want to revisit the
  * decision of storing timestamp info for each subxid.
  *
- * The replaying_xlog parameter indicates whether the module should execute
- * its write even if the feature is nominally disabled, because we're replaying
- * a record generated from a master where the feature is enabled.
- *
  * The write_xlog parameter tells us whether to include an XLog record of this
  * or not.  Normally, this is called from transaction commit routines (both
  * normal and prepared) and the information will be stored in the transaction
@@ -136,18 +142,17 @@ static void WriteSetTimestampXlogRec(TransactionId mainxid, int nsubxids,
 void
 TransactionTreeSetCommitTsData(TransactionId xid, int nsubxids,
 							   TransactionId *subxids, TimestampTz timestamp,
-							   RepOriginId nodeid,
-							   bool replaying_xlog, bool write_xlog)
+							   RepOriginId nodeid, bool write_xlog)
 {
 	int			i;
 	TransactionId headxid;
 	TransactionId newestXact;
 
-	/* We'd better not try to write xlog during replay */
-	Assert(!(write_xlog && replaying_xlog));
-
-	/* No-op if feature not enabled, unless replaying WAL */
-	if (!track_commit_timestamp && !replaying_xlog)
+	/*
+	 * No-op if the module is not enabled, but allow writes in a standby
+	 * during recovery.
+	 */
+	if (!track_commit_timestamp && !enable_during_recovery)
 		return;
 
 	/*
@@ -534,32 +539,26 @@ ZeroCommitTsPage(int pageno, bool writeXlog)
 /*
  * This must be called ONCE during postmaster or standalone-backend startup,
  * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ *
+ * Caller may choose to enable the feature even when it is turned off in the
+ * configuration.
  */
 void
-StartupCommitTs(void)
+StartupCommitTs(bool force_enable)
 {
-	TransactionId xid = ShmemVariableCache->nextXid;
-	int			pageno = TransactionIdToCTsPage(xid);
-
-	if (track_commit_timestamp)
-	{
-		ActivateCommitTs();
-		return;
-	}
-
-	LWLockAcquire(CommitTsControlLock, LW_EXCLUSIVE);
-
 	/*
-	 * Initialize our idea of the latest page number.
+	 * If the module is not enabled, there's nothing to do here.  The module
+	 * could still be activated from elsewhere.
 	 */
-	CommitTsCtl->shared->latest_page_number = pageno;
+	if (!track_commit_timestamp && !force_enable)
+		return;
 
-	LWLockRelease(CommitTsControlLock);
+	ActivateCommitTs();
 }
 
 /*
  * This must be called ONCE during postmaster or standalone-backend startup,
- * when commit timestamp is enabled, after recovery has finished.
+ * after recovery has finished.
  */
 void
 CompleteCommitTsInitialization(void)
@@ -569,6 +568,31 @@ CompleteCommitTsInitialization(void)
 }
 
 /*
+ * Activate or deactivate CommitTs' upon reception of a XLOG_PARAMETER_CHANGE
+ * XLog record in a standby.
+ */
+void
+CommitTsParameterChange(bool newvalue, bool oldvalue)
+{
+	/*
+	 * If the commit_ts module is disabled in this server and we get word from
+	 * the master server that it is enabled there, activate it so that we can
+	 * replay future WAL records involving it; also mark it as active on
+	 * pg_control.  If the old value was already set, we already did this, so
+	 * don't do anything.
+	 *
+	 * If the module is disabled in the master, disable it here too.
+	 */
+	if (newvalue)
+	{
+		if (!track_commit_timestamp && !oldvalue)
+			ActivateCommitTs();
+	}
+	else if (oldvalue)
+		DeactivateCommitTs(false);
+}
+
+/*
  * Activate this module whenever necessary.
  *		This must happen during postmaster or standalong-backend startup,
  *		or during WAL replay anytime the track_commit_timestamp setting is
@@ -584,12 +608,13 @@ CompleteCommitTsInitialization(void)
  * running with this module disabled for a while and thus might have skipped
  * the normal creation point.
  */
-void
+static void
 ActivateCommitTs(void)
 {
 	TransactionId xid = ShmemVariableCache->nextXid;
 	int			pageno = TransactionIdToCTsPage(xid);
 
+
 	/*
 	 * Re-Initialize our idea of the latest page number.
 	 */
@@ -629,6 +654,9 @@ ActivateCommitTs(void)
 		Assert(!CommitTsCtl->shared->page_dirty[slotno]);
 		LWLockRelease(CommitTsControlLock);
 	}
+
+	/* We can now replay xlog records from this module */
+	enable_during_recovery = true;
 }
 
 /*
@@ -641,7 +669,7 @@ ActivateCommitTs(void)
  * Resets CommitTs into invalid state to make sure we don't hand back
  * possibly-invalid data; also removes segments of old data.
  */
-void
+static void
 DeactivateCommitTs(bool do_wal)
 {
 	TransactionId xid = ShmemVariableCache->nextXid;
@@ -660,6 +688,9 @@ DeactivateCommitTs(bool do_wal)
 	LWLockRelease(CommitTsLock);
 
 	TruncateCommitTs(ReadNewTransactionId(), do_wal);
+
+	/* No longer enabled on recovery */
+	enable_during_recovery = false;
 }
 
 /*
@@ -699,7 +730,7 @@ ExtendCommitTs(TransactionId newestXact)
 	int			pageno;
 
 	/* nothing to do if module not enabled */
-	if (!track_commit_timestamp)
+	if (!track_commit_timestamp || !enable_during_recovery)
 		return;
 
 	/*
@@ -916,8 +947,7 @@ commit_ts_redo(XLogReaderState *record)
 			subxids = NULL;
 
 		TransactionTreeSetCommitTsData(setts->mainxid, nsubxids, subxids,
-									   setts->timestamp, setts->nodeid, false,
-									   true);
+									   setts->timestamp, setts->nodeid, true);
 		if (subxids)
 			pfree(subxids);
 	}
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index e005cc5..8c47e0f 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -2131,7 +2131,7 @@ RecordTransactionCommitPrepared(TransactionId xid,
 
 	TransactionTreeSetCommitTsData(xid, nchildren, children,
 								   replorigin_session_origin_timestamp,
-								   replorigin_session_origin, false, false);
+								   replorigin_session_origin, false);
 
 	/*
 	 * We don't currently try to sleep before flush here ... nor is there any
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 8f56a44..e8aafba 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -1237,8 +1237,7 @@ RecordTransactionCommit(void)
 
 		TransactionTreeSetCommitTsData(xid, nchildren, children,
 									   replorigin_session_origin_timestamp,
-									   replorigin_session_origin,
-									   false, false);
+									   replorigin_session_origin, false);
 	}
 
 	/*
@@ -5333,8 +5332,7 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
 
 	/* Set the transaction commit timestamp and metadata */
 	TransactionTreeSetCommitTsData(xid, parsed->nsubxacts, parsed->subxacts,
-								   commit_time, origin_id,
-								   true, false);
+								   commit_time, origin_id, false);
 
 	if (standbyState == STANDBY_DISABLED)
 	{
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 0266d61..08d1682 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6567,7 +6567,7 @@ StartupXLOG(void)
 			 * maintained during recovery and need not be started yet.
 			 */
 			StartupCLOG();
-			StartupCommitTs();
+			StartupCommitTs(ControlFile->track_commit_timestamp);
 			StartupSUBTRANS(oldestActiveXID);
 
 			/*
@@ -7336,7 +7336,7 @@ StartupXLOG(void)
 	if (standbyState == STANDBY_DISABLED)
 	{
 		StartupCLOG();
-		StartupCommitTs();
+		StartupCommitTs(false);
 		StartupSUBTRANS(oldestActiveXID);
 	}
 
@@ -9456,25 +9456,9 @@ xlog_redo(XLogReaderState *record)
 			ControlFile->minRecoveryPointTLI = ThisTimeLineID;
 		}
 
-		/*
-		 * Update the commit timestamp tracking. If there was a change it
-		 * needs to be activated or deactivated accordingly.
-		 */
-		if (track_commit_timestamp != xlrec.track_commit_timestamp)
-		{
-			track_commit_timestamp = xlrec.track_commit_timestamp;
-			ControlFile->track_commit_timestamp = track_commit_timestamp;
-			if (track_commit_timestamp)
-				ActivateCommitTs();
-			else
-
-				/*
-				 * We can't create a new WAL record here, but that's OK as
-				 * master did the WAL logging already and we will replay the
-				 * record from master in case we crash.
-				 */
-				DeactivateCommitTs(false);
-		}
+		CommitTsParameterChange(xlrec.track_commit_timestamp,
+								ControlFile->track_commit_timestamp);
+		ControlFile->track_commit_timestamp = xlrec.track_commit_timestamp;
 
 		UpdateControlFile();
 		LWLockRelease(ControlFileLock);
diff --git a/src/include/access/commit_ts.h b/src/include/access/commit_ts.h
index dc865d1..1b95b58 100644
--- a/src/include/access/commit_ts.h
+++ b/src/include/access/commit_ts.h
@@ -24,8 +24,7 @@ extern bool check_track_commit_timestamp(bool *newval, void **extra,
 
 extern void TransactionTreeSetCommitTsData(TransactionId xid, int nsubxids,
 							   TransactionId *subxids, TimestampTz timestamp,
-							   RepOriginId nodeid,
-							   bool replaying_xlog, bool write_xlog);
+							   RepOriginId nodeid, bool write_xlog);
 extern bool TransactionIdGetCommitTsData(TransactionId xid,
 							 TimestampTz *ts, RepOriginId *nodeid);
 extern TransactionId GetLatestCommitTsData(TimestampTz *ts,
@@ -35,9 +34,8 @@ extern Size CommitTsShmemBuffers(void);
 extern Size CommitTsShmemSize(void);
 extern void CommitTsShmemInit(void);
 extern void BootStrapCommitTs(void);
-extern void StartupCommitTs(void);
-extern void ActivateCommitTs(void);
-extern void DeactivateCommitTs(bool do_wal);
+extern void StartupCommitTs(bool force_enable);
+extern void CommitTsParameterChange(bool xlrecvalue, bool pgcontrolvalue);
 extern void CompleteCommitTsInitialization(void);
 extern void ShutdownCommitTs(void);
 extern void CheckPointCommitTs(void);
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to