On 27.03.2025 01:13, Michael Paquier wrote:
Tracking if the configuration has been properly loaded in
WalSndCtlData makes sense here, but I am confused by the patch: you
have defined SYNCSTANDBYSDEFINED but sync_standbys_defined never sets
it.  It may be simpler to use a separate boolean flag for this
purpose.  WalSndCtlData is a private structure holding the state of
the WAL senders internally, so ABI does not stress me much here
(flexible array at the end of the structure, as well..).

Thanks for your responses.
Yes, I agree that it looks more complicated then addition new bool field,
but there is place in SyncRepWaitForLSN method, where we check
WalSndCtlData->sync_standbys_defined out of acquire/release block.

void
SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
{
    int            mode;
...............................................
*    if (!SyncRepRequested() ||
        !((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined)
        return;*
..........................

If we just add new bool field, we can't atomically check sync_standbys_defined flag and new bool field, only in lock, so we lose fast exit optimization. But I agree
that in patch I lost SYNCSTANDBYSDEFINED setting, thanks for attentiveness,
so I've attached new patch with fix,also I've done bits setting more evident.

Anyway, if you still thinks that my arguments not enough, please notify, I will do patch with separate flag.


On 27.03.2025 01:13, Michael Paquier wrote:
Also mentioned by Kirill downthread: let's add a regression test with
an injection point that does a wait.  This race condition is worth
having a test for.  You could define the point of where you have added
your hardcoded sleep(), for example.
Sorry, but for me it seems that we can't use injection points toolset here, because we are talking about startup process, and we can't call injection_points_attach on client backend before checkpointer.

Anyway, if you know how to do it, please share details, I will do.


Best regards,

Maksim Melnikov
From c889fd8119642a4e9d3fb3e24fe5d6cd0e656ca5 Mon Sep 17 00:00:00 2001
From: Maksim Melnikov <m.melni...@postgrespro.ru>
Date: Wed, 26 Mar 2025 16:44:09 +0300
Subject: [PATCH] Fix sync_standbys_defined race on startup.

WalSndCtl->sync_standbys_defined, can be updated
by checkpointer process only and in backends it
can be read. So, on postgres startup, we have race
between checkpointer process and backends, that check
sync_standbys_defined for sync replications.

To avoid this, now we are working with sync_standbys_defined
not as bool field, but as bitmask, where 7th bit check
standbys_defined or not, 8th bit check
WalSndCtlData->sync_standbys_defined field has been
initialized by checkpointer. So if in synchronous replication functions
was detected that sync_standbys_defined is not initialized,
check SyncStandbysDefined() value.
---
 src/backend/replication/syncrep.c           | 45 +++++++++++++++++----
 src/include/replication/walsender_private.h | 10 ++++-
 2 files changed, 47 insertions(+), 8 deletions(-)

diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index d75e3968035..d2ba4d604ed 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -148,6 +148,7 @@ void
 SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
 {
 	int			mode;
+	bool		sync_standbys_defined;
 
 	/*
 	 * This should be called while holding interrupts during a transaction
@@ -170,7 +171,8 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
 	 * it's false, the lock is not necessary because we don't touch the queue.
 	 */
 	if (!SyncRepRequested() ||
-		!((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined)
+		(((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined & SYNCSTANDBYSDEFINEDINITIALIZED)
+		== SYNCSTANDBYSINITIALIZED)
 		return;
 
 	/* Cap the level for anything other than commit to remote flush only. */
@@ -184,6 +186,8 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
 
 	LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
 	Assert(MyProc->syncRepState == SYNC_REP_NOT_WAITING);
+	sync_standbys_defined = WalSndCtl->sync_standbys_defined & SYNCSTANDBYSINITIALIZED
+		? WalSndCtl->sync_standbys_defined & SYNCSTANDBYSDEFINED : SyncStandbysDefined();
 
 	/*
 	 * We don't wait for sync rep if WalSndCtl->sync_standbys_defined is not
@@ -193,7 +197,7 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
 	 * condition but we'll be fetching that cache line anyway so it's likely
 	 * to be a low cost check.
 	 */
-	if (!WalSndCtl->sync_standbys_defined ||
+	if (!sync_standbys_defined ||
 		lsn <= WalSndCtl->lsn[mode])
 	{
 		LWLockRelease(SyncRepLock);
@@ -920,18 +924,19 @@ SyncRepWakeQueue(bool all, int mode)
 void
 SyncRepUpdateSyncStandbysDefined(void)
 {
-	bool		sync_standbys_defined = SyncStandbysDefined();
+	bool		new_sync_standbys_defined = SyncStandbysDefined();
+	bool		old_sync_standbys_defined = (WalSndCtl->sync_standbys_defined & SYNCSTANDBYSDEFINED);
 
-	if (sync_standbys_defined != WalSndCtl->sync_standbys_defined)
+	if (new_sync_standbys_defined != old_sync_standbys_defined)
 	{
+		uint8 nv;
 		LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
-
 		/*
 		 * If synchronous_standby_names has been reset to empty, it's futile
 		 * for backends to continue waiting.  Since the user no longer wants
 		 * synchronous replication, we'd better wake them up.
 		 */
-		if (!sync_standbys_defined)
+		if (!new_sync_standbys_defined)
 		{
 			int			i;
 
@@ -946,8 +951,34 @@ SyncRepUpdateSyncStandbysDefined(void)
 		 * backend that hasn't yet reloaded its config might go to sleep on
 		 * the queue (and never wake up).  This prevents that.
 		 */
-		WalSndCtl->sync_standbys_defined = sync_standbys_defined;
 
+		/* set initialized bit.*/
+		nv = WalSndCtl->sync_standbys_defined | SYNCSTANDBYSINITIALIZED;
+		if (new_sync_standbys_defined)
+		{
+  			/* set defined bit.*/
+			nv |= SYNCSTANDBYSDEFINED;
+		}
+		else
+		{
+			/* reset defined bit.*/
+			nv &= (~SYNCSTANDBYSDEFINED);
+		}
+
+		WalSndCtl->sync_standbys_defined = nv;
+
+		LWLockRelease(SyncRepLock);
+	}
+	else if (!(WalSndCtl->sync_standbys_defined & SYNCSTANDBYSINITIALIZED))
+	{
+		/*
+		 * Even if new_sync_standbys_defined == old_sync_standbys_defined, we
+		 * should check WalSndCtl->sync_standbys_defined is initialized and
+		 * set its initialize bit, if it reset. Anyway this bit should be set
+		 * only on start and then it will be always setted.
+		 */
+		LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
+		WalSndCtl->sync_standbys_defined |= SYNCSTANDBYSINITIALIZED;
 		LWLockRelease(SyncRepLock);
 	}
 }
diff --git a/src/include/replication/walsender_private.h b/src/include/replication/walsender_private.h
index 0fc77f1b4af..cc6a98b7e31 100644
--- a/src/include/replication/walsender_private.h
+++ b/src/include/replication/walsender_private.h
@@ -100,7 +100,7 @@ typedef struct
 	 * config file safely, so checkpointer updates this value as needed.
 	 * Protected by SyncRepLock.
 	 */
-	bool		sync_standbys_defined;
+	uint8		sync_standbys_defined;
 
 	/* used as a registry of physical / logical walsenders to wake */
 	ConditionVariable wal_flush_cv;
@@ -116,6 +116,14 @@ typedef struct
 	WalSnd		walsnds[FLEXIBLE_ARRAY_MEMBER];
 } WalSndCtlData;
 
+/*
+ * WalSndCtlData->sync_standbys_defined bit flags. 7th bit check standbys_defined or not,
+ * 8th bit check WalSndCtlData->sync_standbys_defined field has been initialized.
+ */
+#define SYNCSTANDBYSINITIALIZED	(1 << 0)
+#define SYNCSTANDBYSDEFINED		(1 << 1)
+#define SYNCSTANDBYSDEFINEDINITIALIZED	(SYNCSTANDBYSINITIALIZED & SYNCSTANDBYSDEFINED)
+
 extern PGDLLIMPORT WalSndCtlData *WalSndCtl;
 
 
-- 
2.43.0

Reply via email to