On 2012-12-13 18:29:00 +0100, Andres Freund wrote:
> On 2012-12-13 00:05:41 +0000, Peter Geoghegan wrote:
> > Initialisation means finding a free “logical slot” from shared memory,
> > then looping until the new magic xmin horizon for logical walsenders
> > (stored in their “slot”) is that of the weakest link (think local
> > global xmin).
> >
> > +    * FIXME: think about solving the race conditions in a nicer way.
> > +    */
> > + recompute_xmin:
> > +   walsnd->xmin = GetOldestXmin(true, true);
> > +   ComputeLogicalXmin();
> > +   if (walsnd->xmin != GetOldestXmin(true, true))
> > +           goto recompute_xmin;
> >
> > Apart from the race conditions that I'm not confident are addressed
> > here, I think that the above could easily get stuck indefinitely in
> > the event of contention.
>
> I don't like that part the slightest bit but I don't think its actually
> in danger of looping forever. In fact I think its so broken it won't
> ever loop ;). (ComputeLogicalXmin() will set the current global minimum
> which will then be returned by GetOldestXmin()).
>
> I would like to solve this properly without copying GetOldestXmin once
> more (so we can compute and set the logical xmin while holding
> ProcArrayLock), but I am not yet sure how a nice way to do that would
> look like.
>
> I guess GetOldestXminNoLock? That gets called while we already hold the
> procarray lock? Yuck.

Does anybody have an opinion on the attached patches? Especially 0001,
which contains the procarray changes?

It moves a computation of the sort of:

result -= vacuum_defer_cleanup_age;
if (!TransactionIdIsNormal(result))
   result = FirstNormalTransactionId;

inside ProcArrayLock. But I can't really imagine that to be relevant...


Another alternative to this would be to get a snapshot with
GetSnapshotData(), copy the xmin to the logical slot, then call
ProcArrayEndTransaction(). But that doesn't really seem to be nicer to
me.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
>From 660b6995c0260eb112d7b6f7158e3b4654c3d9bf Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Thu, 13 Dec 2012 20:47:57 +0100
Subject: [PATCH 1/2] Add GetOldestXminNoLock as a variant (and
 implementation) of GetOldestXmin

This is useful because it allows to compute the current OldestXmin while
already holding the procarray lock which enables setting the own xmin horizon
safely.
---
 src/backend/storage/ipc/procarray.c | 30 +++++++++++++++++++++---------
 src/include/storage/procarray.h     |  1 +
 2 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 985350e..a704c9c 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -1104,6 +1104,27 @@ TransactionIdIsActive(TransactionId xid)
 TransactionId
 GetOldestXmin(bool allDbs, bool ignoreVacuum)
 {
+	TransactionId res;
+
+	/* Cannot look for individual databases during recovery */
+	Assert(allDbs || !RecoveryInProgress());
+
+	LWLockAcquire(ProcArrayLock, LW_SHARED);
+	res = GetOldestXminNoLock(allDbs, ignoreVacuum);
+	LWLockRelease(ProcArrayLock);
+	return res;
+}
+
+/*
+ * GetOldestXminNoLock -- worker routine for GetOldestXmin and others
+ *
+ * Requires ProcArrayLock to be already locked!
+ *
+ * Check GetOldestXmin for the semantics of this.
+ */
+TransactionId
+GetOldestXminNoLock(bool allDbs, bool ignoreVacuum)
+{
 	ProcArrayStruct *arrayP = procArray;
 	TransactionId result;
 	int			index;
@@ -1111,8 +1132,6 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
 	/* Cannot look for individual databases during recovery */
 	Assert(allDbs || !RecoveryInProgress());
 
-	LWLockAcquire(ProcArrayLock, LW_SHARED);
-
 	/*
 	 * We initialize the MIN() calculation with latestCompletedXid + 1. This
 	 * is a lower bound for the XIDs that might appear in the ProcArray later,
@@ -1174,8 +1193,6 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
 		 */
 		TransactionId kaxmin = KnownAssignedXidsGetOldestXmin();
 
-		LWLockRelease(ProcArrayLock);
-
 		if (TransactionIdIsNormal(kaxmin) &&
 			TransactionIdPrecedes(kaxmin, result))
 			result = kaxmin;
@@ -1183,11 +1200,6 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
 	else
 	{
 		/*
-		 * No other information needed, so release the lock immediately.
-		 */
-		LWLockRelease(ProcArrayLock);
-
-		/*
 		 * Compute the cutoff XID by subtracting vacuum_defer_cleanup_age,
 		 * being careful not to generate a "permanent" XID.
 		 *
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index 9933dad..ce8d98b 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -50,6 +50,7 @@ extern RunningTransactions GetRunningTransactionData(void);
 extern bool TransactionIdIsInProgress(TransactionId xid);
 extern bool TransactionIdIsActive(TransactionId xid);
 extern TransactionId GetOldestXmin(bool allDbs, bool ignoreVacuum);
+extern TransactionId GetOldestXminNoLock(bool allDbs, bool ignoreVacuum);
 extern TransactionId GetOldestActiveTransactionId(void);
 
 extern VirtualTransactionId *GetVirtualXIDsDelayingChkpt(int *nvxids);
-- 
1.7.12.289.g0ce9864.dirty

>From 3f5bf1e2fad99081edfcb31601398af3b953cf15 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Thu, 13 Dec 2012 20:49:59 +0100
Subject: [PATCH 2/2] wal decoding: Use GetOldestXminNoLock to compute the
 initial logical xmin safely

---
 src/backend/replication/walsender.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index c1ec0a3..2204c7a 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -544,16 +544,12 @@ InitLogicalReplication(InitLogicalReplicationCmd *cmd)
 	 * cannot go backwards anymore, as ComputeLogicalXmin() nails the value
 	 * down.
 	 *
-	 * We need to do this *after* releasing the spinlock, otherwise
-	 * GetOldestXmin will deadlock with ourselves.
-	 *
-	 * FIXME: think about solving the race conditions in a nicer way.
+	 * FIXME: this should probably be in procarray.c?
 	 */
-recompute_xmin:
-	walsnd->xmin = GetOldestXmin(true, true);
+	LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+	walsnd->xmin = GetOldestXminNoLock(true, true);
+	LWLockRelease(ProcArrayLock);
 	ComputeLogicalXmin();
-	if (walsnd->xmin != GetOldestXmin(true, true))
-		goto recompute_xmin;
 
 	decoding_ctx = AllocSetContextCreate(TopMemoryContext,
 										 "decoding context",
@@ -1133,8 +1129,10 @@ ProcessStandbyReplyMessage(void)
 	}
 
 	/*
-	 * Do an unlocked check for candidate_xmin first.
+	 * Advance our local xmin horizin when the client confirmed a flush.
 	 */
+
+	/* Do an unlocked check for candidate_xmin first.*/
 	if (MyLogicalWalSnd &&
 		TransactionIdIsValid(MyLogicalWalSnd->candidate_xmin))
 	{
-- 
1.7.12.289.g0ce9864.dirty

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to