On 04/13/2014 11:39 PM, Heikki Linnakangas wrote:
However, I just noticed that there's a race condition between PREPARE
TRANSACTION and COMMIT/ROLLBACK PREPARED. PostPrepare_Locks runs after
the prepared transaction is already marked as fully prepared. That means
that by the time we get to PostPrepare_Locks, another backend might
already have finished and removed the prepared transaction. That leads
to a PANIC (put a breakpoint just before PostPrepare_Locks):

postgres=# commit prepared 'foo';
PANIC:  failed to re-find shared proclock object
PANIC:  failed to re-find shared proclock object
The connection to the server was lost. Attempting reset: Failed.

FinishPrepareTransaction reads the list of locks from the two-phase
state file, but PANICs when it doesn't find the corresponding locks in
the lock manager (because PostPrepare_Locks hasn't transfered them to
the dummy PGPROC yet).

I think we'll need to transfer of the locks earlier, before the
transaction is marked as fully prepared. I'll take a closer look at this
tomorrow.

Here's a patch to do that. It's very straightforward, I just moved the calls to transfer locks earlier, before ProcArrayClearTransaction. PostPrepare_MultiXact had a similar race - it also transfer state from the old PGPROC entry to the new, and needs to be done before allowing another backend to remove the new PGPROC entry. I changed the names of the functions to distinguish them from the other PostPrepare_* functions that now happen at a different time.

The patch is simple, but it's a bit scary to change the order of things like this. Looking at all the calls that now happen after transferring the locks, I believe this is OK. The change also applies to the callbacks called by the RegisterXactCallback mechanism, which means that in theory there might be a 3rd party extension out there that's affected. All the callbacks in contrib and plpgsql are OK, and it's questionable to do anything complicated that would depend on heavy-weight locks to be held in those callbacks, so I think this is OK. Warrants a note in the release notes, though.

- Heikki

diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index d4ad678..b505c62 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -1580,11 +1580,12 @@ AtPrepare_MultiXact(void)
 }
 
 /*
- * PostPrepare_MultiXact
- *		Clean up after successful PREPARE TRANSACTION
+ * TransferPrepare_MultiXact
+ *		Called after successful PREPARE TRANSACTION, before releasing our
+ *		PGPROC entry.
  */
 void
-PostPrepare_MultiXact(TransactionId xid)
+TransferPrepare_MultiXact(TransactionId xid)
 {
 	MultiXactId myOldestMember;
 
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index b20d973..c60edf1 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -2224,6 +2224,16 @@ PrepareTransaction(void)
 	XactLastRecEnd = 0;
 
 	/*
+	 * Transfer any heavy-weight locks we're holding to the dummy ProcArray.
+	 *
+	 * NB: this has to be done before clearing our own ProcArray entry.
+	 * This is different from normal commit, where locks are released after
+	 * clearing the ProcArray entry!
+	 */
+	TransferPrepare_MultiXact(xid);	  /* also transfer our multixact state */
+	TransferPrepare_Locks(xid);
+
+	/*
 	 * Let others know about no transaction in progress by me.	This has to be
 	 * done *after* the prepared transaction has been marked valid, else
 	 * someone may think it is unlocked and recyclable.
@@ -2234,6 +2244,10 @@ PrepareTransaction(void)
 	 * This is all post-transaction cleanup.  Note that if an error is raised
 	 * here, it's too late to abort the transaction.  This should be just
 	 * noncritical resource releasing.	See notes in CommitTransaction.
+	 *
+	 * NB: we already transfered the locks to the prepared ProcArray entry,
+	 * so even the cleanup before ResourceOwnerRelease(RESOURCE_RELEASE_LOCKS)
+	 * cannot rely on any heavy-weight locks being held!
 	 */
 
 	CallXactCallbacks(XACT_EVENT_PREPARE);
@@ -2256,11 +2270,12 @@ PrepareTransaction(void)
 
 	PostPrepare_smgr();
 
-	PostPrepare_MultiXact(xid);
-
-	PostPrepare_Locks(xid);
 	PostPrepare_PredicateLocks(xid);
 
+	/*
+	 * we're not actually holding any locks anymore, but clean up any other
+	 * resources that might need to be cleaned up at this stage.
+	 */
 	ResourceOwnerRelease(TopTransactionResourceOwner,
 						 RESOURCE_RELEASE_LOCKS,
 						 true, true);
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 6825063..779f0cb 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -3069,8 +3069,8 @@ AtPrepare_Locks(void)
 }
 
 /*
- * PostPrepare_Locks
- *		Clean up after successful PREPARE
+ * TransferPrepare_Locks
+ *		Transfer locks to prepared transaction after successful PREPARE
  *
  * Here, we want to transfer ownership of our locks to a dummy PGPROC
  * that's now associated with the prepared transaction, and we want to
@@ -3084,7 +3084,7 @@ AtPrepare_Locks(void)
  * but that probably costs more cycles.
  */
 void
-PostPrepare_Locks(TransactionId xid)
+TransferPrepare_Locks(TransactionId xid)
 {
 	PGPROC	   *newproc = TwoPhaseGetDummyProc(xid);
 	HASH_SEQ_STATUS status;
diff --git a/src/include/access/multixact.h b/src/include/access/multixact.h
index 9729f27..cd40a35 100644
--- a/src/include/access/multixact.h
+++ b/src/include/access/multixact.h
@@ -96,7 +96,7 @@ extern bool MultiXactIdPrecedesOrEquals(MultiXactId multi1,
 
 extern void AtEOXact_MultiXact(void);
 extern void AtPrepare_MultiXact(void);
-extern void PostPrepare_MultiXact(TransactionId xid);
+extern void TransferPrepare_MultiXact(TransactionId xid);
 
 extern Size MultiXactShmemSize(void);
 extern void MultiXactShmemInit(void);
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index ceeab9f..2639faf 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -516,7 +516,7 @@ extern bool LockHasWaiters(const LOCKTAG *locktag,
 extern VirtualTransactionId *GetLockConflicts(const LOCKTAG *locktag,
 				 LOCKMODE lockmode);
 extern void AtPrepare_Locks(void);
-extern void PostPrepare_Locks(TransactionId xid);
+extern void TransferPrepare_Locks(TransactionId xid);
 extern int LockCheckConflicts(LockMethod lockMethodTable,
 				   LOCKMODE lockmode,
 				   LOCK *lock, PROCLOCK *proclock);
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to