On 04/13/2014 11:39 PM, Heikki Linnakangas wrote:
However, I just noticed that there's a race condition between PREPARE
TRANSACTION and COMMIT/ROLLBACK PREPARED. PostPrepare_Locks runs after
the prepared transaction is already marked as fully prepared. That means
that by the time we get to PostPrepare_Locks, another backend might
already have finished and removed the prepared transaction. That leads
to a PANIC (put a breakpoint just before PostPrepare_Locks):
postgres=# commit prepared 'foo';
PANIC: failed to re-find shared proclock object
PANIC: failed to re-find shared proclock object
The connection to the server was lost. Attempting reset: Failed.
FinishPrepareTransaction reads the list of locks from the two-phase
state file, but PANICs when it doesn't find the corresponding locks in
the lock manager (because PostPrepare_Locks hasn't transfered them to
the dummy PGPROC yet).
I think we'll need to transfer of the locks earlier, before the
transaction is marked as fully prepared. I'll take a closer look at this
tomorrow.
Here's a patch to do that. It's very straightforward, I just moved the
calls to transfer locks earlier, before ProcArrayClearTransaction.
PostPrepare_MultiXact had a similar race - it also transfer state from
the old PGPROC entry to the new, and needs to be done before allowing
another backend to remove the new PGPROC entry. I changed the names of
the functions to distinguish them from the other PostPrepare_* functions
that now happen at a different time.
The patch is simple, but it's a bit scary to change the order of things
like this. Looking at all the calls that now happen after transferring
the locks, I believe this is OK. The change also applies to the
callbacks called by the RegisterXactCallback mechanism, which means that
in theory there might be a 3rd party extension out there that's
affected. All the callbacks in contrib and plpgsql are OK, and it's
questionable to do anything complicated that would depend on
heavy-weight locks to be held in those callbacks, so I think this is OK.
Warrants a note in the release notes, though.
- Heikki
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index d4ad678..b505c62 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -1580,11 +1580,12 @@ AtPrepare_MultiXact(void)
}
/*
- * PostPrepare_MultiXact
- * Clean up after successful PREPARE TRANSACTION
+ * TransferPrepare_MultiXact
+ * Called after successful PREPARE TRANSACTION, before releasing our
+ * PGPROC entry.
*/
void
-PostPrepare_MultiXact(TransactionId xid)
+TransferPrepare_MultiXact(TransactionId xid)
{
MultiXactId myOldestMember;
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index b20d973..c60edf1 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -2224,6 +2224,16 @@ PrepareTransaction(void)
XactLastRecEnd = 0;
/*
+ * Transfer any heavy-weight locks we're holding to the dummy ProcArray.
+ *
+ * NB: this has to be done before clearing our own ProcArray entry.
+ * This is different from normal commit, where locks are released after
+ * clearing the ProcArray entry!
+ */
+ TransferPrepare_MultiXact(xid); /* also transfer our multixact state */
+ TransferPrepare_Locks(xid);
+
+ /*
* Let others know about no transaction in progress by me. This has to be
* done *after* the prepared transaction has been marked valid, else
* someone may think it is unlocked and recyclable.
@@ -2234,6 +2244,10 @@ PrepareTransaction(void)
* This is all post-transaction cleanup. Note that if an error is raised
* here, it's too late to abort the transaction. This should be just
* noncritical resource releasing. See notes in CommitTransaction.
+ *
+ * NB: we already transfered the locks to the prepared ProcArray entry,
+ * so even the cleanup before ResourceOwnerRelease(RESOURCE_RELEASE_LOCKS)
+ * cannot rely on any heavy-weight locks being held!
*/
CallXactCallbacks(XACT_EVENT_PREPARE);
@@ -2256,11 +2270,12 @@ PrepareTransaction(void)
PostPrepare_smgr();
- PostPrepare_MultiXact(xid);
-
- PostPrepare_Locks(xid);
PostPrepare_PredicateLocks(xid);
+ /*
+ * we're not actually holding any locks anymore, but clean up any other
+ * resources that might need to be cleaned up at this stage.
+ */
ResourceOwnerRelease(TopTransactionResourceOwner,
RESOURCE_RELEASE_LOCKS,
true, true);
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 6825063..779f0cb 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -3069,8 +3069,8 @@ AtPrepare_Locks(void)
}
/*
- * PostPrepare_Locks
- * Clean up after successful PREPARE
+ * TransferPrepare_Locks
+ * Transfer locks to prepared transaction after successful PREPARE
*
* Here, we want to transfer ownership of our locks to a dummy PGPROC
* that's now associated with the prepared transaction, and we want to
@@ -3084,7 +3084,7 @@ AtPrepare_Locks(void)
* but that probably costs more cycles.
*/
void
-PostPrepare_Locks(TransactionId xid)
+TransferPrepare_Locks(TransactionId xid)
{
PGPROC *newproc = TwoPhaseGetDummyProc(xid);
HASH_SEQ_STATUS status;
diff --git a/src/include/access/multixact.h b/src/include/access/multixact.h
index 9729f27..cd40a35 100644
--- a/src/include/access/multixact.h
+++ b/src/include/access/multixact.h
@@ -96,7 +96,7 @@ extern bool MultiXactIdPrecedesOrEquals(MultiXactId multi1,
extern void AtEOXact_MultiXact(void);
extern void AtPrepare_MultiXact(void);
-extern void PostPrepare_MultiXact(TransactionId xid);
+extern void TransferPrepare_MultiXact(TransactionId xid);
extern Size MultiXactShmemSize(void);
extern void MultiXactShmemInit(void);
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index ceeab9f..2639faf 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -516,7 +516,7 @@ extern bool LockHasWaiters(const LOCKTAG *locktag,
extern VirtualTransactionId *GetLockConflicts(const LOCKTAG *locktag,
LOCKMODE lockmode);
extern void AtPrepare_Locks(void);
-extern void PostPrepare_Locks(TransactionId xid);
+extern void TransferPrepare_Locks(TransactionId xid);
extern int LockCheckConflicts(LockMethod lockMethodTable,
LOCKMODE lockmode,
LOCK *lock, PROCLOCK *proclock);
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers