Re: [HACKERS] Problem with txid_snapshot_in/out() functionality

Heikki Linnakangas Mon, 14 Apr 2014 02:16:30 -0700

On 04/12/2014 05:03 PM, Andres Freund wrote:

On 2014-04-12 09:47:24 -0400, Tom Lane wrote:

Heikki Linnakangas <hlinnakan...@vmware.com> writes:

On 04/12/2014 12:07 AM, Jan Wieck wrote:

the Slony team has been getting seldom reports of a problem with the
txid_snapshot data type.
The symptom is that a txid_snapshot on output lists the same txid
multiple times in the xip list part of the external representation.

It's two-phase commit. When preparing a transaction, the state of the
transaction is first transfered to a dummy PGXACT entry, and then the
PGXACT entry of the backend is cleared. There is a transient state when
both PGXACT entries have the same xid.


Hm, yeah, but why is that intermediate state visible to anyone else?
Don't we have exclusive lock on the PGPROC array while we're doing this?


It's done outside the remit of ProcArray lock :(. And documented to lead
to duplicate xids in PGXACT.
EndPrepare():
        /*
         * Mark the prepared transaction as valid.      As soon as xact.c marks
         * MyPgXact as not running our XID (which it will do immediately after
         * this function returns), others can commit/rollback the xact.
         *
         * NB: a side effect of this is to make a dummy ProcArray entry for the
         * prepared XID.  This must happen before we clear the XID from 
MyPgXact,
         * else there is a window where the XID is not running according to
         * TransactionIdIsInProgress, and onlookers would be entitled to assume
         * the xact crashed.  Instead we have a window where the same XID 
appears
         * twice in ProcArray, which is OK.
         */
        MarkAsPrepared(gxact);

It doesn't sound too hard to essentially move PrepareTransaction()'s
ProcArrayClearTransaction() into MarkAsPrepared() and rejigger the
locking to remove the intermediate state. But I think it'll lead to
bigger changes than we'd be comfortable backpatching.

Hmm. There's a field in GlobalTransactionData called locking_xid, whichis used to mark the XID of the transaction that's currently operating onthe prepared transaction. At prepare, that ensures that the transactioncannot be committed or rolled back by another backend until the originalbackend has cleared its PGPROC entry. At COMMIT/ROLLBACK PREPARED, itensures that only one backend can commit/rollback the transaction.

I wonder why we don't use a VirtualTransactionId there. AFAICS there isno reason for COMMIT/ROLLBACK PREPARED to be assigned an XID of its own.And if we used a VirtualTransactionId there, prepare could clear the xidfield of the PGPROC entry earlier.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Problem with txid_snapshot_in/out() functionality

Reply via email to