Re: [HACKERS] foreign key locks, 2nd attempt

Simon Riggs Tue, 06 Mar 2012 12:28:55 -0800

On Tue, Mar 6, 2012 at 7:39 PM, Alvaro Herrera
<[email protected]> wrote:


> We provide four levels of tuple locking strength: SELECT FOR KEY UPDATE is
> super-exclusive locking (used to delete tuples and more generally to update
> tuples modifying the values of the columns that make up the key of the tuple);
> SELECT FOR UPDATE is a standards-compliant exclusive lock; SELECT FOR SHARE
> implements shared locks; and finally SELECT FOR KEY SHARE is a super-weak mode
> that does not conflict with exclusive mode, but conflicts with SELECT FOR KEY
> UPDATE.  This last mode implements a mode just strong enough to implement RI
> checks, i.e. it ensures that tuples do not go away from under a check, without
> blocking some other transaction that wants to update the tuple without
> changing its key.

So there are 4 lock types, but we only have room for 3 on the tuple
header, so we store the least common/deprecated of the 4 types as a
multixactid. Some rewording would help there.

Neat scheme!
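
To check my reading of the conflict rules, here is a minimal Python sketch.
The mode names come from the quoted text. The full table is my inference
from it (only the KEY SHARE row is spelled out explicitly), and conflicts()
is a hypothetical helper, not anything in the patch.

```python
# Tuple lock strengths as named in the quoted description.
KEY_SHARE = "KEY SHARE"
SHARE = "SHARE"
UPDATE = "UPDATE"
KEY_UPDATE = "KEY UPDATE"

# Assumed conflict table: each held mode maps to the set of requested
# modes it would block. Inferred from the prose, not taken from the patch.
CONFLICTS = {
    KEY_SHARE: {KEY_UPDATE},
    SHARE: {UPDATE, KEY_UPDATE},
    UPDATE: {SHARE, UPDATE, KEY_UPDATE},
    KEY_UPDATE: {KEY_SHARE, SHARE, UPDATE, KEY_UPDATE},
}


def conflicts(held, requested):
    """Return True if a lock in mode `requested` must wait for `held`."""
    return requested in CONFLICTS[held]
```

The point of the new weak mode shows up in one row: conflicts(KEY_SHARE,
UPDATE) is False, so an RI check does not block a non-key update, while
conflicts(KEY_SHARE, KEY_UPDATE) is True.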


My understanding is that all of these workloads will change:

* Users of explicit SHARE locks will be slightly worse off in the case
of the 1st locker, but after that they'll be the same as before.

* Updates against an RI locked table will be dramatically faster
because of reduced lock waits


...and that these previous workloads are effectively unchanged:

* Stream of RI checks causes mxacts

* Multi row deadlocks still possible

* Queues of writers still wait in the same way

* Deletes don't cause mxacts unless by same transaction



> In earlier PostgreSQL releases, a MultiXact always meant that the tuple was
> locked in shared mode by multiple transactions.  This is no longer the case; a
> MultiXact may contain an update or delete Xid.  (Keep in mind that tuple locks
> in a transaction do not conflict with other tuple locks in the same
> transaction, so it's possible to have otherwise conflicting locks in a
> MultiXact if they belong to the same transaction).

Somewhat confusing, but am getting there.

> Note that each lock is attributed to the subtransaction that acquires it.
> This means that a subtransaction that aborts is seen as though it releases the
> locks it acquired; concurrent transactions can then proceed without having to
> wait for the main transaction to finish.  It also means that a subtransaction
> can upgrade to a stronger lock level than an earlier transaction had, and if
> the subxact aborts, the earlier, weaker lock is kept.

OK
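
A toy model of how I read that rule, in Python. The class and method names
are hypothetical and the strength ordering is my assumption from the four
modes described earlier; the real bookkeeping lives in the tuple header and
MultiXact members, which this does not attempt to mirror.

```python
# Assumed ordering of lock strength, weakest to strongest.
STRENGTH = {"KEY SHARE": 0, "SHARE": 1, "UPDATE": 2, "KEY UPDATE": 3}


class TupleLocks:
    """Locks held on one tuple, each attributed to a subtransaction."""

    def __init__(self):
        self.held = []  # list of (subxact_id, mode) pairs

    def lock(self, subxact_id, mode):
        # Record the lock against the subtransaction that acquired it.
        self.held.append((subxact_id, mode))

    def abort_subxact(self, subxact_id):
        # An aborted subtransaction is seen as releasing its own locks only.
        self.held = [(s, m) for s, m in self.held if s != subxact_id]

    def strongest(self):
        # Effective lock level on the tuple, or None if it is unlocked.
        if not self.held:
            return None
        return max((m for _, m in self.held), key=STRENGTH.__getitem__)
```

So if subxact 1 takes KEY SHARE and subxact 2 upgrades to UPDATE, aborting
subxact 2 leaves the tuple at KEY SHARE again.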

> The possibility of having an update within a MultiXact means that they must
> persist across crashes and restarts: a future reader of the tuple needs to
> figure out whether the update committed or aborted.  So we have a requirement
> that pg_multixact needs to retain pages of its data until we're certain that
> the MultiXacts in them are no longer of interest.

I think the "no longer of interest" aspect needs to be tracked more
closely because it will necessarily lead to more I/O.

If we store the LSN on each mxact page, as I think we need to, we can
get rid of pages more quickly if we know they don't have an LSN set.
So it's possible we can optimise that further.

> VACUUM is in charge of removing old MultiXacts at the time of tuple freezing.

You mean mxact segments?

Surely we set hint bits on tuples same as now? Hope so.

> This works in the same way that pg_clog segments are removed: we have a
> pg_class column that stores the earliest multixact that could possibly be
> stored in the table; the minimum of all such values is stored in a pg_database
> column.  VACUUM computes the minimum across all pg_database values, and
> removes pg_multixact segments older than the minimum.
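
The minimum-of-minimums computation described there can be sketched as
follows. This is Python with made-up data shapes (plain dicts standing in
for the pg_class and pg_database columns), and it ignores multixact ID
wraparound entirely, so treat it as an illustration of the idea rather than
the patch's logic.

```python
def truncation_horizon(databases):
    """Oldest multixact that any table in any database might still hold.

    `databases` maps database name -> {table name: earliest multixact id}.
    Assumes every database has at least one table and ids never wrap.
    """
    # Per-database minimum, as would be stored in the pg_database column.
    per_db = {db: min(tables.values()) for db, tables in databases.items()}
    # Cluster-wide minimum across all databases.
    return min(per_db.values())


def removable_segments(segments, horizon):
    """Names of segments whose contents are all older than `horizon`.

    `segments` maps segment name -> highest multixact id stored in it.
    """
    return sorted(name for name, last in segments.items() if last < horizon)
```

For example, with tables at 500 and 120 in one database and 300 in another,
the horizon is 120, and only segments that end below 120 can go.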

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
