On 2013-12-03 00:47:07 -0500, Noah Misch wrote:
> On Sat, Nov 30, 2013 at 01:06:09AM +0000, Alvaro Herrera wrote:
> > Fix a couple of bugs in MultiXactId freezing
> > 
> > Both heap_freeze_tuple() and heap_tuple_needs_freeze() neglected to look
> > into a multixact to check the members against cutoff_xid.
> 
> > !                   /*
> > !                    * This is a multixact which is not marked LOCK_ONLY, but
> > !                    * which is newer than the cutoff_multi.  If the update_xid
> > !                    * is below the cutoff_xid point, then we can just freeze
> > !                    * the Xmax in the tuple, removing it altogether.  This
> > !                    * seems simple, but there are several underlying
> > !                    * assumptions:
> > !                    *
> > !                    * 1. A tuple marked by a multixact containing a very old
> > !                    * committed update Xid would have been pruned away by
> > !                    * vacuum; we wouldn't be freezing this tuple at all.
> > !                    *
> > !                    * 2. There cannot possibly be any live locking members
> > !                    * remaining in the multixact.  This is because if they were
> > !                    * alive, the update's Xid would have been considered, via
> > !                    * the lockers' snapshot's Xmin, as part of the cutoff_xid.
> 
> READ COMMITTED transactions can reset MyPgXact->xmin between commands,
> defeating that assumption; see SnapshotResetXmin().  I have attached an
> isolationtester spec demonstrating the problem.

Any idea how to cheat our way out of that one given the current way
heap_freeze_tuple() works (running on both primary and standby)? My only
idea was to MultiXactIdWait() if !InRecovery but that's extremely grotty.
We can't even realistically create a new multixact with fewer members
with the current format of xl_heap_freeze.

> The test spec additionally
> covers a (probably-related) assertion failure, new in 9.3.2.

Too bad it's too late to do anything about it for 9.3.2 :(. At least the
assertion failure seems actually unrelated; I am not sure why it's
9.3.2-only. Alvaro, are you looking?

> That was the only concrete runtime problem I found during a study of the
> newest heap_freeze_tuple() and heap_tuple_needs_freeze() code.

I'd even be interested in fuzzy problems ;). If 9.3 hadn't been
released, the interactions between cutoff_xid/multi would have caused me
to say "back to the drawing board"... I wouldn't be surprised if further
things are lurking there.

>  One thing that
> leaves me unsure is the fact that vacuum_set_xid_limits() does no locking to
> ensure a consistent result between GetOldestXmin() and GetOldestMultiXactId().
> Transactions may start or end between those calls, making the
> GetOldestMultiXactId() result represent a later set of transactions than the
> GetOldestXmin() result.  I suspect that's fine.  New transactions have no
> immediate effect on either cutoff, and transaction end can only increase a
> cutoff.  Using a slightly-lower cutoff than the maximum safe cutoff is always
> okay; consider vacuum_defer_cleanup_age.

Yes, that seems fine to me, with the same reasoning.
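That monotonicity argument can be illustrated with a toy model. This is
plain Python, not PostgreSQL code; `oldest_running` here is a made-up
stand-in for what GetOldestXmin() / GetOldestMultiXactId() compute, just
to show why two unlocked reads at different times are at worst
conservative:

```python
# Toy model of the argument above: the xid and multixact cutoffs are
# read at different times with no lock, so transactions may end in
# between.  Because a transaction ending can only raise a cutoff
# (never lower it), the unlocked pair is at worst slightly
# conservative -- equivalent to having read both cutoffs atomically
# at the earlier moment.

def oldest_running(running_xids, next_xid):
    # Cutoff is the oldest still-running xid, or next_xid if none run.
    return min(running_xids) if running_xids else next_xid

# First unlocked read, while xids 10 and 12 are running.
running = {10, 12}
earlier_cutoff = oldest_running(running, next_xid=15)

# A transaction (xid 10) ends between the two unlocked reads.
running.discard(10)

# Second unlocked read, made after the transaction ended.
later_cutoff = oldest_running(running, next_xid=15)

# The later read can only be >= the earlier one, so acting on the
# earlier, lower cutoff is merely conservative, never wrong.
assert later_cutoff >= earlier_cutoff
```

The same reasoning is what makes vacuum_defer_cleanup_age safe: it
deliberately uses a cutoff lower than the maximum safe one.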

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

