Andres Freund wrote:

> Instead of calculating the multixact cutoff xid by using the global
> minimum of OldestMemberMXactId[] and OldestVisibleMXactId[] and then
> subtracting vacuum_freeze_min_age compute it solely as the minimum of
> OldestMemberMXactId[]. If we do that computation *after* doing the
> GetOldestXmin() in vacuum_set_xid_limits() we can be sure no mxact above
> the new mxact cutoff will contain a xid below the xid cutoff. This is so
> since it would otherwise have been reported as running by
> GetOldestXmin().
> With that change we can leave heap_tuple_needs_freeze() and
> heap_freeze_tuple() unchanged since using the mxact cutoff is
> sufficient.

Some thoughts here:

1. Using vacuum_freeze_min_age was clearly a poor choice.  XIDs are
normally consumed much faster than multis, so that value is usually far
too large.  In your reported case (per IM discussion), the customer is
approaching 4 billion Xids but is still at 15 million multixids; so the
relminmxid is still 1, because the default freeze_min_age is 50 million
... so at their current rate, they will wrap around the Xid counter 3-4
times before seeing this minmxid value advance at all.
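
To spell out that arithmetic, here is a back-of-the-envelope sketch in
Python.  The inputs are the approximate figures quoted above; the
variable names are mine and purely illustrative, and it assumes the
ratio of Xid to multixact consumption stays constant.

```python
# Rough estimate only; all inputs are the approximate figures from point 1.
XID_SPACE = 2**32               # size of the 32-bit Xid counter
xids_consumed = 4_000_000_000   # "approaching 4 billion Xids"
multis_consumed = 15_000_000    # "still at 15 million multixids"
freeze_min_age = 50_000_000     # default vacuum_freeze_min_age

# How many Xids are burned for every multixact created (assumed constant).
xids_per_multi = xids_consumed / multis_consumed

# Xids consumed in total by the time the multixact counter reaches
# freeze_min_age, i.e. the earliest point the cutoff can move past 1.
xids_needed = freeze_min_age * xids_per_multi

# Express that as a number of trips around the Xid counter.
wraparounds = xids_needed / XID_SPACE

print(f"{xids_per_multi:.0f} Xids per multi, "
      f"about {wraparounds:.1f} Xid wraparounds before relminmxid moves")
```

The result lands a little above three trips around the Xid counter,
which is where the 3-4 figure comes from.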

2. Freezing too much has the disadvantage that you lose info possibly
useful for forensics.  And I believe that freezing multis just after
they have gone below the immediate visibility horizon will make them
far too short-lived.  Now the performance guys are always saying how
they would like tuples to even start life frozen, rather than waiting
any number of transactions before being frozen; but to help the case
for those who investigate and fix corrupted databases, we need a higher
freeze horizon.  Heck, maybe even 100k multis would be enough to keep
enough evidence to track bugs down.  I propose we keep at least a
million.  This is an even more important argument currently, given how
buggy the current multixact code has proven to be.

2a. Freezing less also means less thrashing ...
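
As a rough illustration of the kind of cutoff I have in mind in point 2,
here is a minimal sketch in Python.  It is not the actual
vacuum_set_xid_limits() code: multixact_cutoff() and its parameters are
hypothetical names, the list stands in for OldestMemberMXactId[], and
multixact wraparound is deliberately ignored to keep it short.

```python
FIRST_MULTIXACT_ID = 1  # same value as PostgreSQL's FirstMultiXactId


def multixact_cutoff(oldest_member_ids, next_multi, keep=1_000_000):
    """Hypothetical cutoff: oldest multi still in use, minus a safety margin.

    oldest_member_ids: per-backend oldest multis that may still have
        running members (stand-in for the valid OldestMemberMXactId[] slots).
    next_multi: the next multixact id to be assigned, used when no backend
        has a value set.
    keep: how many multis to leave unfrozen for forensics.

    Wraparound is NOT handled; ids are treated as plain integers.
    """
    horizon = min(oldest_member_ids, default=next_multi)
    # Never go below the first valid multixact id.
    return max(FIRST_MULTIXACT_ID, horizon - keep)


# With 15 million multis consumed and a one-million margin, multis below
# roughly 14 million become eligible for freezing.
print(multixact_cutoff([15_000_000, 14_999_990], next_multi=15_000_100))

# On a young cluster the margin exceeds the counter, so nothing is frozen.
print(multixact_cutoff([], next_multi=500))
```

The only design point here is the "keep" margin: it is counted in
multis rather than Xids, so it does not inherit the problem described in
point 1.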

3. I'm not sure I understand how the proposal above fixes things during
recovery.  If we keep the multi values above the freeze horizon you
propose above, are we certain no old Xid values will remain?

4. Maybe it would be useful to emit a more verbose freezing record in
HEAD, even if we introduce some dirty ugly hack in 9.3 to avoid having
to change WAL format.

4a. Maybe we can introduce a new WAL record in 9.3 anyway and tell
people to always upgrade the replicas before the masters.  (I think we
did this in the past once.)

3 and 4 in combination: maybe we can change 9.3 to not have any
breathing room for freezing, to fix the current swarm of bugs without
having to change WAL format, and do something more invasive in HEAD to
keep more multis around for forensics.

5. The new multixact stuff seems way too buggy.  Should we rip it all
out and return to the old tuple locking scheme?  We spent a huge amount
of time writing it and reviewing it, and now maintaining it, but I
haven't seen a *single* performance report saying how awesome 9.3 is
compared to older releases due to this change; the 9.3 request for
testing, at the start of the beta period, didn't even mention trying it
out *at all*.

Álvaro Herrera      
PostgreSQL Development, 24x7 Support, Training & Services
