On Tue, Apr 24, 2012 at 09:08:36AM +0100, Simon Riggs wrote: > On Sat, Apr 21, 2012 at 5:52 PM, Noah Misch <n...@leadboat.com> wrote: > > The fix is to compare the stored XID to RecentGlobalXmin, not RecentXmin. > > ?We > > already use RecentGlobalXmin when wal_level = hot_standby. ?If no running > > transaction has an XID and all running transactions began since the last > > transaction that did bear an XID, RecentGlobalXmin == > > ReadNewTransactionId(). > > Therefore, the correct test is btpo.xact < RecentGlobalXmin, not btpo.xact > > <= > > RecentGlobalXmin as we have today. ?This also cleanly removes the need for > > the > > bit of code in _bt_getbuf() that decrements btpo.xact before sending it down > > for ResolveRecoveryConflictWithSnapshot(). ?I suggested[2] that decrement on > > an unprincipled basis; it was just masking the off-by-one of using "<= > > RecentGlobalXmin" instead of "< RecentGlobalXmin" in _bt_page_recyclable(). > > Looks like the right fix. I'll apply this to 9.0/9.1/HEAD.
Thanks. 8.3 and 8.4 need it, too, albeit simplified due to some of the affected code not existing yet. > > This change makes empty B-tree pages wait through two generations of running > > transactions before reuse, so some additional bloat will arise. > > Could arise, in some circumstances. But that assumes VACUUMs are > fairly frequent and that they would be delayed/rendered less effective > by this, which I don't think will be the case. > > I note that we don't take any account of the number of pages that may > be reused when we VACUUM, so when HOT avoids a VACUUM we may > accumulate pages for a longer period. Looks like there is more work to > be done yet in cleaning indexes. Agreed. Taking one VACUUM visit to mark the page BTP_DELETED and another to add it to the FSM is fairly pessimistic; perhaps we can do better. nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers