On Thu, Jun 16, 2011 at 12:02:47AM +0100, Simon Riggs wrote:
> On Tue, Jun 14, 2011 at 5:28 AM, Noah Misch <n...@leadboat.com> wrote:
> > On Mon, Jun 13, 2011 at 04:16:06PM +0100, Simon Riggs wrote:
> >> On Mon, Jun 13, 2011 at 3:11 AM, Robert Haas <robertmh...@gmail.com> wrote:
> >> > On Sun, Jun 12, 2011 at 3:01 PM, Noah Misch <n...@leadboat.com> wrote:
> >> >> Assuming that conclusion, I do think it's worth starting
> >> >> with something simple, even if it means additional bloat on the master in the
> >> >> wal_level=hot_standby + vacuum_defer_cleanup_age / hot_standby_feedback case.
> >> >> In choosing those settings, the administrator has taken constructive steps to
> >> >> accept master-side bloat in exchange for delaying recovery conflict.  What's
> >> >> your opinion?
> >> >
> >> > I'm pretty disinclined to go tinkering with 9.1 at this point, too.
> >>
> >> Not least because a feature already exists in 9.1 to cope with this
> >> problem: hot standby feedback.
> >
> > A standby's receipt of an XLOG_BTREE_REUSE_PAGE record implies that the
> > accompanying latestRemovedXid preceded or equaled the master's RecentXmin at the
> > time of issue (see _bt_page_recyclable()).  Neither hot_standby_feedback nor
> > vacuum_defer_cleanup_age affect RecentXmin.  Therefore, neither facility delays
> > conflicts arising directly from B-tree page reuse.  See attached test script,
> > which yields a snapshot conflict despite active hot_standby_feedback.
>
> OK, agreed. Bug. Good catch, Noah.
>
> Fix is to use RecentGlobalXmin for the cutoff when in Hot Standby
> mode, so that it is under user control.
>
> Attached patch will be applied to head and backpatched to 9.1 and 9.0
> to fix this.

Thanks.  We still hit a conflict when btpo.xact == RecentGlobalXmin and the
standby has a transaction older than any master transaction.  This happens
because the tests at nbtpage.c:704 and procarray.c:1843 both pass when the xid
is exactly that of the oldest standby transaction (line numbers as of git
cb94db91b).  I only know this because the test script from my last message hits
this case; it might never get hit in real usage.  Still, it seems like a hole
not worth leaving.

I think the most-correct fix is to TransactionIdRetreat the btpo.xact before
using it as xl_btree_reuse_page.latestRemovedXid.  btpo.xact is the first
known-safe xid, but latestRemovedXid is the last known-unsafe xmin.

nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
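
For illustration, the boundary case described in the message above can be modeled in a few lines of standalone C.  This is only a sketch under stated assumptions, not the code at the cited line numbers: it assumes the master-side test is inclusive ("btpo.xact precedes or equals the cutoff") and the standby-side test is inclusive as well ("snapshot xmin precedes or equals latestRemovedXid"), which is what "both pass" on an exact match suggests.  All names here (Xid, xid_retreat, page_recyclable, snapshot_conflicts, reuse_conflicts) are hypothetical stand-ins, not PostgreSQL symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;               /* hypothetical stand-in for TransactionId */

#define FIRST_NORMAL_XID ((Xid) 3)  /* smaller xid values are reserved */

/* Wraparound-aware "a <= b" for normal xids (modulo-2^32 comparison). */
static bool
xid_precedes_or_equals(Xid a, Xid b)
{
    return (int32_t) (Xid) (a - b) <= 0;
}

/* Step back one xid, skipping the reserved values on wraparound. */
static Xid
xid_retreat(Xid xid)
{
    assert(xid >= FIRST_NORMAL_XID);
    do
        xid--;
    while (xid < FIRST_NORMAL_XID);
    return xid;
}

/* Master side (assumed form): page is reusable once btpo_xact <= cutoff. */
static bool
page_recyclable(Xid btpo_xact, Xid cutoff)
{
    return xid_precedes_or_equals(btpo_xact, cutoff);
}

/* Standby side (assumed form): conflict when xmin <= latestRemovedXid. */
static bool
snapshot_conflicts(Xid snapshot_xmin, Xid latest_removed_xid)
{
    return xid_precedes_or_equals(snapshot_xmin, latest_removed_xid);
}

/*
 * Does reusing the page cancel a standby snapshot with the given xmin?
 * With retreat == false, btpo_xact is logged as latestRemovedXid unchanged;
 * with retreat == true, it is stepped back by one xid first.
 */
static bool
reuse_conflicts(Xid btpo_xact, Xid cutoff, Xid standby_xmin, bool retreat)
{
    Xid latest_removed;

    if (!page_recyclable(btpo_xact, cutoff))
        return false;           /* page not reused, so no record is issued */

    latest_removed = retreat ? xid_retreat(btpo_xact) : btpo_xact;
    return snapshot_conflicts(standby_xmin, latest_removed);
}
```

In this model, reuse_conflicts(1000, 1000, 1000, false) reports a conflict because both inclusive tests match on the same xid, while reuse_conflicts(1000, 1000, 1000, true) does not.  A standby snapshot with a strictly older xmin, such as 999, still conflicts after the retreat.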