I'm looking at the most recent version of the Hot Standby patch in Robert Haas' git repository. The conflict cache code is broken:

> +void
> +SetDeferredRecoveryConflicts(TransactionId latestRemovedXid, RelFileNode node,
> +                             XLogRecPtr conflict_lsn)
> +{
> +    ProcArrayStruct *arrayP = procArray;
> +    int         index;
> +    Oid         dbOid = node.dbNode;
> +
> +    Assert(InRecovery);
> +
> +    if (!LatestRemovedXidAdvances(latestRemovedXid))
> +        return;
> +

The idea of LatestRemovedXidAdvances() is to exit quickly when we're called with a latestRemovedXid value <= the previous latestRemovedXid value. However, the conflict caches store information per relation. If you first call e.g. "SetDeferredRecoveryConflicts(1000, 'rel_A', 1234)", followed by "SetDeferredRecoveryConflicts(1000, 'rel_B', 1234)", the latter call exits quickly, without recording anything for rel_B. If a transaction that holds a "too old" snapshot then accesses rel_B, it won't fail as it should.

Something else must be severely broken in the conflict resolution code as well: while testing with just one tiny table, I can easily reproduce a violation of serializable snapshot:

postgres=# begin ISOLATION LEVEL serializable;
BEGIN
postgres=# SELECT * FROM foo;
 id
-----
 101
 102
(2 rows)

(In master: UPDATE foo SET id = id + 10; VACUUM foo; SELECT pg_xlog_switch())

postgres=# SELECT * FROM foo;
 id
----
(0 rows)

And it looks like the recovery cache is not reset properly: when I start a new backend after one that just got a "canceling statement due to recent buffer changes during recovery" error, and run a query, I get that error again:

psql (8.5devel)
Type "help" for help.

postgres=# SELECT * FROM foo;
postgres=# begin ISOLATION LEVEL serializable;
BEGIN
postgres=# SELECT * FROM foo;
ERROR: canceling statement due to recent buffer changes during recovery

I haven't dug deeper into those, but before I do, I want to ask if we really need to bother with a per-relation conflict cache at all? I'd really like to keep it simple for now, and tracking the conflicts per-relation only alleviates the situation somewhat.
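
For reference, the early-exit problem with SetDeferredRecoveryConflicts() described above can be shown with a small standalone model. This is only a sketch, not code from the patch: ConflictModel, record_conflict_global_exit(), record_conflict_per_rel() and snapshot_conflicts() are hypothetical names, relations are plain array indexes, xids are compared as ordinary integers (no wraparound handling), and the conflict test is a simplification of whatever the real cache lookup does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names come from the patch. */
typedef uint32_t Xid;               /* xid wraparound is ignored in this model */

#define MAX_RELS 4

typedef struct ConflictModel
{
    Xid global_latest_removed;          /* high-water mark over all relations */
    Xid rel_latest_removed[MAX_RELS];   /* per-relation cache, 0 = nothing recorded */
} ConflictModel;

static void
model_init(ConflictModel *m)
{
    m->global_latest_removed = 0;
    for (int i = 0; i < MAX_RELS; i++)
        m->rel_latest_removed[i] = 0;
}

/*
 * Models the early exit: return at once unless the global mark advances.
 * A second relation reported with the same xid is silently skipped.
 */
static void
record_conflict_global_exit(ConflictModel *m, Xid removed, int rel)
{
    assert(rel >= 0 && rel < MAX_RELS);
    if (removed <= m->global_latest_removed)
        return;                         /* rel's cache entry is never touched */
    m->global_latest_removed = removed;
    m->rel_latest_removed[rel] = removed;
}

/* One possible alternative: apply the "advances" test per relation. */
static void
record_conflict_per_rel(ConflictModel *m, Xid removed, int rel)
{
    assert(rel >= 0 && rel < MAX_RELS);
    if (removed > m->global_latest_removed)
        m->global_latest_removed = removed;
    if (removed > m->rel_latest_removed[rel])
        m->rel_latest_removed[rel] = removed;
}

/*
 * Simplified conflict test: a snapshot conflicts on rel if an xid at or
 * after its xmin has had rows removed from that relation.
 */
static bool
snapshot_conflicts(const ConflictModel *m, Xid snapshot_xmin, int rel)
{
    assert(rel >= 0 && rel < MAX_RELS);
    return m->rel_latest_removed[rel] >= snapshot_xmin;
}
```

In this model, reporting xid 1000 for relation 0 and then xid 1000 for relation 1 through record_conflict_global_exit() leaves relation 1 with nothing recorded, so a snapshot with xmin 900 passes the check on relation 1. With record_conflict_per_rel() the same sequence flags the snapshot on both relations.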
The nature of the cache is such that it's quite unpredictable to a regular user when it will save you, so you can't rely on it. You need to set max_standby_delay and/or other such settings correctly anyway, so it doesn't really help with usability.

Another thing: I'm quite surprised to see that the logic in WAL redo that stops the redo and waits for read-only queries to finish before applying a conflicting WAL record (one that would cause a read-only query to be killed) is only used with a few WAL record types, like database or tablespace creation, and not with the usual VACUUM records. I was under the impression that the max_standby_delay option and logic would apply to all operations.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com