On 21 February 2016 at 23:18, Thomas Munro <thomas.mu...@enterprisedb.com> wrote:
> On Mon, Feb 22, 2016 at 2:10 AM, Thom Brown <t...@linux.com> wrote:
>> On 3 February 2016 at 10:46, Thomas Munro <thomas.mu...@enterprisedb.com>
>> wrote:
>>> On Wed, Feb 3, 2016 at 10:59 PM, Amit Langote
>>> <langote_amit...@lab.ntt.co.jp> wrote:
>>>> There seems to be a copy-pasto there - shouldn't that be:
>>>>
>>>> + if (walsndctl->lsn[SYNC_REP_WAIT_FLUSH] < MyWalSnd->flush)
>>>
>>> Indeed, thanks!  New patch attached.
>>
>> I've given this a test drive, and it works exactly as described.
>
> Thanks for trying it out!
>
>> But one thing which confuses me is when a standby, with causal_reads
>> enabled, has just finished starting up.  I can't connect to it
>> because:
>>
>> FATAL: standby is not available for causal reads
>>
>> However, this is the same message when I'm successfully connected, but
>> it's lagging, and the primary is still waiting for the standby to
>> catch up:
>>
>> ERROR: standby is not available for causal reads
>>
>> What is the difference here?  The problem being reported appears to be
>> identical, but in the first case I can't connect, but in the second
>> case I can (although I still can't issue queries).
>
> Right, you get the error at login before it has managed to connect to
> the primary, and for a short time after while it's in 'joining' state,
> or potentially longer if there is a backlog of WAL to apply.
>
> The reason is that when causal_reads = on in postgresql.conf (as
> opposed to being set for an individual session or role), causal reads
> logic is used for snapshots taken during authentication (in fact the
> error is generated when trying to take a snapshot slightly before
> authentication proper begins, in InitPostgres).  I think that's a
> desirable feature: if you have causal reads on and you create/alter a
> database/role (for example setting a new password) and commit, and
> then you immediately try to connect to that database/role on a standby
> where you have causal reads enabled system-wide, then you get the
> causal reads guarantee during authentication: you either see the
> effects of your earlier transaction or you see the error.  As you have
> discovered, there is a small window after a standby comes up where it
> will show the error because it hasn't got a lease yet so it can't let
> you log in yet because it could be seeing a stale catalog (your user
> may not exist on the standby yet, or have been altered in some way, or
> your database may not exist yet, etc).
>
> Does that make sense?

Ah, all clear.  Yes, that makes sense now.  I've been trying to break
it for the past few days, and this was the only thing which I wasn't
clear on.  The parameters all work as described.

The replay_lag is particularly cool.  I didn't think it was possible to
glean this information on the primary, but the timings are correct in
my tests.

+1 for this patch.  Looks like this solves the problem that
semi-synchronous replication tries to solve, although arguably in a
more sensible way.

Thom