Re: [HACKERS] Proposal: "Causal reads" mode for load balancing reads without stale data

Thom Brown Sun, 21 Feb 2016 16:41:08 -0800

On 21 February 2016 at 23:18, Thomas Munro
<thomas.mu...@enterprisedb.com> wrote:
> On Mon, Feb 22, 2016 at 2:10 AM, Thom Brown <t...@linux.com> wrote:
>> On 3 February 2016 at 10:46, Thomas Munro <thomas.mu...@enterprisedb.com> 
>> wrote:
>>> On Wed, Feb 3, 2016 at 10:59 PM, Amit Langote
>>> <langote_amit...@lab.ntt.co.jp> wrote:
>>>> There seems to be a copy-pasto there - shouldn't that be:
>>>>
>>>> + if (walsndctl->lsn[SYNC_REP_WAIT_FLUSH] < MyWalSnd->flush)
>>>
>>> Indeed, thanks!  New patch attached.
>>
>> I've given this a test drive, and it works exactly as described.
>
> Thanks for trying it out!
>
>> But one thing which confuses me is when a standby, with causal_reads
>> enabled, has just finished starting up.  I can't connect to it
>> because:
>>
>> FATAL:  standby is not available for causal reads
>>
>> However, this is the same message when I'm successfully connected, but
>> it's lagging, and the primary is still waiting for the standby to
>> catch up:
>>
>> ERROR:  standby is not available for causal reads
>>
>> What is the difference here?  The problem being reported appears to be
>> identical, but in the first case I can't connect, but in the second
>> case I can (although I still can't issue queries).
>
> Right, you get the error at login before it has managed to connect to
> the primary, and for a short time after while it's in 'joining' state,
> or potentially longer if there is a backlog of WAL to apply.
>
> The reason is that when causal_reads = on in postgresql.conf (as
> opposed to being set for an individual session or role), causal reads
> logic is used for snapshots taken during authentication (in fact the
> error is generated when trying to take a snapshot slightly before
> authentication proper begins, in InitPostgres).  I think that's a
> desirable feature: if you have causal reads on and you create/alter a
> database/role (for example setting a new password) and commit, and
> then you immediately try to connect to that database/role on a standby
> where you have causal reads enabled system-wide, then you get the
> causal reads guarantee during authentication: you either see the
> effects of your earlier transaction or you see the error.  As you have
> discovered, there is a small window after a standby comes up where it
> will show the error because it hasn't got a lease yet so it can't let
> you log in yet because it could be seeing a stale catalog (your user
> may not exist on the standby yet, or have been altered in some way, or
> your database may not exist yet, etc).
>
> Does that make sense?


Ah, alles klar.  Yes, that makes sense now.  I've been trying to break
it the past few days, and this was the only thing which I wasn't clear
on.  The parameters all work as described

The replay_lag is particularly cool.  Didn't think it was possible to
glean this information on the primary, but the timings are correct in
my tests.

+1 for this patch.  Looks like this solves the problem that
semi-synchronous replication tries to solve, although arguably in a
more sensible way.

Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: "Causal reads" mode for load balancing reads without stale data

Reply via email to