On Wed, Dec 28, 2016 at 4:21 PM, Craig Ringer <cr...@2ndquadrant.com> wrote:
> On 28 December 2016 at 08:14, Thomas Munro
> <thomas.mu...@enterprisedb.com> wrote:
>> 3.  No server must allow a transaction to be visible that hasn't been
>> flushed on N standby servers.  We already prevent that on the primary
> Only if the primary doesn't restart. We don't persist the xact masking
> used by sync rep at the moment.

Right.  Maybe you could fix that gap by making the primary wait until
the rule in synchronous_standby_names would be satisfied by the most
conservative possible synchronous_commit level (remote_apply) after
recovery and before allowing any queries to run?
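
To make that concrete, here is a rough sketch of the gate I have in mind. It is only a model, not PostgreSQL code: the function name, the use of plain integers for LSNs, and the simplification of the synchronous_standby_names rule to "any n standbys" are all assumptions made for illustration.

```python
def primary_may_accept_queries(end_of_recovery_lsn, standby_apply_lsns, n):
    """Return True once at least n standbys have applied WAL up to the
    primary's end-of-recovery position (i.e. the remote_apply condition).

    Hypothetical model: LSNs are plain integers, and the rule in
    synchronous_standby_names is reduced to "any n standbys".
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    caught_up = sum(1 for lsn in standby_apply_lsns if lsn >= end_of_recovery_lsn)
    return caught_up >= n


# Two of three standbys have applied up to LSN 1000, so with n = 2
# the primary could start running queries.
ready = primary_may_accept_queries(1000, [1000, 1200, 900], 2)
```

The point of using remote_apply here is that it implies the weaker levels, so whatever level the pre-restart transactions were committed with, the check covers it.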

> I suspect that solving that is probably tied to solving it on standbys.

Hmm.  I was imagining that for standbys it might involve extra
messages flowing from the primary carrying the consensus write and
flush LSN locations (since there isn't any other kind of inter-node
communication possible, today), and then somehow teaching the standby
to see only the transactions whose commit record is <= the latest
consensus commit LSN (precisely, and no more!) when acquiring a
snapshot if K is > 2 and N > 1 and you have synchronous_commit set to
a level >= remote_write on the standby.  That could be done by simply
waiting for the consensus write or flush LSN (as appropriate) to be
applied before taking a snapshot, but aside from complicated
interlocking requirements, that would slow down snapshot acquisition
unacceptably on write-heavy systems.  Another way to do it could be to
maintain separate versions of the snapshot data somehow for each
synchronous_commit level on standbys, so that you can quickly get your
hands on a snapshot that can only see xids whose commit record was <=
consensus write or flush location as appropriate.  That interacts
weirdly with synchronous_commit = remote_apply on the primary though
because (assuming we want it to be useful) it needs to wait until the
LSN is applied on the standby(s) *and* they can see it in this weird
new time-delayed snapshot thing; perhaps it would require a new level
remote_apply_consensus_flush which waits for the standby(s) to apply
and also know that the transaction has been flushed on enough
nodes to allow it to be seen...
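
As a rough illustration of the visibility rule described above (not a proposal for how the snapshot code would actually do it), the consensus LSN can be modelled as the highest position that at least N standbys have reported, and a snapshot then sees only commits at or below it. The names Commit, consensus_lsn and visible_xids are made up for this sketch, and LSNs are plain integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    xid: int
    commit_lsn: int  # position of the commit record in the WAL (modelled as an int)


def consensus_lsn(reported_lsns, n):
    """Highest LSN that at least n standbys have reached (written or
    flushed, depending on which positions are passed in).
    Returns None if fewer than n standbys have reported."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if len(reported_lsns) < n:
        return None
    # The n-th largest reported position is reached by n standbys.
    return sorted(reported_lsns, reverse=True)[n - 1]


def visible_xids(commits, consensus):
    """xids whose commit record is <= the consensus LSN, and no more."""
    if consensus is None:
        return set()
    return {c.xid for c in commits if c.commit_lsn <= consensus}


# Three standbys report flush positions; with n = 2 the consensus is 180,
# so the commit at LSN 200 must stay invisible for now.
flush_consensus = consensus_lsn([100, 250, 180], 2)
commits = [Commit(1, 90), Commit(2, 180), Commit(3, 200)]
seen = visible_xids(commits, flush_consensus)
```

Keeping one such filtered view per synchronous_commit level would correspond to calling consensus_lsn once with the write positions and once with the flush positions.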

Thomas Munro

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)