On Mon, Apr 4, 2016 at 10:59 AM, Craig Ringer <cr...@2ndquadrant.com> wrote:
> To allow logical rep and failover to be a reasonable substitute for
> physical rep and failover we IMO *need*:
>
> * Robust sequence decoding and replication. If you were following the
> later parts of that discussion you'll have seen how fun that's going to
> be, but it's the simplest of all of the problems.
>
> * Logical decoding and sending of in-progress xacts, so the logical
> client can already be most of the way through receiving a big xact when
> it commits. Without this we have a huge lag spike whenever a big xact
> happens, since we must first finish decoding it into a reorder buffer
> and can only then *begin* to send it to the client. During that time no
> later xacts may be decoded or replayed to the client. If you're running
> that rare thing, the perfect pure OLTP system, you won't care... but
> good luck finding one in the real world.
>
> * Either parallel apply on the client side or at least buffering of
> in-progress xacts on the client side so they can be safely flushed to
> disk and confirmed, allowing receive to continue while replay is done
> on the client. Otherwise sync rep is completely impractical... and
> there's no shortage of systems out there that can't afford to lose any
> transactions. Or at least have some crucial transactions they can't
> lose.
>
> * Robust, seamless DDL replication, so things don't just break
> randomly. This makes the other points above look nice and simple by
> comparison. Logical decoding of 2PC xacts with DDL would help here, as
> would the ability to transparently convert an xact into a prepared xact
> on client commit and hold the client waiting while we replicate it,
> confirm the successful prepare on the replica, then commit prepared on
> the upstream.
>
> * Oh, and some way to handle replication of shared catalog changes
> like pg_authid, so the above DDL replication doesn't just randomly
> break if it happens to refer to a global object that doesn't exist on
> the downstream.
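[Editor's sketch] The lag-spike problem in the second bullet above can be seen directly with the test_decoding example plugin shipped with PostgreSQL: nothing from a transaction reaches a decoding client until after it commits. A minimal illustration, assuming a pre-existing table (the slot and table names here are invented):

```sql
-- Create a logical decoding slot using the test_decoding example plugin.
SELECT pg_create_logical_replication_slot('demo_slot', 'test_decoding');

-- A large transaction: none of its changes are visible to the slot yet.
BEGIN;
INSERT INTO big_table SELECT generate_series(1, 1000000);
COMMIT;

-- Only now, after COMMIT, is the whole xact emitted at once from the
-- reorder buffer -- the serialization point Craig describes above.
SELECT data FROM pg_logical_slot_get_changes('demo_slot', NULL, NULL);
```

Streaming in-progress xacts would let the rows flow to the client while the INSERT is still running, instead of all at once after COMMIT.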
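[Editor's sketch] The prepare-then-commit dance in the DDL bullet maps onto PostgreSQL's existing two-phase commit commands. Shown here as explicit statements, though the proposal is for the server to perform this conversion transparently when the client issues COMMIT (the table name and GID are invented; max_prepared_transactions must be nonzero):

```sql
-- Upstream: a DDL-bearing xact is prepared instead of committed outright.
BEGIN;
ALTER TABLE accounts ADD COLUMN note text;
PREPARE TRANSACTION 'ddl_gid_1';

-- The prepared xact is logically decoded and replayed on the replica;
-- the upstream holds the client while waiting for the replica to
-- confirm a successful prepare.

-- Upstream: only after that confirmation does the commit complete.
COMMIT PREPARED 'ddl_gid_1';
```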
In general, I think we'd be a lot better off if we got some kind of logical replication into core first and then worked on lifting these types of limitations afterwards. If I had to pick an order in which to do the things you list, I'd focus first on the one you list second: being able to stream and begin applying transactions before they've committed is a really big deal for large transactions, and lots of people have some large transactions. DDL replication is nice, but realistically, there are a lot of people who simply don't change their schema all that often, and who could (and might even prefer to) manage that process in other ways - e.g. change nodes one by one while they are off-line, then bring them on-line.

I don't necessarily disagree with your statement that we'd need all of this stuff to make logical replication a substitute for physical replication as far as failover is concerned. But I don't think that's the most important goal, and even to the extent that it is the goal, I don't think we need to meet every need before we can consider ourselves to have met some needs. I don't think that we need every physical replication feature plus some before logical replication can start to be useful to PostgreSQL users generally.

We do, however, need the functionality to be accessible to people who are using only the PostgreSQL core distribution. The thing that is going to get people excited about making logical replication better is getting to a point where they can use it at all - and that is not going to be true as long as you can't use it without having to download something from an external website.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers