Re: [HACKERS] Timeline following for logical slots

Oleksii Kliukin Tue, 05 Apr 2016 02:41:02 -0700

> On 05 Apr 2016, at 09:51, Craig Ringer <cr...@2ndquadrant.com> wrote:
> 
> On 5 April 2016 at 04:00, Robert Haas <robertmh...@gmail.com> wrote:
> 
> In general, I think we'd be a lot better off if we got some kind of
> logical replication into core first and then worked on lifting these
> types of limitations afterwards.
> 
> First, I'd like to remind everyone that logical decoding is useful for more 
> than replication. You can consume the change stream for audit 
> logging/archival, to feed into a different DBMS, etc etc. This is not just 
> about replicating from one PostgreSQL to another, though to be sure that'll 
> be the main use in the near term.
> 
> The Zalando guys at least are already using it for other things, and interest 
> in the json support suggests they're not alone.


We are actually interested in both the streaming part and the logical 
replication currently provided by BDR. The reason we cannot use BDR 
efficiently is that there is no way to provide HA for a BDR node using 
physical replication, meaning that we have to run 2x nodes, with each node 
required to communicate with every other node. Since our only use case is 
running things in multiple data centers with latencies of 100ms and above, 
running without physical HA limits us to only a couple of nodes, with a 
manual repair mechanism.

The other use case we have is streaming data from many, sometimes rather big 
(TBs), databases to consumers interested in storing subsets of that data in 
order to run analytical queries on them. It’s hard to imagine a robust system 
like this built around a feature that doesn't support failover between the 
master and a physical replica, forcing us to stream again a set of selected 
tables at best, and the complete database at worst (did I mention the 
terabytes already?), and, potentially, to do some very funny tricks to merge 
the newly streamed data with something that is already there.


> 
> Right now if you're doing any kind of logical decoding from a master server 
> that fails over to a standby the client just dies. The slot vanishes. You're 
> stuffed. Gee, I hope you didn't need all that nice consistent ordering, 
> because you're going to have to start from scratch and somehow reconcile the 
> data in the new master with what you've already received ... and haven’t.

+1

> 
> We could certainly require clients to jump through all sorts of extra hoops 
> to make sure they can follow replay over physical failover. Or we could say 
> that you shouldn't actually expect to use logical decoding in real world 
> environments where HA is a necessity until we get around to supporting 
> realistic, usable logical-rep based failover in a few years.

And in less than a few years, your typical big organization might decide to 
move on to some other data storage solution that promises HA right now, at the 
expense of SQL, strong consistency, and transactions. It would be a bad 
choice, but one that developers (especially those looking to build 
“scalable micro services” with only a couple of CRUD queries) will be willing 
to make.


> Or we could make it "just work" for the physical failover systems everyone 
> already uses and relies on, just like sequences, indexes, and everything else 
> in PostgreSQL that's expected to survive failover.
> 

--
Oleksii
