Re: [HACKERS] Logical replication and multimaster

Simon Riggs Thu, 03 Dec 2015 04:40:10 -0800

On 30 November 2015 at 17:20, Konstantin Knizhnik <[email protected]> wrote:

> But it looks like there is not much sense in having multiple network
> connections between one pair of nodes.
> It seems better to have one connection between nodes, but provide
> parallel execution of received transactions on the destination side. But
> that also seems to be nontrivial. PostgreSQL now has some infrastructure
> for background workers, but there is still no abstraction of a worker pool
> and job queue that would provide a simple way to organize parallel
> execution of some jobs. I wonder if somebody is working on it now, or
> should we try to propose our solution?
>

There are definitely two clear places where additional help would be useful
and welcome right now.

1. Allowing logical decoding to have a "speculative pre-commit data"
option, so that some data can be made available via the decoding API and
transferred prior to commit. This would allow us to reduce the delay that
occurs at commit, especially for larger transactions or for smaller
transactions with very low latency requirements. Some heuristic or user
interface would be required to decide whether, and for which transactions,
data is made available prior to commit. And we would need to send abort
messages should the transactions not commit as expected. That would be a
patch on logical decoding and is essentially a separate feature from
anything currently being developed.
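
As a rough illustration of point 1, here is a minimal Python sketch of what
the receiving side might do with data streamed before commit. It is not
PostgreSQL code: the class SpeculativeApplier and its on_change, on_commit
and on_abort methods are hypothetical names invented for this example. It
only shows that speculative changes are held per transaction and then
either published on commit or thrown away when an abort message arrives.

```python
from collections import defaultdict


class SpeculativeApplier:
    """Receiver-side buffer for changes streamed before commit (hypothetical)."""

    def __init__(self):
        self.pending = defaultdict(list)  # xid -> changes received so far
        self.applied = []                 # changes made visible, in commit order

    def on_change(self, xid, change):
        # Data arrives before the sender knows the transaction's outcome.
        self.pending[xid].append(change)

    def on_commit(self, xid):
        # Only the small commit message is left to wait for at commit time.
        self.applied.extend(self.pending.pop(xid, []))

    def on_abort(self, xid):
        # The abort message tells the receiver to discard speculative data.
        self.pending.pop(xid, None)


applier = SpeculativeApplier()
applier.on_change(100, "INSERT a")
applier.on_change(101, "INSERT b")
applier.on_change(100, "UPDATE a")
applier.on_abort(101)
applier.on_commit(100)
print(applier.applied)  # ['INSERT a', 'UPDATE a']
```

The point of the sketch is the latency argument: by the time the commit
message for transaction 100 arrives, its data has already been transferred.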

2. Some mechanism/theory to decide when/if to allow parallel apply. That
could be used for both physical and logical replication. Since the apply
side of logical replication is still being worked on there is a code
dependency there, so a working solution isn't what is needed yet. But
working out the general principles, and any changes needed to the data
content (wal_level) or protocol (pglogical_output), would be useful.
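
One possible principle for point 2, sketched in Python purely for
discussion: treat two transactions as safe to apply in parallel when the
sets of rows they write do not overlap. This rule is an assumption of the
example, not something decided in this thread, and it ignores foreign keys,
unique constraints, DDL and anything else that creates dependencies beyond
row identity. The names conflicts and plan_batches, and the "writes" field,
are hypothetical.

```python
def conflicts(a, b):
    """Two transactions conflict if they write any common (table, key) pair."""
    return not a["writes"].isdisjoint(b["writes"])


def plan_batches(transactions):
    """Split a commit-ordered list into batches that could apply in parallel.

    A transaction joins the current batch only if it conflicts with nothing
    already in it; otherwise a new batch starts, which preserves commit
    order between conflicting transactions.
    """
    batches = []
    for txn in transactions:
        if batches and not any(conflicts(txn, other) for other in batches[-1]):
            batches[-1].append(txn)
        else:
            batches.append([txn])
    return batches


txns = [
    {"xid": 1, "writes": {("accounts", 1)}},
    {"xid": 2, "writes": {("accounts", 2)}},
    {"xid": 3, "writes": {("accounts", 1), ("audit", 7)}},
    {"xid": 4, "writes": {("orders", 9)}},
]
plan = [[t["xid"] for t in batch] for batch in plan_batches(txns)]
print(plan)  # [[1, 2], [3, 4]]
```

Transaction 3 touches the same row as transaction 1, so it starts a new
batch. A rule like this needs the write set to be visible in the stream,
which is where changes to the data content or protocol would come in.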

We already have working multi-master that has been contributed to PGDG, so
contributing that won't gain us anything. There is a lot of code and
pglogical is the most useful piece of code to be carved off and reworked
for submission. The bottleneck is review and commit, not initial
development - which applies both to this area and most others in PostgreSQL.

Having a single network connection between nodes would increase efficiency
but also increase replication latency, so it's not useful in all cases.

I think having some kind of message queue between nodes would also help,
since there are many cases for which we want to transfer data, not just a
replication data flow. For example, consensus on DDL, or MPP query traffic.
But that is open to wider debate.
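
To make the message queue idea a little more concrete, here is a small
in-memory Python sketch in which several kinds of traffic share one queue
and are routed by channel name. Everything in it is hypothetical: the
NodeQueue class, the channel names "replication" and "ddl_vote", the
payload fields and the use of JSON as a wire format are all assumptions
made for illustration, not an existing PostgreSQL or pglogical interface.

```python
import json
from collections import deque


class NodeQueue:
    """In-memory stand-in for a message queue between two nodes (hypothetical)."""

    def __init__(self):
        self.queue = deque()
        self.handlers = {}

    def subscribe(self, channel, handler):
        self.handlers[channel] = handler

    def send(self, channel, payload):
        # Serialize as a wire format would; JSON is an arbitrary choice here.
        self.queue.append(json.dumps({"channel": channel, "payload": payload}))

    def deliver_all(self):
        # Deliver in FIFO order, routing each message by its channel.
        while self.queue:
            msg = json.loads(self.queue.popleft())
            self.handlers[msg["channel"]](msg["payload"])


received = []
q = NodeQueue()
q.subscribe("replication", lambda p: received.append(("replication", p["lsn"])))
q.subscribe("ddl_vote", lambda p: received.append(("ddl_vote", p["accept"])))
q.send("replication", {"lsn": "0/16B3748"})
q.send("ddl_vote", {"accept": True})
q.deliver_all()
print(received)  # [('replication', '0/16B3748'), ('ddl_vote', True)]
```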

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
