On 10/07/2017 10:42 PM, Andres Freund wrote:
Hi,

On 2017-10-07 22:39:09 +0300, konstantin knizhnik wrote:
In our sharded cluster project we are trying to use logical replication to 
provide HA (maintaining redundant shard copies).
Asynchronous logical replication does not make much sense in the context of HA, 
which is why we are trying to use synchronous logical replication.
Unfortunately, it shows very bad performance. With 50 shards and a redundancy 
level of 1 (just one copy), the cluster is 20 times slower than without logical 
replication.
With asynchronous replication it is "only" two times slower.

As far as I understand, the reason for such bad performance is that the synchronous 
replication mechanism was originally developed for streaming replication, where all 
replicas have the same content and LSNs. When it is used for logical replication, it 
behaves very inefficiently. A commit has to wait for confirmations from all receivers 
listed in "synchronous_standby_names". So we are waiting not only for our own single 
logical replication standby, but for all the other standbys as well. The number of 
synchronous standbys is equal to the number of shards divided by the number of nodes. 
To provide a uniform distribution, the number of shards should be much larger than the 
number of nodes; for example, for 10 nodes we usually create 100 shards. As a result we 
get awful performance, and a stall in any one replication channel blocks all backends.
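
The arithmetic in the paragraph above can be sketched roughly as follows. This 
is only a toy model of the behaviour as described there, not PostgreSQL code: 
the function names and the latency figures are invented for illustration, and 
it assumes that a commit really does wait for every listed receiver.

```python
# Toy model of the situation described above; NOT PostgreSQL code.
# All names and latency numbers here are hypothetical.

def sync_standbys_per_node(shards, nodes, redundancy=1):
    """Shards hosted on each node, times the number of copies of each shard."""
    return (shards // nodes) * redundancy

def commit_wait_ms(ack_latencies_ms):
    """Assumption: a commit waits for every listed receiver,
    so it takes as long as the slowest acknowledgement."""
    return max(ack_latencies_ms)

standbys = sync_standbys_per_node(shards=100, nodes=10)

# Nine fast replication channels and one slow one (made-up figures).
latencies = [1.0] * (standbys - 1) + [50.0]

print("synchronous standbys per node:", standbys)
print("commit wait (ms):", commit_wait_ms(latencies))
```

Under this assumption a single slow channel sets the commit latency for every 
backend on the node, regardless of which shard the transaction touched.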

So my question is whether my understanding is correct and synchronous logical 
replication cannot be used efficiently in this manner.
If so, the next question is how difficult it would be to make the synchronous 
replication mechanism more efficient for logical replication, and are there any 
plans to work in this direction?

This seems to be a question that a) is about a commercial project we
don't know much about and b) hasn't received a lot of investigation.

Sorry if I was not clear.
The question was about the logical replication mechanism in the mainstream 
version of Postgres.
I think that most people are using asynchronous logical replication, and 
synchronous LR is something exotic that is not well tested or investigated.
It would be great if I am wrong :)

Concerning our sharded cluster (pg_shardman): it is not a commercial product 
yet; it is in the development phase.
We are going to open its sources when it is more or less stable.
But unlike multimaster, this sharded cluster is mostly built from existing 
components: pg_pathman + postgres_fdw + logical replication.
So we are just trying to combine them all into an integrated system.
But currently the most obscure point is logical replication.

And the main goal of my e-mail was to learn the opinion of the authors and users 
of LR on whether it is a good idea to use LR to provide fault tolerance in a 
sharded cluster.
Or are some other approaches, for example sharding with redundancy or using 
streaming replication, preferable?


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


