qihua wu <staywith...@gmail.com> writes:
> We are using Patroni to set up 1 primary and 5 slaves, with ANY 2 (*) so
> that a transaction commits once any 2 standbys have received the WAL. If
> there is a network partition between the primary and the slaves, the commit
> hangs from the user's perspective, but it has actually been done locally
> and is just waiting for a remote ACK that cannot arrive because of the
> network split. If Patroni then promotes a slave to primary, we will lose
> data. What do you think of a different design: first wait for the remote
> ACK, and then commit locally? This would fail only if the local commit
> failed, and a local commit failure is much rarer than a network partition
> in a cloud environment: it only happens on an I/O error or when the disk
> is full. So I am considering the possibility of switching the order: first
> wait for the remote ACK, and then commit locally.

That just gives you a different set of failure modes.  It'd be
particularly bad if you have more than one standby, because you could
easily get into a situation where *none* of the nodes represent truth.

                        regards, tom lane
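
One possible reading of the concern in the reply above, as a toy sketch
only. It is not part of the original message and does not model how
PostgreSQL actually streams or orders WAL. The Node class, the
remote_first_commit function, and the crash_before_local flag are all
hypothetical names made up for illustration. The sketch assumes the
proposed ordering (ship to standbys first, commit locally last) with a
quorum of 2 out of 5 standbys.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    wal: list = field(default_factory=list)  # records this node has received


def remote_first_commit(primary, standbys, record, reachable, quorum=2,
                        crash_before_local=False):
    """Toy model of the proposed ordering: ship first, commit locally last."""
    acks = 0
    for standby in standbys:
        if standby.name in reachable:
            standby.wal.append(record)  # standby now holds the record
            acks += 1
    if acks < quorum or crash_before_local:
        # The primary never commits, yet some standbys already hold the record.
        return False
    primary.wal.append(record)
    return True


def run_scenario():
    primary = Node("primary")
    standbys = [Node(f"s{i}") for i in range(1, 6)]
    # tx1: a partition leaves only s1 reachable, so the quorum is never met.
    ok1 = remote_first_commit(primary, standbys, "tx1", reachable={"s1"})
    # tx2: s2 and s3 acknowledge, but the primary crashes before committing.
    ok2 = remote_first_commit(primary, standbys, "tx2",
                              reachable={"s2", "s3"},
                              crash_before_local=True)
    return primary, standbys, (ok1, ok2)


primary, standbys, results = run_scenario()
for node in [primary, *standbys]:
    print(node.name, node.wal)
```

In this toy run neither transaction is ever confirmed to the client, yet
s1 holds tx1, s2 and s3 hold tx2, and the primary holds neither. Which
history survives would depend on which node gets promoted, and no single
node's contents can be assumed authoritative.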

