Re: [HACKERS] More replication race conditions

2017-09-01 Thread Michael Paquier
On Sat, Sep 2, 2017 at 12:03 AM, Alvaro Herrera wrote:
> Michael Paquier wrote:
>> On Mon, Aug 28, 2017 at 8:25 AM, Michael Paquier wrote:
>> > Today's run has finished with the same failure:
>> >

Re: [HACKERS] More replication race conditions

2017-09-01 Thread Alvaro Herrera
Michael Paquier wrote:
> On Mon, Aug 28, 2017 at 8:25 AM, Michael Paquier wrote:
> > Today's run has finished with the same failure:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dangomushi&dt=2017-08-27%2018%3A00%3A13
> > Attached is a patch to make this

Re: [HACKERS] More replication race conditions

2017-09-01 Thread Simon Riggs
On 31 August 2017 at 12:54, Simon Riggs wrote:
>> The above-described topic is currently a PostgreSQL 10 open item. Simon,
>> since you committed the patch believed to have created it, you own this open
>> item. If some other commit is more relevant or if this does not

Re: [HACKERS] More replication race conditions

2017-08-31 Thread Simon Riggs
On 27 August 2017 at 03:32, Noah Misch wrote:
> On Fri, Aug 25, 2017 at 12:09:00PM +0200, Petr Jelinek wrote:
>> On 24/08/17 19:54, Tom Lane wrote:
>> > sungazer just failed with
>> >
>> > pg_recvlogical exited with code '256', stdout '' and stderr
>> > 'pg_recvlogical: could

Re: [HACKERS] More replication race conditions

2017-08-30 Thread Noah Misch
On Tue, Aug 29, 2017 at 08:44:42PM +0900, Michael Paquier wrote:
> On Mon, Aug 28, 2017 at 8:25 AM, Michael Paquier wrote:
> > Today's run has finished with the same failure:
> >

Re: [HACKERS] More replication race conditions

2017-08-30 Thread Noah Misch
On Sun, Aug 27, 2017 at 02:32:49AM +0000, Noah Misch wrote:
> On Fri, Aug 25, 2017 at 12:09:00PM +0200, Petr Jelinek wrote:
> > On 24/08/17 19:54, Tom Lane wrote:
> > > sungazer just failed with
> > >
> > > pg_recvlogical exited with code '256', stdout '' and stderr
> > > 'pg_recvlogical: could

Re: [HACKERS] More replication race conditions

2017-08-29 Thread Michael Paquier
On Mon, Aug 28, 2017 at 8:25 AM, Michael Paquier wrote:
> Today's run has finished with the same failure:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dangomushi&dt=2017-08-27%2018%3A00%3A13
> Attached is a patch to make this code path wait that the

Re: [HACKERS] More replication race conditions

2017-08-27 Thread Petr Jelinek
On 28/08/17 01:36, Michael Paquier wrote:
> On Sun, Aug 27, 2017 at 6:32 PM, Petr Jelinek wrote:
>> Attached should fix this.
>
> +$node_master->poll_query_until('postgres',
> +"SELECT EXISTS (SELECT 1 FROM pg_replication_slots WHERE slot_name = 'test_slot' AND

Re: [HACKERS] More replication race conditions

2017-08-27 Thread Michael Paquier
On Mon, Aug 28, 2017 at 8:33 AM, Tom Lane wrote:
> Michael Paquier writes:
>> Attached is a patch to make this code path wait that the transaction
>> has been replayed. We could use as well synchronous_commit = apply,
>> but I prefer the solution of

Re: [HACKERS] More replication race conditions

2017-08-27 Thread Michael Paquier
On Sun, Aug 27, 2017 at 6:32 PM, Petr Jelinek wrote:
> Attached should fix this.
+$node_master->poll_query_until('postgres',
+"SELECT EXISTS (SELECT 1 FROM pg_replication_slots WHERE slot_name = 'test_slot' AND active_pid IS NULL)"
+)
+ or die "slot never became

Re: [HACKERS] More replication race conditions

2017-08-27 Thread Tom Lane
Michael Paquier writes:
> Attached is a patch to make this code path wait that the transaction
> has been replayed. We could use as well synchronous_commit = apply,
> but I prefer the solution of this patch with a wait query.
Petr proposed a different patch to fix the

Re: [HACKERS] More replication race conditions

2017-08-27 Thread Michael Paquier
On Sun, Aug 27, 2017 at 3:34 PM, Michael Paquier wrote:
> On Sun, Aug 27, 2017 at 12:03 PM, Tom Lane wrote:
>> contains exactly no means of ensuring that the master's transaction has
>> been replayed on the standby before we check for its results.

Re: [HACKERS] More replication race conditions

2017-08-27 Thread Petr Jelinek
On 27/08/17 04:32, Noah Misch wrote:
> On Fri, Aug 25, 2017 at 12:09:00PM +0200, Petr Jelinek wrote:
>> On 24/08/17 19:54, Tom Lane wrote:
>>> sungazer just failed with
>>>
>>> pg_recvlogical exited with code '256', stdout '' and stderr
>>> 'pg_recvlogical: could not send replication command

Re: [HACKERS] More replication race conditions

2017-08-27 Thread Michael Paquier
On Sun, Aug 27, 2017 at 12:03 PM, Tom Lane wrote:
> And *another* replication test race condition just now:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dangomushi&dt=2017-08-26%2019%3A37%3A08
>
> As best I can interpret this, it's pointing out that this bit in
>

Re: [HACKERS] More replication race conditions

2017-08-26 Thread Tom Lane
And *another* replication test race condition just now:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dangomushi&dt=2017-08-26%2019%3A37%3A08
As best I can interpret this, it's pointing out that this bit in src/test/recovery/t/009_twophase.pl:
$cur_master->psql( 'postgres', "

Re: [HACKERS] More replication race conditions

2017-08-26 Thread Noah Misch
On Fri, Aug 25, 2017 at 12:09:00PM +0200, Petr Jelinek wrote:
> On 24/08/17 19:54, Tom Lane wrote:
> > sungazer just failed with
> >
> > pg_recvlogical exited with code '256', stdout '' and stderr
> > 'pg_recvlogical: could not send replication command "START_REPLICATION SLOT
> > "test_slot"

Re: [HACKERS] More replication race conditions

2017-08-25 Thread Petr Jelinek
On 24/08/17 19:54, Tom Lane wrote:
> sungazer just failed with
>
> pg_recvlogical exited with code '256', stdout '' and stderr 'pg_recvlogical:
> could not send replication command "START_REPLICATION SLOT "test_slot"
> LOGICAL 0/0 ("include-xids" '0', "skip-empty-xacts" '1')": ERROR:
>

[HACKERS] More replication race conditions

2017-08-24 Thread Tom Lane
sungazer just failed with
pg_recvlogical exited with code '256', stdout '' and stderr 'pg_recvlogical: could not send replication command "START_REPLICATION SLOT "test_slot" LOGICAL 0/0 ("include-xids" '0', "skip-empty-xacts" '1')": ERROR: replication slot "test_slot" is active for PID