Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl

2017-07-02 Thread Tom Lane
Craig Ringer writes:
> That's my bad.
> (Insert dark muttering about Perl here).
Yeah, pretty much the only good thing about Perl is it's ubiquitous. But you could say the same of C. Or SQL. For a profession that's under 70 years old, we sure spend a lot of time dealing

Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl

2017-07-02 Thread Craig Ringer
On 3 July 2017 at 05:10, Tom Lane wrote:
> I wrote:
>> Any ideas what's wrong there?
>
> Hah: the answer is that query_hash's split() call is broken.
> "man perlfunc" quoth
>
>     split   Splits the string EXPR into a list of strings and returns that
>             list.

Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl

2017-07-02 Thread Tom Lane
Michael Paquier writes:
> On Mon, Jul 3, 2017 at 7:02 AM, Tom Lane wrote:
>> Anyone have a different view of what to fix here?
> No, this sounds like a good plan. What do you think about the attached?
Oh, that's a good way. I just finished

Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl

2017-07-02 Thread Michael Paquier
(catching up test threads)
On Mon, Jul 3, 2017 at 7:02 AM, Tom Lane wrote:
> I'm now inclined to think that the correct fix is to ensure that we
> run synchronous rep in both directions, rather than to insert delays
> to substitute for that.
Just setting

Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl

2017-07-02 Thread Tom Lane
I wrote:
> Anyway, having vented about that ... it's not very clear to me whether the
> test script is at fault for not being careful to let the slave catch up to
> the master before we promote it (and then deem the master to be usable as
> a slave without rebuilding it first), or whether we

Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl

2017-07-02 Thread Tom Lane
I wrote:
> Any ideas what's wrong there?
Hah: the answer is that query_hash's split() call is broken. "man perlfunc" quoth
    split   Splits the string EXPR into a list of strings and returns that list. By default, empty leading fields are preserved, and empty
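The perlfunc behavior quoted here can be seen in a short Perl sketch. Perl's split() drops trailing empty fields unless a negative LIMIT is passed. The input row below is a hypothetical pipe-separated string, not the actual query_hash code or its real input, and passing -1 is shown only as the documented way to keep trailing fields, not necessarily the fix that was applied.

```perl
use strict;
use warnings;

# Hypothetical pipe-separated row whose last two columns are empty.
my $row = "a|b||";

# Default split: trailing empty fields are discarded.
my @default = split /\|/, $row;

# A negative LIMIT keeps every field, including trailing empty ones.
my @all = split /\|/, $row, -1;

printf "default:  %d fields\n", scalar @default;
printf "limit -1: %d fields\n", scalar @all;
```

With the default call, a caller that expects one field per column sees fewer values than columns whenever the last columns are empty.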

Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl

2017-07-02 Thread Tom Lane
I wrote:
> The reporting critters are all on the slow side, so I suspected
> a timing problem, especially since it only started to show up
> after changing pg_ctl's timing behavior. I can't reproduce it
> locally on unmodified sources, but I could after putting my thumb
> on the scales like this:

Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl

2017-07-02 Thread Tom Lane
Alvaro Herrera writes:
> Tom Lane wrote:
>> * Some effort should be put into emitting text to the log showing
>> what's going on, eg print("Now london is master."); as appropriate.
> Check. Not "print" though; I think using note(" .. ") (from Test::More)
> is more

Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl

2017-07-02 Thread Alvaro Herrera
Tom Lane wrote:
> I'd kind of like to fix it now, so I can reason in a less confused way
> about the actual problem.
OK, no objections here.
> Last night I didn't have a clear idea of how
> to make it better, but what I'm thinking this morning is:
>
> * Naming the underlying server objects

Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl

2017-07-02 Thread Tom Lane
Alvaro Herrera writes:
> Tom Lane wrote:
>> Part of the reason I'm confused is that the programming technique
>> being used in 009_twophase.pl, namely doing
>>     ($node_master, $node_slave) = ($node_slave, $node_master);
>> and then working with the reversed variable

Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl

2017-07-02 Thread Alvaro Herrera
Tom Lane wrote:
> Part of the reason I'm confused is that the programming technique
> being used in 009_twophase.pl, namely doing
>
>     ($node_master, $node_slave) = ($node_slave, $node_master);
>
> and then working with the reversed variable names, is ENTIRELY TOO CUTE
> FOR ITS OWN GOOD.
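A minimal Perl sketch of why the swap idiom is confusing. The hash references and their name fields are hypothetical stand-ins for the test's node objects, assumed here only for illustration. The real script works with PostgresNode objects.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the node objects; each carries its own fixed name.
my $node_master = { name => 'master' };
my $node_slave  = { name => 'slave' };

# The idiom quoted above: swap which variable refers to which node.
($node_master, $node_slave) = ($node_slave, $node_master);

# The variable names now disagree with the names the objects carry.
print "\$node_master is the node named '$node_master->{name}'\n";
print "\$node_slave is the node named '$node_slave->{name}'\n";
```

After the swap, anything that reports an object's own name (log output, for instance) contradicts the variable name a reader sees in the script.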