Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-07-29 Thread Tom Lane
Michael Paquier michael.paqu...@gmail.com writes: hamster has not complained for a couple of weeks now, and the issue was reproducible every 4~6 days: http://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=hamster&br=HEAD Hence let's consider the issue as resolved. Nah, I'm afraid not. We

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-07-29 Thread Michael Paquier
On Sun, Jun 21, 2015 at 7:06 AM, Michael Paquier michael.paqu...@gmail.com wrote: And, we get a failure: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&dt=2015-06-20%2017%3A59%3A01 I am not sure why buildfarm runs make it more easily reproducible; one of the reasons may be the

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-20 Thread Michael Paquier
On Sat, Jun 20, 2015 at 7:06 AM, Michael Paquier michael.paqu...@gmail.com wrote: As far as the rest of this patch goes, it seems like it could be made less invasive if the logs got dumped into a subdirectory of tmp_check rather than adding another top-level directory that has to be cleaned?

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-20 Thread Michael Paquier
On Sat, Jun 20, 2015 at 6:53 AM, Michael Paquier michael.paqu...@gmail.com wrote: On Sat, Jun 20, 2015 at 12:44 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@anarazel.de writes: On 2015-06-19 11:16:18 -0400, Robert Haas wrote: On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Michael Paquier
On Sat, Jun 20, 2015 at 12:44 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@anarazel.de writes: On 2015-06-19 11:16:18 -0400, Robert Haas wrote: On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wonder whether it's such a good idea for the postmaster to give

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Michael Paquier
On Fri, Jun 19, 2015 at 11:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: Michael Paquier michael.paqu...@gmail.com writes: Attached is a patch fixing those problems and improving the log facility as it really helped me out with those issues. The simplest fix would be to include the -w switch

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Michael Paquier
On Sat, Jun 20, 2015 at 12:07 AM, Tom Lane t...@sss.pgh.pa.us wrote: Michael Paquier michael.paqu...@gmail.com writes: Now if we look at RewindTest.pm, there is the following code: if ($test_master_datadir) { system pg_ctl -D

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Michael Paquier
On Thu, Jun 18, 2015 at 3:52 PM, Michael Paquier wrote: I think that it would be useful as well to improve the buildfarm output. Thoughts? And after running the tests more or less 6~7 times in a row on a PI, I have been able to trigger the problem and I think that I have found its origin.

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Tom Lane
Michael Paquier michael.paqu...@gmail.com writes: Attached is a patch fixing those problems and improving the log facility as it really helped me out with those issues. The simplest fix would be to include the -w switch missing in the tests of pg_rewind and pg_ctl though. I agree with adding

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Robert Haas
On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wonder whether it's such a good idea for the postmaster to give up waiting before all children are gone (postmaster.c:1722 in HEAD). I doubt it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Andres Freund
On 2015-06-19 11:16:18 -0400, Robert Haas wrote: On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wonder whether it's such a good idea for the postmaster to give up waiting before all children are gone (postmaster.c:1722 in HEAD). I doubt it. Seconded. It's pretty

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Tom Lane
Andres Freund and...@anarazel.de writes: On 2015-06-19 11:16:18 -0400, Robert Haas wrote: On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wonder whether it's such a good idea for the postmaster to give up waiting before all children are gone (postmaster.c:1722 in HEAD).

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Alvaro Herrera
Tom Lane wrote: Andres Freund and...@anarazel.de writes: On 2015-06-19 11:16:18 -0400, Robert Haas wrote: On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wonder whether it's such a good idea for the postmaster to give up waiting before all children are gone

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Andres Freund
On 2015-06-19 13:56:21 -0300, Alvaro Herrera wrote: We discussed this when that patch got in (82233ce7ea42d6b). The reason for not waiting, it was argued, is that the most likely reason for those processes not to have already gone away by the time we send SIGKILL was that they are stuck

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Tom Lane
Andres Freund and...@anarazel.de writes: On 2015-06-19 13:56:21 -0300, Alvaro Herrera wrote: We discussed this when that patch got in (82233ce7ea42d6b). The reason for not waiting, it was argued, is that the most likely reason for those processes not to have already gone away by the time we

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Alvaro Herrera
Alvaro Herrera wrote: We discussed this when that patch got in (82233ce7ea42d6b). The reason for not waiting, it was argued, is that the most likely reason for those processes not to have already gone away by the time we send SIGKILL was that they are stuck somewhere in the kernel, and so we

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-19 Thread Alvaro Herrera
Tom Lane wrote: Andres Freund and...@anarazel.de writes: On 2015-06-19 13:56:21 -0300, Alvaro Herrera wrote: We discussed this when that patch got in (82233ce7ea42d6b). The reason for not waiting, it was argued, is that the most likely reason for those processes not to have already gone

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-18 Thread Michael Paquier
On Mon, Jun 15, 2015 at 8:26 AM, Michael Paquier wrote: hamster is legendarily slow and has a slow disk, which improves the chances of catching race conditions, and it is the only slow buildfarm machine enabling the TAP tests (by comparison, dangomushi has never failed with the TAP tests), hence I

[HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-14 Thread Tom Lane
Buildfarm member hamster has failed a pretty significant fraction of its recent runs in the BinInstallCheck step: http://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=hamster&br=HEAD Since other critters aren't equally distressed, it seems likely that this is just an out-of-disk-space type

Re: [HACKERS] The real reason why TAP testing isn't ready for prime time

2015-06-14 Thread Michael Paquier
On Mon, Jun 15, 2015 at 3:37 AM, Tom Lane t...@sss.pgh.pa.us wrote: Buildfarm member hamster has failed a pretty significant fraction of its recent runs in the BinInstallCheck step: http://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=hamster&br=HEAD Since other critters aren't equally