Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-11-24 Thread Christoph Berg
Hi, I'm still seeing trouble with test_shm_mq on mipsel (9.4 rc1): https://buildd.debian.org/status/fetch.php?pkg=postgresql-9.4&arch=mipsel&ver=9.4~rc1-1&stamp=1416547779 mips had the problem as well in the past (9.4 beta3):

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-11-24 Thread Robert Haas
On Mon, Nov 24, 2014 at 9:36 AM, Christoph Berg c...@df7cb.de wrote: I'm still seeing trouble with test_shm_mq on mipsel (9.4 rc1): Boy, that test has certainly caught its share of bugs, and not in the places I would have expected. The last round of wrestling with this had to do with working

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-11-24 Thread Christoph Berg
Re: Robert Haas 2014-11-24 ca+tgmoacnppmdgg4n14u2bjujndnmou8xxhhpmvo+0u92ck...@mail.gmail.com https://buildd.debian.org/status/fetch.php?pkg=postgresql-9.4&arch=mipsel&ver=9.4~rc1-1&stamp=1416547779 mips had the problem as well in the past (9.4 beta3):

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-10-03 Thread Robert Haas
On Wed, Oct 1, 2014 at 11:10 AM, Robert Haas robertmh...@gmail.com wrote: As far as I can tell, it's configured to run everything. I just checked, though, and found it wedged again. I'm not sure whether it was the same problem, though; I ended up just killing all of the postgres processes to

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-10-03 Thread Robert Haas
On Fri, Oct 3, 2014 at 1:09 PM, Robert Haas robertmh...@gmail.com wrote: Further debugging reveals that sigusr1_handler() gets called repeatedly, to start autovacuum workers, and it keeps waking up and starting them. But that doesn't cause the background workers to get started either, because

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-10-03 Thread Andres Freund
On 2014-10-03 14:38:10 -0400, Robert Haas wrote: On Fri, Oct 3, 2014 at 1:09 PM, Robert Haas robertmh...@gmail.com wrote: Further debugging reveals that sigusr1_handler() gets called repeatedly, to start autovacuum workers, and it keeps waking up and starting them. But that doesn't cause

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-10-01 Thread Andres Freund
On 2014-09-29 14:46:20 -0400, Robert Haas wrote: This happened again, and I investigated further. Uh. Interestingly anole just succeeded twice: http://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=anole&br=REL9_4_STABLE I plan to commit the mask/unmask patch regardless, but it's curious.

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-10-01 Thread Robert Haas
On Wed, Oct 1, 2014 at 7:00 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-09-29 14:46:20 -0400, Robert Haas wrote: This happened again, and I investigated further. Uh. Interestingly anole just succeeded twice:

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-10-01 Thread Andres Freund
On 2014-10-01 10:45:13 -0400, Robert Haas wrote: On Wed, Oct 1, 2014 at 7:00 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-09-29 14:46:20 -0400, Robert Haas wrote: This happened again, and I investigated further. Uh. Interestingly anole just succeeded twice:

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-10-01 Thread Robert Haas
On Wed, Oct 1, 2014 at 10:50 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-10-01 10:45:13 -0400, Robert Haas wrote: On Wed, Oct 1, 2014 at 7:00 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-09-29 14:46:20 -0400, Robert Haas wrote: This happened again, and I investigated

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-09-29 Thread Andres Freund
On 2014-09-29 14:46:20 -0400, Robert Haas wrote: On Fri, May 9, 2014 at 10:18 AM, Robert Haas robertmh...@gmail.com wrote: On Sat, May 3, 2014 at 4:31 AM, Dave Page dave.p...@enterprisedb.com wrote: Hamid@EDB; Can you please have someone configure anole to build git head as well as the

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-09-29 Thread Robert Haas
On Mon, Sep 29, 2014 at 2:52 PM, Andres Freund and...@2ndquadrant.com wrote: This happened again, and I investigated further. It looks like the postmaster knows full well that it's supposed to start more bgworkers: the ones that never get started are in the postmaster's BackgroundWorkerList,

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-09-29 Thread Andres Freund
On 2014-09-29 15:24:55 -0400, Robert Haas wrote: On Mon, Sep 29, 2014 at 2:52 PM, Andres Freund and...@2ndquadrant.com wrote: If that theory is true, wouldn't things get unstuck every time a new connection comes in? Or 60 seconds have passed? That's not to say this isn't wrong, but still?

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-09-29 Thread Andres Freund
On 2014-09-29 14:46:20 -0400, Robert Haas wrote: On Fri, May 9, 2014 at 10:18 AM, Robert Haas robertmh...@gmail.com wrote: On Sat, May 3, 2014 at 4:31 AM, Dave Page dave.p...@enterprisedb.com wrote: Hamid@EDB; Can you please have someone configure anole to build git head as well as the

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-09-29 Thread Robert Haas
On Mon, Sep 29, 2014 at 3:37 PM, Andres Freund and...@2ndquadrant.com wrote: Yea :(. Note how signals are blocked in all the signal handlers and only unblocked for a very short time (the sleep). (stare at random shit for far too long) Ah. DetermineSleepTime(), which is called while signals

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-09-29 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Mon, Sep 29, 2014 at 3:37 PM, Andres Freund and...@2ndquadrant.com wrote: Ah. DetermineSleepTime(), which is called while signals are unblocked!, modifies BackgroundWorkerList. Previously that only iterated the list, without modifying it. That's

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-09-29 Thread Andres Freund
On 2014-09-29 16:16:24 -0400, Robert Haas wrote: If you can manually run stuff on that machine, it'd be rather helpful if you could put a PG_SETMASK(BlockSig);...PG_SETMASK(UnBlockSig); in the HaveCrashedWorker() loop. I'd do it the other way around, and adjust ServerLoop to put the
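
The mask/unmask idea discussed in this message can be illustrated with a minimal, standalone C sketch. It is not the PostgreSQL patch: `WorkerNode`, `worker_list`, `on_sigusr1`, `add_worker` and `remove_crashed_workers` are hypothetical names invented for illustration, and plain `sigprocmask()` stands in for the `PG_SETMASK(BlockSig)` / `PG_SETMASK(UnBlockSig)` pair mentioned above. The point it shows is only that a list which a signal handler also walks should be modified while the signal is blocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an entry in a list of background workers. */
typedef struct WorkerNode
{
    int         id;
    int         crashed;
    struct WorkerNode *next;
} WorkerNode;

static WorkerNode *worker_list = NULL;
static volatile sig_atomic_t handler_runs = 0;

/* Signal handler that walks the list, so it must never see a half-unlinked node. */
static void
on_sigusr1(int signo)
{
    const WorkerNode *w;

    (void) signo;
    for (w = worker_list; w != NULL; w = w->next)
        ;                       /* a real handler would inspect each worker */
    handler_runs++;
}

/* Push a worker onto the list; in this sketch it is only called during setup. */
static void
add_worker(int id, int crashed)
{
    WorkerNode *w = malloc(sizeof(WorkerNode));

    if (w == NULL)
        abort();
    w->id = id;
    w->crashed = crashed;
    w->next = worker_list;
    worker_list = w;
}

/* Unlink crashed workers with SIGUSR1 blocked; returns how many were removed. */
static int
remove_crashed_workers(void)
{
    sigset_t    block, saved;
    WorkerNode **link = &worker_list;
    int         removed = 0;
    int         rc;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    rc = sigprocmask(SIG_BLOCK, &block, &saved);    /* "mask" */
    assert(rc == 0);

    while (*link != NULL)
    {
        WorkerNode *w = *link;

        if (w->crashed)
        {
            *link = w->next;    /* the handler cannot run between these steps */
            free(w);
            removed++;
        }
        else
            link = &w->next;
    }

    rc = sigprocmask(SIG_SETMASK, &saved, NULL);    /* "unmask": restore old mask */
    assert(rc == 0);
    (void) rc;
    return removed;
}
```

A signal that arrives while the list is being edited stays pending and is delivered once the saved mask is restored, so the handler only ever sees a consistent list.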

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-09-29 Thread Robert Haas
On Mon, Sep 29, 2014 at 4:20 PM, Andres Freund and...@2ndquadrant.com wrote: Let's just check in the fix. It'll either fix anole or not, but we should fix the bug you found either way. Right. Are you going to do it? I can, but it'll be tomorrow. I'm neck deep in another bug right now. I

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-09-29 Thread Alvaro Herrera
Andres Freund wrote: I'm generally baffled at all the stuff postmaster does in signal handlers... ProcessConfigFile(), load_hba() et al. It's all done with signals disabled, but still. As far as I recall, the rationale for why this is acceptable is that the whole of postmaster is run with

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-09-29 Thread Andres Freund
On 2014-09-29 18:44:34 -0300, Alvaro Herrera wrote: Andres Freund wrote: I'm generally baffled at all the stuff postmaster does in signal handlers... ProcessConfigFile(), load_hba() et al. It's all done with signals disabled, but still. As far as I recall, the rationale for why this is

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-09-29 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: On 2014-09-29 18:44:34 -0300, Alvaro Herrera wrote: As far as I recall, the rationale for why this is acceptable is that the whole of postmaster is run with signals blocked; they are only unblocked during the sleeping select(). Yea, I wrote that
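
The rationale quoted here (signals stay blocked everywhere except during the sleeping select()) can be sketched in plain C as follows. This is an assumption-laden illustration, not postmaster code: `got_sigusr1`, `note_sigusr1`, `loop_setup` and `sleep_unblocked` are hypothetical names, and only SIGUSR1 is handled.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/select.h>

static volatile sig_atomic_t got_sigusr1 = 0;

/* The handler only records that the signal arrived. */
static void
note_sigusr1(int signo)
{
    (void) signo;
    got_sigusr1 = 1;
}

/* Install the handler and block SIGUSR1 for normal execution; 0 on success. */
static int
loop_setup(void)
{
    struct sigaction sa;
    sigset_t    block;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = note_sigusr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    return sigprocmask(SIG_BLOCK, &block, NULL);
}

/* Sleep for up to timeout_usec; this is the only window where the handler may run. */
static void
sleep_unblocked(long timeout_usec)
{
    sigset_t    unblock, saved;
    struct timeval tv;

    sigemptyset(&unblock);
    sigaddset(&unblock, SIGUSR1);
    tv.tv_sec = timeout_usec / 1000000L;
    tv.tv_usec = timeout_usec % 1000000L;

    sigprocmask(SIG_UNBLOCK, &unblock, &saved);     /* pending signals fire here */
    (void) select(0, NULL, NULL, NULL, &tv);        /* may return early with EINTR */
    sigprocmask(SIG_SETMASK, &saved, NULL);         /* back to blocked */
}
```

Under this scheme the handler never interrupts ordinary loop code, which is why any work done while signals are unblocked has to avoid touching state the handlers also modify.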

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-05-12 Thread Michael Paquier
On Sat, May 10, 2014 at 6:22 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: The test_shm_mq regression tests hung on this machine this morning. It looks like hamster may have a repeatable issue there as well, since the last set of DSM code changes. Yeah, this

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-05-09 Thread Robert Haas
On Sat, May 3, 2014 at 4:31 AM, Dave Page dave.p...@enterprisedb.com wrote: Hamid@EDB; Can you please have someone configure anole to build git head as well as the other branches? Thanks. The test_shm_mq regression tests hung on this machine this morning. Hamid was able to give me access to log

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-05-09 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: The test_shm_mq regression tests hung on this machine this morning. It looks like hamster may have a repeatable issue there as well, since the last set of DSM code changes. regards, tom lane

Re: [HACKERS] test_shm_mq failing on anole (was: Sending out a request for more buildfarm animals?)

2014-05-09 Thread Tom Lane
I wrote: It looks like hamster may have a repeatable issue there as well, since the last set of DSM code changes. Ah, scratch that --- on closer inspection it looks like both failures probably trace to out-of-disk-space. regards, tom lane