Re: [HACKERS] Platform-dependent(?) failure in timeout handling

2013-11-27 Thread Martijn van Oosterhout
On Tue, Nov 26, 2013 at 06:50:28PM -0500, Tom Lane wrote:
> I believe the reason for this is the mechanism that I speculated about in that previous thread. The platform is blocking SIGALRM while it executes handle_sig_alarm(), and that calls LockTimeoutHandler() which does kill(MyProcPid,

Re: [HACKERS] Platform-dependent(?) failure in timeout handling

2013-11-27 Thread Andres Freund
Hi,

On 2013-11-26 18:50:28 -0500, Tom Lane wrote:
> I don't know how many platforms block signals during handlers in this way, but I'm seeing it on Linux (RHEL6.5 to be exact) and we know that at least OpenBSD acts likewise, so that's a pretty darn large chunk of the world.

Just as a datapoint,

Re: [HACKERS] Platform-dependent(?) failure in timeout handling

2013-11-27 Thread Kevin Grittner
Tom Lane <t...@sss.pgh.pa.us> wrote:
> 3. Establish a coding rule that if you catch an error with PG_TRY() and don't re-throw it, you have to unblock signals in your PG_CATCH block.

Could that be done in the PG_END_TRY macro?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise

Re: [HACKERS] Platform-dependent(?) failure in timeout handling

2013-11-27 Thread Tom Lane
Kevin Grittner <kgri...@ymail.com> writes:
> Tom Lane <t...@sss.pgh.pa.us> wrote:
>> 3. Establish a coding rule that if you catch an error with PG_TRY() and don't re-throw it, you have to unblock signals in your PG_CATCH block.
> Could that be done in the PG_END_TRY macro?

Interesting idea ... [
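For readers following the thread, here is a minimal standalone sketch of the mechanism under discussion. It is not PostgreSQL code. It assumes a POSIX platform that blocks a signal while its handler runs (no SA_NODEFER), and a recovery point set with sigsetjmp(..., 0) so that the jump does not restore the signal mask. The names recovery_point, escaping_handler, signal_is_blocked, unblock_signal and blocked_after_escaping_handler are hypothetical and exist only for this illustration. recovery_point stands in for the jump buffer behind a PG_TRY block, and unblock_signal stands in for whatever a PG_CATCH block (or the PG_END_TRY macro) would have to do under the proposed rule.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Hypothetical stand-in for the jump buffer behind a PG_TRY block. */
static sigjmp_buf recovery_point;

/* Leaves the handler with a non-local jump, as an error thrown from
 * inside a signal handler would. */
static void escaping_handler(int signo)
{
    (void) signo;
    siglongjmp(recovery_point, 1);
}

/* Returns 1 if signo is blocked in the current signal mask, else 0. */
static int signal_is_blocked(int signo)
{
    sigset_t current;
    /* With a NULL new set, sigprocmask only reports the current mask. */
    int rc = sigprocmask(SIG_BLOCK, NULL, &current);

    assert(rc == 0);
    (void) rc;
    return sigismember(&current, signo);
}

/* The cleanup step the proposed rule asks for: remove signo from the
 * mask. Returns 0 on success. */
static int unblock_signal(int signo)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, signo);
    return sigprocmask(SIG_UNBLOCK, &set, NULL);
}

/* Installs escaping_handler for signo, raises the signal, and reports
 * whether signo is still blocked after the jump out of the handler.
 * Returns 1 if blocked, 0 if not, -1 if setup failed. */
static int blocked_after_escaping_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = escaping_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* no SA_NODEFER: signo is blocked in the handler */
    if (sigaction(signo, &sa, NULL) != 0)
        return -1;
    if (unblock_signal(signo) != 0)     /* start from a known unblocked state */
        return -1;

    /* savemask = 0: siglongjmp will not restore the signal mask. */
    if (sigsetjmp(recovery_point, 0) == 0)
    {
        raise(signo);           /* handler runs and jumps; raise() does not return */
        return -1;
    }
    return signal_is_blocked(signo);
}
```

On a platform that behaves as described in this thread, blocked_after_escaping_handler(SIGALRM) should report that SIGALRM is still blocked, and a later unblock_signal(SIGALRM) clears it. Passing a nonzero savemask to sigsetjmp would make siglongjmp restore the mask instead. Whether that alternative or a PG_END_TRY change suits PostgreSQL is the open question in the thread, and this sketch does not settle it.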

[HACKERS] Platform-dependent(?) failure in timeout handling

2013-11-26 Thread Tom Lane
Dan Wood sent me off-list the test case mentioned in http://www.postgresql.org/message-id/CAPweHKfExEsbACRXQTBdu_4QxhHk-Cic_iwmbh5XHo_0Z=q...@mail.gmail.com I've been poking at it, and one of the failure modes I'm seeing is that all the backends hang up without crashing. I thought at first it