On Tue, Nov 26, 2013 at 06:50:28PM -0500, Tom Lane wrote:
> I believe the reason for this is the mechanism that I speculated about in
> that previous thread. The platform is blocking SIGALRM while it executes
> handle_sig_alarm(), and that calls LockTimeoutHandler() which does
> "kill(MyProcPid, SIGINT)", and that SIGINT is being delivered immediately
> (or at least before we can get out of handle_sig_alarm). So now the
> platform blocks SIGINT, too, and calls StatementCancelHandler(), which
> proceeds to longjmp out of the whole signal-handling call stack. So
> the signal unblocking that would have happened after the handlers
> returned doesn't happen. In simpler cases we don't see an issue because
> the longjmp returns to the sigsetjmp(foo,1) call in postgres.c, which
> will result in restoring the signal mask that was active at that stack
> level, so we're all good. However, PG_TRY() uses sigsetjmp(foo,0),
> which means that no signal mask restoration happens if we catch the
> longjmp and don't ever re-throw it. Which is exactly what happens in
> plpgsql because of the EXCEPTION clause in the above example.
>
> I don't know how many platforms block signals during handlers in this way,
> but I'm seeing it on Linux (RHEL6.5 to be exact) and we know that at least
> OpenBSD acts likewise, so that's a pretty darn large chunk of the world.
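
For reference, the mask behaviour described above can be reproduced outside the backend. This is only a minimal sketch, assuming a POSIX system and a single-threaded process; SIGUSR1 stands in for SIGALRM/SIGINT, and the names on_usr1() and sigusr1_blocked_after_jump() are made up for illustration, not taken from the PostgreSQL source.

```c
#define _POSIX_C_SOURCE 200809L

#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf jump_env;

/* Hypothetical handler: escapes via siglongjmp instead of returning,
 * so the kernel never gets to undo the "blocked during handler" state. */
static void
on_usr1(int signo)
{
    (void) signo;
    siglongjmp(jump_env, 1);
}

/*
 * Returns 1 if SIGUSR1 is still blocked after jumping out of its handler,
 * 0 if it is not blocked, -1 on error.
 * savemask is passed straight to sigsetjmp(): nonzero saves the signal
 * mask for siglongjmp to restore, zero does not.
 */
int
sigusr1_blocked_after_jump(int savemask)
{
    struct sigaction sa;
    sigset_t    unblock;
    sigset_t    current;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);   /* no SA_NODEFER: SIGUSR1 is blocked
                                 * while its own handler runs */
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* Start from a known state with SIGUSR1 unblocked. */
    sigemptyset(&unblock);
    sigaddset(&unblock, SIGUSR1);
    if (sigprocmask(SIG_UNBLOCK, &unblock, NULL) != 0)
        return -1;

    if (sigsetjmp(jump_env, savemask) == 0)
    {
        raise(SIGUSR1);         /* handler runs and jumps back above */
        return -1;              /* not reached if the handler ran */
    }

    /* Back from the handler: inspect the current signal mask. */
    if (sigprocmask(SIG_SETMASK, NULL, &current) != 0)
        return -1;
    return sigismember(&current, SIGUSR1);
}
```

On a platform that behaves as described, sigusr1_blocked_after_jump(1) should come back 0 (mask restored, like the sigsetjmp(foo,1) in postgres.c) and sigusr1_blocked_after_jump(0) should come back 1 (signal left blocked, like the PG_TRY() case).
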
Isn't this why sigsetjmp/siglongjmp were invented? Is there a situation
where you don't want the signal mask restored?

BTW, seems on BSD systems sigsetjmp == setjmp:

http://www.gnu.org/software/libc/manual/html_node/Non_002dLocal-Exits-and-Signals.html

Have a nice day,
-- 
Martijn van Oosterhout <klep...@svana.org> http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts. -- Arthur Schopenhauer