Andreas Pflug wrote:
> This patch reenables pg_terminate_backend, allowing (superuser only, of
> course) to terminate a backend. As taken from the discussion some weeks
> earlier, SIGTERM seems to be used quite widely, without a report of
> misbehavior so while the code path is officially not too well tested,
> in practice it's working ok and helpful.

I thought we had a discussion that the places we accept SIGTERM might be
places that can exit if the postmaster is shutting down, but might not
be places we can exit if the postmaster continues running, e.g. while
holding locks. Have you checked every place we honor SIGTERM to confirm
that it is safe to exit there? I know Tom had concerns about that.

Looking at ProcessInterrupts() and friends, when it is called with
QueryCancelPending set, it does elog(ERROR) and longjmps out of elog,
and that cleans up some stuff. The problem with SIGTERM/ProcDiePending
is that it just does a FATAL, and I assume that doesn't do the same
cleanups that elog(ERROR) does to cancel a query.

Ideally we would use another signal number that would do a query
cancel; then, up in the recovery code after the longjmp, once we had
reset everything, we could exit. The problem, I think, is that we don't
have another signal available for use. I see this in postgres.c:

    pqsignal(SIGHUP, SigHupHandler);            /* set flag to read config file */
    pqsignal(SIGINT, StatementCancelHandler);   /* cancel current query */
    pqsignal(SIGTERM, die);                     /* cancel current query and exit */
    pqsignal(SIGQUIT, quickdie);                /* hard crash time */
    pqsignal(SIGALRM, handle_sig_alarm);        /* timeout conditions */

    /*
     * Ignore failure to write to frontend. Note: if frontend closes
     * connection, we will notice it and exit cleanly when control next
     * returns to outer loop. This seems safer than forcing exit in the
     * midst of output during who-knows-what operation...
     */
    pqsignal(SIGPIPE, SIG_IGN);
    pqsignal(SIGUSR1, CatchupInterruptHandler);
    pqsignal(SIGUSR2, NotifyInterruptHandler);
    pqsignal(SIGFPE, FloatExceptionHandler);

It would be neat if we could do a combined Cancel/Terminate signal, but
signals don't work that way. Any ideas on how we can do a combined
cancel/terminate?

Do we have a shared area that both the postmaster and the backends can see?
Could we set a flag when the postmaster is shutting down, so that when a
backend receives a SIGTERM it could either shut down right away or do
the cancel and then shut down? I don't think we can do query cancel for
server-wide backend shutdowns; that should be as quick as possible.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  email@example.com                    |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
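
To make the flag idea above concrete, here is a rough standalone sketch of the decision a backend might make on SIGTERM. It is not PostgreSQL code: postmaster_shutting_down, term_pending, sketch_die, install_sketch_handler, check_term_request and term_action_name are all hypothetical names, and an ordinary process-local variable stands in for whatever shared area the postmaster and backends would actually use. The handler only records the request; the choice between "exit now" and "cancel, then exit" is made later at a safe point.

```c
#include <assert.h>
#include <signal.h>

/* HYPOTHETICAL: stands in for a flag in shared memory that the postmaster
 * would set before signalling its children in a server-wide shutdown. */
volatile sig_atomic_t postmaster_shutting_down = 0;

/* Set by the SIGTERM handler, examined later at a safe point. */
volatile sig_atomic_t term_pending = 0;

typedef enum
{
    TERM_NONE,              /* no SIGTERM seen */
    TERM_EXIT_NOW,          /* server-wide shutdown: exit immediately */
    TERM_CANCEL_THEN_EXIT   /* one backend terminated: cancel, clean up, exit */
} TermAction;

/* The handler only records the request; real work happens outside it. */
void
sketch_die(int signo)
{
    (void) signo;
    term_pending = 1;
}

/* Install the handler with ISO C signal(); returns 0 on success, -1 on error. */
int
install_sketch_handler(void)
{
    return (signal(SIGTERM, sketch_die) == SIG_ERR) ? -1 : 0;
}

/* Called at a safe point: consume the pending request and pick a path. */
TermAction
check_term_request(void)
{
    if (!term_pending)
        return TERM_NONE;
    term_pending = 0;
    return postmaster_shutting_down ? TERM_EXIT_NOW : TERM_CANCEL_THEN_EXIT;
}

/* Human-readable label for logging or debugging. */
const char *
term_action_name(TermAction action)
{
    switch (action)
    {
        case TERM_NONE:             return "none";
        case TERM_EXIT_NOW:         return "exit now";
        case TERM_CANCEL_THEN_EXIT: return "cancel, then exit";
    }
    assert(!"unknown TermAction");
    return "unknown";
}
```

In this sketch a caller would install the handler once, then call check_term_request() wherever interrupts are already serviced. TERM_CANCEL_THEN_EXIT would correspond to taking the elog(ERROR) path first and exiting only after the post-longjmp recovery code has reset everything, while TERM_EXIT_NOW keeps the server-wide shutdown as quick as it is today. Whether the real flag could be read safely at those points is exactly the open question.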