On Oct5, 2011, at 15:30 , Magnus Hagander wrote: > When walsender calls out to do_pg_stop_backup() (during base backups), > it is not possible to terminate the process with a SIGTERM - it > requires a SIGKILL. This can leave unkillable backends for example if > archive_mode is on and archive_command is failing (or not set). A > similar thing would happen in other cases if walsender calls out to > something that would block (do_pg_start_backup() for example), but the > stop one is easy to provoke.
Hm, this seems to be related to another buglet I noticed a while ago, but then forgot about again. If one terminates pg_basebackup while it's waiting for all required WAL to be archived, the backend process only exits once that waiting phase is over. If, like in your failure case, archive_command fails indefinity (or isn't set), the backend process stays around forever. Your patch would improve that only insofar as it'd at least allow an immediate shutdown request to succeed - as it stands, that doesn't work because, as you mentioned, the blocked walsender doesn't handle SIGTERM. The question is, should we do more? To me, it'd make sense to terminate a backend once it's connection is gone. We could, for example, make pq_flush() set a global flag, and make CHECK_FOR_INTERRUPTS handle a broken connection that same way as a SIGINT or SIGTERM. Thoughts? > ISTM one way to fix it is the attached, which is to have walsender set > the "global" flags saying that we have received sigterm, which in turn > causes the CHECK_FOR_INTERRUPTS() calls in the routines to properly > exit the process. AFAICT it works fine. Any holes in this approach? Seems sensible, but my knowledge about these code paths is quite limited.. > Second, I wonder if we should add a SIGINT handler as well, that would > make it possible to send a cancel signal. Given that the end result > would be the same (at least if we want to keep with the "walsender is > simple" path), I'm not sure it's necessary - but it would at least > help those doing pg_cancel_backend()... thoughts? If all that's needed is a simple SIGINT handler that sets one flag, it'd say let's add it. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers