On Thu, Aug 3, 2017 at 1:41 PM Robert Haas <robertmh...@gmail.com> wrote: > On Wed, Aug 2, 2017 at 9:33 PM, Peter Eisentraut > <peter.eisentr...@2ndquadrant.com> wrote: > > On 8/2/17 16:52, Robert Haas wrote: > >> I actually don't think it's that unreasonable to get notified when > >> system-wide processes like the autovacuum launcher or the logical > >> replication launcher start or stop. > > > > But we got rid of those start messages recently due to complaints. > > Yeah, true. I'm just talking about what *I* think. :-)
I still think the current situation is non-ideal. I don't have a strong view on whether some or all system-wide processes should say hello and goodbye explicitly in the log, but I do think we need a way to make that not look like an error condition, and ideally without special cases for known built-in processes. I looked into this a bit today, while debugging an extension that runs a background worker. Here are some assorted approaches to shutdown: 0. The default SIGTERM handler for bgworkers is bgworker_die(), which generates a FATAL ereport "terminating background worker \"%s\" due to administrator command", directly in the signal handler (hmm, see also recent discussions about the legality of similar code in quickdie()). 1. Some processes install their own custom SIGTERM handler that sets a flag, and use that to break out of their main loop and exit quietly. Example: autovacuum.c, or the open-source pg_cron extension, and probably many other things that just want a nice quiet shutdown. 2. Some processes install the standard SIGTERM handler die(), and then use CHECK_FOR_INTERRUPTS() to break out of their main loop. By default this looks like "FATAL: terminating connection due to administrator command". This approach is effectively required if the main loop will reach other code that has a CHECK_FOR_INTERRUPTS() wait loop. For example, a "launcher"-type process that calls WaitForBackgroundWorkerStartup() could hang forever if it used approach 1 (ask me how I know). 3. Some system processes generally use approach 2, but have a special case in ProcessInterrupts() to suppress or alter the usual FATAL message or behaviour. This includes the logical replication launcher. A couple of thoughts: * Extensions that need to use die() (or something else that would be compatible with CFI() wait loops) should ideally have a way to control how ProcessInterrupts() reports this to the user, since shutdown is an expected condition for long-lived processes. * We really should get rid of that "exited with exit code 1". -- Thomas Munro http://www.enterprisedb.com