Robert Haas <robertmh...@gmail.com> writes:
I'd support back-porting that commit to 9.1 and 9.2 as a fix for this
problem.  As the commit message says, it's dead simple.

From: "Tom Lane" <t...@sss.pgh.pa.us>
While I have no great objection to back-porting Heikki's patch, it seems
like a very large stretch to call this a root-cause fix.  At best it's
band-aiding one symptom in a rather fragile way.

Thank you, Robert san. I'll be waiting for it to be back-ported to the next 9.1/9.2 release.

Yes, I think this failure is only one potential symptom of an implementation mistake: the SIGUSR1 handler performs both latch wakeup and other tasks that themselves wait on a latch. Although there may be no such tasks now, I'd like to correct and clean up the implementation as follows to avoid similar problems in the future. I think it's enough to do this only for 9.5. Please correct me before I go deeper in the wrong direction.

* The SIGUSR1 handler does latch wakeup only. Any other task is done in another signal handler, such as the one for SIGUSR2. Many postgres daemon processes already follow this style, but normal backends, autovacuum daemons, and background workers currently do not.

* InitializeLatchSupport() in unix_latch.c calls pqsignal(SIGUSR1, latch_sigusr1_handler). Change the argument of latch_sigusr1_handler() accordingly.

* Remove SIGUSR1 handler registration and the process-specific SIGUSR1 handler functions from all processes. This eliminates many SIGUSR1 handler functions that have identical contents.

Regards
MauMau

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
