On 2020/11/04 19:27, Heikki Linnakangas wrote:
On 04/11/2020 09:44, Fujii Masao wrote:
Get rid of the dedicated latch for signaling the startup process.
This commit gets rid of the dedicated latch for signaling the startup
process in favor of using its procLatch, since that comports better
with possible generic signal handlers using that latch.
Commit 1e53fe0e70 changed background processes so that they use standard
SIGHUP handler. Like that, this commit also makes the startup process use
standard SIGHUP handler to simplify the code.
This seems to have made buildfarm member 'elver' to segfault. I've got a hunch
that setting recoveryWakeupLatch to NULL, when WakeupRecovery() doesn't check
for NULL, is not OK. It's surprising that we're only seeing this on 'elver',
though.
Thanks for the report! The latch is reset to NULL after ShutdownWalRcv(). So I
thought that there are no processes setting the latch (i.e., walreceiver has
already exited) after it's reset to NULL. But this is not true. There seem a
window between ShutdownWalRcv() and the actual exit of walreceiver. If a signal
is sent during that window, the segmentation fault would happen. This would be
the reason why segv happened in some platforms, but not in others.
I'm thinking to remove the following code to fix this issue. Thought?
/*
* We don't need the latch anymore. It's not strictly necessary to reset
* it to NULL, but let's do it for the sake of tidiness.
*/
if (ArchiveRecoveryRequested)
XLogCtl->recoveryWakeupLatch = NULL;
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION