Hi all

While adding support for logical decoding on standby I noticed that
the walsender doesn't respect
SIGUSR1 with PROCSIG_RECOVERY_CONFLICT_DATABASE set.

It blindly assumes that it means there's new WAL:

WalSndSignals(void)
{
...
        pqsignal(SIGUSR1, WalSndXLogSendHandler);       /* request WAL
sending */
...
}

since all WalSndXLogSendHandler does is set the latch.

Handling recovery conflicts in the walsender is neccessary for logical
decoding on standby, so that we can replay drop database.

All the recovery conflict machinery is currently contained in
postgres.c and not used by, or accessible to, the walsender. It
actually works to just set procsignal_sigusr1_handler as walsender's
SIGUSR1 handler, but I'm not convinced it's safe:

Most of the procsignals don't make sense for walsender and could
possibly attempts things that use state the walsender doesn't have set
up. The comments on procsignal say that callers should tolerate
getting the wrong signal due to possible races:

 * Also, because of race conditions, it's important that all the signals be
 * defined so that no harm is done if a process mistakenly receives one.

(procsignal.h)

I'm wondering about adding a new state flag IsWalSender and have
RecoveryConflictInterrupt() ignore most conflict reasons if
IsWalSender is true. Though it strikes me that during logical decoding
on standby, the walsender could quite possibly conflict with other
things too, so it'd be better to make it safe to handle all the
conflict cases within the walsender.

Anyway, this PoC passes regression tests and allows drop database on a
standby to succeed when a slot is in-use. Not for commit as-is.


-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 7b19d2375657bf9b360e4c8feeeaf5f24b7f615a Mon Sep 17 00:00:00 2001
From: Craig Ringer <cr...@2ndquadrant.com>
Date: Fri, 18 Nov 2016 10:24:55 +0800
Subject: [PATCH 1/2] Allow walsender to exit on conflict with recovery

Now that logical decoding on standby is supported, the walsender needs to be
able to exit in response to conflict with recovery so that it can terminate to
allow replay of a DROP DATABASE to proceed.

WIP. The comments on RecoveryConflictInterrupt() still say it's only called
by normal user backends. There's no safeguard to stop walsender from invoking
other recovery conflict clauses that may be unsafe for it to call.
---
 src/backend/replication/walsender.c | 14 +-------------
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 327dbb2..65b38a2 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -187,7 +187,6 @@ static XLogRecPtr logical_startptr = InvalidXLogRecPtr;
 
 /* Signal handlers */
 static void WalSndSigHupHandler(SIGNAL_ARGS);
-static void WalSndXLogSendHandler(SIGNAL_ARGS);
 static void WalSndLastCycleHandler(SIGNAL_ARGS);
 
 /* Prototypes for private functions */
@@ -2666,17 +2665,6 @@ WalSndSigHupHandler(SIGNAL_ARGS)
 	errno = save_errno;
 }
 
-/* SIGUSR1: set flag to send WAL records */
-static void
-WalSndXLogSendHandler(SIGNAL_ARGS)
-{
-	int			save_errno = errno;
-
-	latch_sigusr1_handler();
-
-	errno = save_errno;
-}
-
 /* SIGUSR2: set flag to do a last cycle and shut down afterwards */
 static void
 WalSndLastCycleHandler(SIGNAL_ARGS)
@@ -2710,7 +2698,7 @@ WalSndSignals(void)
 	pqsignal(SIGQUIT, quickdie);	/* hard crash time */
 	InitializeTimeouts();		/* establishes SIGALRM handler */
 	pqsignal(SIGPIPE, SIG_IGN);
-	pqsignal(SIGUSR1, WalSndXLogSendHandler);	/* request WAL sending */
+	pqsignal(SIGUSR1, procsignal_sigusr1_handler);
 	pqsignal(SIGUSR2, WalSndLastCycleHandler);	/* request a last cycle and
 												 * shutdown */
 
-- 
2.5.5

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to