On Tue, Apr 11, 2023 at 11:42 AM Michael Paquier <mich...@paquier.xyz> wrote:
>
> On Mon, Oct 24, 2022 at 08:15:11AM +0530, Bharath Rupireddy wrote:
> > The attached patch (pg_recvlogical_graceful_interrupt.text) has a
> > couple of problems, I believe. We're losing prepareToTerminate() with
> > keepalive true and we're not skipping pg_log_error("unexpected
> > termination of replication stream: %s" upon interrupt, after all we're
> > here discussing how to avoid it.
> >
> > I came up with the attached v2 patch, please have a look.
>
> This thread has slipped through the feature freeze deadline.  Would
> people be OK to do something now on HEAD?  A backpatch is also in
> order, IMO, as the current behavior looks confusing under SIGINT and
> SIGTERM.

IMO, +1 for HEAD/PG16 and +0.5 for backpatching, as it may not be
critical enough to backpatch all the way down. Without this patch, the
output file may not be fsync-ed upon SIGINT/SIGTERM. Is that a
critical issue on production servers?

On Fri, Apr 7, 2023 at 5:12 AM Cary Huang <cary.hu...@highgo.ca> wrote:
>
> The following review has been posted through the commitfest application:
>
> The patch applies and tests fine. I like the way to have both ready_to_exit 
> and time_to_abort variables to control the exit sequence. I think the (void) 
> cast can be removed in front of PQputCopyEnd(), PQflush for consistency 
> purposes as it does not give warnings and everywhere else does not have those 
> casts.

Thanks for reviewing. I removed the (void) casts to match the rest of
the code. However, I left such casts in prepareToTerminate() as they
are, to avoid creating an unrelated diff.

I'm attaching the v4 patch for further review.
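
For anyone skimming the thread, here is a rough standalone sketch of the
two-flag idea, separate from the patch. It is only an illustration under
stated assumptions: stream_loop(), its interrupt_at parameter,
install_exit_handlers() and should_stop_streaming() are hypothetical names
that do not exist in pg_recvlogical.c, and plain signal() stands in for
whatever the real code uses to install handlers. The point is that the
handler only touches the volatile sig_atomic_t flag, while the plain bool
is set from the main loop.

```c
#include <signal.h>
#include <stdbool.h>

/* Written only by the signal handler, hence volatile sig_atomic_t. */
static volatile sig_atomic_t ready_to_exit = false;

/* Written only by the main loop, so a plain bool is enough. */
static bool time_to_abort = false;

/* The handler does nothing but record that a signal arrived. */
static void
sigexit_handler(int signo)
{
    (void) signo;
    ready_to_exit = true;
}

/* Hypothetical helper: route SIGINT and SIGTERM to the handler. */
static void
install_exit_handlers(void)
{
    signal(SIGINT, sigexit_handler);
    signal(SIGTERM, sigexit_handler);
}

/*
 * Hypothetical stand-in for the streaming loop.  Runs at most
 * max_iterations times and returns the number of iterations completed.
 * To make the sketch self-contained, it raises SIGINT on itself during
 * iteration number interrupt_at (1-based); pass 0 to never interrupt.
 */
static int
stream_loop(int max_iterations, int interrupt_at)
{
    int done;

    for (done = 0; done < max_iterations; done++)
    {
        /* Checked at the top of every iteration, as in the patch. */
        if (ready_to_exit)
        {
            /* The patch also calls PQputCopyEnd() and PQflush() here. */
            time_to_abort = true;
            break;
        }

        /* Simulate Ctrl-C arriving in the middle of the work. */
        if (done + 1 == interrupt_at)
            raise(SIGINT);
    }

    return done;
}

/* Mirrors the combined check that the patch adds in main(). */
static bool
should_stop_streaming(void)
{
    return ready_to_exit || time_to_abort;
}
```

A caller would run install_exit_handlers() once, call stream_loop(), and
then consult should_stop_streaming() to decide between a clean exit and a
reconnect attempt.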

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 22195f136440fdadae0c6a0bf04c23fa16b4031c Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 26 Apr 2023 15:25:03 +0000
Subject: [PATCH v4] Fix pg_recvlogical error message upon SIGINT/SIGTERM

When pg_recvlogical gets SIGINT/SIGTERM, it emits an
"unexpected termination of replication stream" error, which is
meant for a really unexpected termination or a crash, but not for
SIGINT/SIGTERM. Upon SIGINT/SIGTERM, we want pg_recvlogical to
fsync the output file before exiting cleanly. This commit changes
pg_recvlogical to do that.

Reported-by: Andres Freund
Author: Bharath Rupireddy
Reviewed-by: Kyotaro Horiguchi, Andres Freund
Reviewed-by: Cary Huang
Discussion: https://www.postgresql.org/message-id/20221019213953.htdtzikf4f45ywil%40awork3.anarazel.de
---
 src/bin/pg_basebackup/pg_recvlogical.c | 29 ++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_recvlogical.c b/src/bin/pg_basebackup/pg_recvlogical.c
index f3c7937a1d..337076647b 100644
--- a/src/bin/pg_basebackup/pg_recvlogical.c
+++ b/src/bin/pg_basebackup/pg_recvlogical.c
@@ -54,7 +54,8 @@ static const char *plugin = "test_decoding";
 
 /* Global State */
 static int	outfd = -1;
-static volatile sig_atomic_t time_to_abort = false;
+static bool	time_to_abort = false;
+static volatile sig_atomic_t ready_to_exit = false;
 static volatile sig_atomic_t output_reopen = false;
 static bool output_isfile;
 static TimestampTz output_last_fsync = -1;
@@ -283,6 +284,23 @@ StreamLogicalLog(void)
 			copybuf = NULL;
 		}
 
+		/* When we get SIGINT/SIGTERM, we exit */
+		if (ready_to_exit)
+		{
+			/*
+			 * Try informing the server about our exit, but don't wait around
+			 * or retry on failure.
+			 */
+			PQputCopyEnd(conn, NULL);
+			PQflush(conn);
+			time_to_abort = true;
+
+			if (verbose)
+				pg_log_info("received interrupt signal, exiting");
+
+			break;
+		}
+
 		/*
 		 * Potentially send a status message to the primary.
 		 */
@@ -614,7 +632,10 @@ StreamLogicalLog(void)
 
 		res = PQgetResult(conn);
 	}
-	if (PQresultStatus(res) != PGRES_COMMAND_OK)
+
+	/* It is not an unexpected termination error when Ctrl-C'ed. */
+	if (!ready_to_exit &&
+		PQresultStatus(res) != PGRES_COMMAND_OK)
 	{
 		pg_log_error("unexpected termination of replication stream: %s",
 					 PQresultErrorMessage(res));
@@ -656,7 +677,7 @@ error:
 static void
 sigexit_handler(SIGNAL_ARGS)
 {
-	time_to_abort = true;
+	ready_to_exit = true;
 }
 
 /*
@@ -976,7 +997,7 @@ main(int argc, char **argv)
 	while (true)
 	{
 		StreamLogicalLog();
-		if (time_to_abort)
+		if (ready_to_exit || time_to_abort)
 		{
 			/*
 			 * We've been Ctrl-C'ed or reached an exit limit condition. That's
-- 
2.34.1
