On Tue, Apr 11, 2023 at 11:42 AM Michael Paquier <mich...@paquier.xyz> wrote:
>
> On Mon, Oct 24, 2022 at 08:15:11AM +0530, Bharath Rupireddy wrote:
> > The attached patch (pg_recvlogical_graceful_interrupt.text) has a
> > couple of problems, I believe. We're losing prepareToTerminate() with
> > keepalive true and we're not skipping pg_log_error("unexpected
> > termination of replication stream: %s" upon interrupt, after all we're
> > here discussing how to avoid it.
> >
> > I came up with the attached v2 patch, please have a look.
>
> This thread has slipped through the feature freeze deadline. Would
> people be OK to do something now on HEAD? A backpatch is also in
> order, IMO, as the current behavior looks confusing under SIGINT and
> SIGTERM.

IMO, +1 for HEAD/PG16 and +0.5 for backpatching, as it may not be
critical enough to backpatch all the way down. Without this patch, the
output file may not be fsync-ed upon SIGINT/SIGTERM. Is that a critical
issue on production servers?

On Fri, Apr 7, 2023 at 5:12 AM Cary Huang <cary.hu...@highgo.ca> wrote:
>
> The following review has been posted through the commitfest application:
>
> The patch applies and tests fine. I like the way to have both ready_to_exit
> and time_to_abort variables to control the exit sequence. I think the (void)
> cast can be removed in front of PQputCopyEnd(), PQflush for consistency
> purposes as it does not give warnings and everywhere else does not have those
> casts.

Thanks for reviewing. I removed the (void) casts to match the rest of
the code. However, I left such casts in prepareToTerminate() untouched
to avoid creating an unrelated diff.

I'm attaching the v4 patch for further review.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

From 22195f136440fdadae0c6a0bf04c23fa16b4031c Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Wed, 26 Apr 2023 15:25:03 +0000
Subject: [PATCH v4] Fix pg_recvlogical error message upon SIGINT/SIGTERM

When pg_recvlogical gets SIGINT/SIGTERM, it emits the "unexpected
termination of replication stream" error, which is meant for a
really unexpected termination or a crash, but not for
SIGINT/SIGTERM. Upon SIGINT/SIGTERM, we want pg_recvlogical to
fsync the output file before exiting cleanly. This commit changes
pg_recvlogical to do that.

Reported-by: Andres Freund
Author: Bharath Rupireddy
Reviewed-by: Kyotaro Horiguchi, Andres Freund
Reviewed-by: Cary Huang
Discussion: https://www.postgresql.org/message-id/20221019213953.htdtzikf4f45ywil%40awork3.anarazel.de
---
 src/bin/pg_basebackup/pg_recvlogical.c | 29 ++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_recvlogical.c b/src/bin/pg_basebackup/pg_recvlogical.c
index f3c7937a1d..337076647b 100644
--- a/src/bin/pg_basebackup/pg_recvlogical.c
+++ b/src/bin/pg_basebackup/pg_recvlogical.c
@@ -54,7 +54,8 @@ static const char *plugin = "test_decoding";
 
 /* Global State */
 static int	outfd = -1;
-static volatile sig_atomic_t time_to_abort = false;
+static bool time_to_abort = false;
+static volatile sig_atomic_t ready_to_exit = false;
 static volatile sig_atomic_t output_reopen = false;
 static bool output_isfile;
 static TimestampTz output_last_fsync = -1;
@@ -283,6 +284,23 @@ StreamLogicalLog(void)
 			copybuf = NULL;
 		}
 
+		/* When we get SIGINT/SIGTERM, we exit */
+		if (ready_to_exit)
+		{
+			/*
+			 * Try informing the server about our exit, but don't wait around
+			 * or retry on failure.
+			 */
+			PQputCopyEnd(conn, NULL);
+			PQflush(conn);
+			time_to_abort = true;
+
+			if (verbose)
+				pg_log_info("received interrupt signal, exiting");
+
+			break;
+		}
+
 		/*
 		 * Potentially send a status message to the primary.
 		 */
@@ -614,7 +632,10 @@ StreamLogicalLog(void)
 
 		res = PQgetResult(conn);
 	}
-	if (PQresultStatus(res) != PGRES_COMMAND_OK)
+
+	/* It is not an unexpected termination error when Ctrl-C'ed. */
+	if (!ready_to_exit &&
+		PQresultStatus(res) != PGRES_COMMAND_OK)
 	{
 		pg_log_error("unexpected termination of replication stream: %s",
 					 PQresultErrorMessage(res));
@@ -656,7 +677,7 @@ error:
 static void
 sigexit_handler(SIGNAL_ARGS)
 {
-	time_to_abort = true;
+	ready_to_exit = true;
 }
 
 /*
@@ -976,7 +997,7 @@ main(int argc, char **argv)
 	while (true)
 	{
 		StreamLogicalLog();
-		if (time_to_abort)
+		if (ready_to_exit || time_to_abort)
 		{
 			/*
 			 * We've been Ctrl-C'ed or reached an exit limit condition. That's
--
2.34.1
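
For readers skimming the thread, the two-flag idea in the patch can be illustrated outside of pg_recvlogical with a small standalone sketch. This is not the patched code: stream_records() and its parameters are hypothetical stand-ins for StreamLogicalLog(), and a plain write() replaces the replication stream. Only the flag names and the handler name are borrowed from the patch. The handler touches nothing but a sig_atomic_t flag, and the loop notices the flag, marks the exit as intentional, and still fsyncs the output before returning.

```c
#define _POSIX_C_SOURCE 200809L	/* for fsync() under strict ISO C modes */

#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Written only by the signal handler, as ready_to_exit is in the patch. */
static volatile sig_atomic_t ready_to_exit = false;

/* Written only by ordinary code, so it needs no signal-safe type. */
static bool time_to_abort = false;

static void
sigexit_handler(int signo)
{
	(void) signo;
	ready_to_exit = true;		/* nothing else is done in the handler */
}

/*
 * Hypothetical stand-in for the streaming loop: writes up to max_records
 * lines to fd, stopping early once a signal has been noted.  Returns the
 * number of records written, or -1 on a write or fsync failure.
 */
static int
stream_records(int fd, int max_records)
{
	int			written = 0;

	assert(fd >= 0);

	for (int i = 0; i < max_records; i++)
	{
		char		buf[32];
		int			len;

		if (ready_to_exit)
		{
			/* Interrupt requested: this is a clean exit, not an error. */
			time_to_abort = true;
			break;
		}

		len = snprintf(buf, sizeof(buf), "record %d\n", i);
		if (write(fd, buf, (size_t) len) != (ssize_t) len)
			return -1;
		written++;
	}

	/* Flush what we have to disk whether or not we were interrupted. */
	if (fsync(fd) != 0)
		return -1;

	return written;
}
```

A caller would install sigexit_handler for SIGINT and SIGTERM, then call stream_records() in a loop and treat `ready_to_exit || time_to_abort` as the condition for a normal shutdown, which is the shape of the check the patch adds in main().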