On Tue, Nov 8, 2022 at 4:35 PM Thomas Munro <thomas.mu...@gmail.com> wrote: > > On Sat, Oct 30, 2021 at 7:44 AM Robert Haas <robertmh...@gmail.com> wrote: > > Committed. > > Is it expected that an otherwise idle standby's recovery process > receives SIGALRM every N seconds, or should the timer be canceled at > that point, as there is no further progress to report?
Nice catch. Yeah, that seems unnecessary, see the below standby logs. I think we need to disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);, something like the attached? I think there'll be no issue with the patch since the StandbyMode gets reset only at the end of recovery (in FinishWalRecovery()) but it can very well be set during recovery (in ReadRecord()). Note that I also added an assertion in has_startup_progress_timeout_expired(), just in case. 2022-11-08 11:28:23.563 UTC [980909] LOG: SIGALRM handle_sig_alarm received 2022-11-08 11:28:23.563 UTC [980909] LOG: startup_progress_timeout_handler called 2022-11-08 11:28:33.563 UTC [980909] LOG: SIGALRM handle_sig_alarm received 2022-11-08 11:28:33.563 UTC [980909] LOG: startup_progress_timeout_handler called 2022-11-08 11:28:43.563 UTC [980909] LOG: SIGALRM handle_sig_alarm received 2022-11-08 11:28:43.563 UTC [980909] LOG: startup_progress_timeout_handler called 2022-11-08 11:28:53.563 UTC [980909] LOG: SIGALRM handle_sig_alarm received 2022-11-08 11:28:53.563 UTC [980909] LOG: startup_progress_timeout_handler called Whilte at it, I noticed that we report redo progress for PITR, but we don't report when standby enters archive recovery mode, say due to a failure in the connection to primary or after the promote signal is found. Isn't it useful to report in this case as well to know the recovery progress? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
From 40ac498a76f38903bbee37108812907a76bb1a78 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Tue, 8 Nov 2022 12:18:11 +0000 Subject: [PATCH v1] Disable STARTUP_PROGRESS_TIMEOUT in standby mode --- src/backend/access/transam/xlogrecovery.c | 13 ++++++++++++- src/backend/postmaster/startup.c | 2 ++ 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index cb07694aea..5326a98633 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -63,6 +63,7 @@ #include "utils/pg_lsn.h" #include "utils/ps_status.h" #include "utils/pg_rusage.h" +#include "utils/timeout.h" /* Unsupported old recovery command file names (relative to $PGDATA) */ #define RECOVERY_COMMAND_FILE "recovery.conf" @@ -1653,7 +1654,17 @@ PerformWalRecovery(void) */ do { - if (!StandbyMode) + /* + * To avoid server log bloat, we don't report recovery progress in + * a standby as it will always be in recovery unless promoted. We + * also disable the timeout as we don't need it anymore. + */ + if (StandbyMode) + { + if (get_timeout_active(STARTUP_PROGRESS_TIMEOUT)) + disable_timeout(STARTUP_PROGRESS_TIMEOUT, false); + } + else ereport_startup_progress("redo in progress, elapsed time: %ld.%02d s, current LSN: %X/%X", LSN_FORMAT_ARGS(xlogreader->ReadRecPtr)); diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c index f99186eab7..1456e3ad3a 100644 --- a/src/backend/postmaster/startup.c +++ b/src/backend/postmaster/startup.c @@ -347,6 +347,8 @@ has_startup_progress_timeout_expired(long *secs, int *usecs) int useconds; TimestampTz now; + Assert(get_timeout_active(STARTUP_PROGRESS_TIMEOUT) == true); + /* No timeout has occurred. */ if (!startup_progress_timer_expired) return false; -- 2.34.1