Re: when the startup process doesn't (logging startup delays)

2023-02-10 Thread Bharath Rupireddy
On Wed, Feb 8, 2023 at 11:13 PM Tom Lane  wrote:
>
> Bharath Rupireddy  writes:
> > Thanks a lot Robert for taking care of this. The patch is committed on
> > HEAD and reverted on v15. Now that the minor version branches are
> > stamped, is it time for us to get this to v15? I can then close the CF
> > entry - https://commitfest.postgresql.org/42/4012/.
>
> No objection to un-reverting from here.

Thanks Robert, Tom. It is now un-reverted on v15. I've closed the CF entry.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: when the startup process doesn't (logging startup delays)

2023-02-08 Thread Tom Lane
Bharath Rupireddy  writes:
> Thanks a lot Robert for taking care of this. The patch is committed on
> HEAD and reverted on v15. Now that the minor version branches are
> stamped, is it time for us to get this to v15? I can then close the CF
> entry - https://commitfest.postgresql.org/42/4012/.

No objection to un-reverting from here.

regards, tom lane




Re: when the startup process doesn't (logging startup delays)

2023-02-08 Thread Bharath Rupireddy
On Mon, Feb 6, 2023 at 9:39 PM Robert Haas  wrote:
>
> On Mon, Feb 6, 2023 at 11:07 AM Tom Lane  wrote:
> > Umm ... is this really the sort of patch to be committing on a
> > release wrap day?
>
> Oh, shoot, I wasn't thinking about that. Would you like me to revert
> it in v15 for now?

Thanks a lot Robert for taking care of this. The patch is committed on
HEAD and reverted on v15. Now that the minor version branches are
stamped, is it time for us to get this to v15? I can then close the CF
entry - https://commitfest.postgresql.org/42/4012/.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: when the startup process doesn't (logging startup delays)

2023-02-06 Thread Robert Haas
On Mon, Feb 6, 2023 at 11:15 AM Tom Lane  wrote:
> Robert Haas  writes:
> > On Mon, Feb 6, 2023 at 11:07 AM Tom Lane  wrote:
> >> Umm ... is this really the sort of patch to be committing on a
> >> release wrap day?
>
> > Oh, shoot, I wasn't thinking about that. Would you like me to revert
> > it in v15 for now?
>
> Yeah, seems like the safest course.  I wouldn't object to it going in
> after the release is over, but right now there's zero margin for error.

Done.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2023-02-06 Thread Tom Lane
Robert Haas  writes:
> On Mon, Feb 6, 2023 at 11:07 AM Tom Lane  wrote:
>> Umm ... is this really the sort of patch to be committing on a
>> release wrap day?

> Oh, shoot, I wasn't thinking about that. Would you like me to revert
> it in v15 for now?

Yeah, seems like the safest course.  I wouldn't object to it going in
after the release is over, but right now there's zero margin for error.

regards, tom lane




Re: when the startup process doesn't (logging startup delays)

2023-02-06 Thread Robert Haas
On Mon, Feb 6, 2023 at 11:07 AM Tom Lane  wrote:
> Umm ... is this really the sort of patch to be committing on a
> release wrap day?

Oh, shoot, I wasn't thinking about that. Would you like me to revert
it in v15 for now?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2023-02-06 Thread Tom Lane
Robert Haas  writes:
> On Thu, Feb 2, 2023 at 11:22 PM Bharath Rupireddy
>  wrote:
>> I took the v4 patch, added some comments and attached it as the v7
>> patch here. Please find it.

> Committed and back-patched to v15.

Umm ... is this really the sort of patch to be committing on a
release wrap day?

regards, tom lane




Re: when the startup process doesn't (logging startup delays)

2023-02-06 Thread Robert Haas
On Thu, Feb 2, 2023 at 11:22 PM Bharath Rupireddy
 wrote:
> I took the v4 patch, added some comments and attached it as the v7
> patch here. Please find it.

Committed and back-patched to v15.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2023-02-02 Thread Bharath Rupireddy
On Fri, Feb 3, 2023 at 2:29 AM Robert Haas  wrote:
>

Thanks for looking at this.

> On Tue, Nov 22, 2022 at 6:05 AM Bharath Rupireddy
>  wrote:
> > If we place just the Assert(!StandbyMode); in
> > enable_startup_progress_timeout(), it fails for
> > begin_startup_progress_phase() in ResetUnloggedRelations() because the
> > InitWalRecovery(), that sets StandbyMode to true, is called before
> > ResetUnloggedRelations() . However, with the if (StandbyMode) {
> > return; }, we fail to report progress of ResetUnloggedRelations() in a
> > standby, which isn't a good idea at all because we only want to
> > disable the timeout during the recovery's main loop.
>
> Ugh. Well, in that case, I guess my vote is to forget about this whole
> Assert business and just commit what you had in v4. Does that work for
> you?

Yes, it seems reasonable to me.

> Protecting against specifically the situation where we're in the
> standby's main redo apply loop is not really what I had in mind here,
> but this is already sort of weirdly complicated-looking, and making it
> more weirdly complicated-looking to achieve the kind of protection
> that I had in mind doesn't really seem worth it.

IMHO, the responsibility of whether or not to report progress of any
operation can lie with the developers writing features using the
progress reporting mechanism. The progress reporting mechanism can
just be independent of all that.

I took the v4 patch, added some comments and attached it as the v7
patch here. Please find it.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 0ff4bc13db8abf70f8da158d11eecdfbe7a7e199 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Fri, 3 Feb 2023 03:30:49 +
Subject: [PATCH v7] Disable STARTUP_PROGRESS_TIMEOUT in standby mode

In standby mode, we actually don't report progress of recovery to
not bloat server logs as the standby is always in recovery unless
promoted. However, startup_progress_timeout_handler() gets called
every log_startup_progress_interval seconds, which is unnecessary.

Therefore, we disable the timeout in standby mode.

Reported-by: Thomas Munro
Author: Bharath Rupireddy
Reviewed-by: Robert Haas, Simon Riggs
Reviewed-by: Thomas Munro
Backpatch-through: 15
Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKCHSffAj8zZJKJvNX7ygnQFxVD6wm1d-2j3fVw%2BMafPQ%40mail.gmail.com
---
 src/backend/access/transam/xlogrecovery.c | 25 ---
 src/backend/postmaster/startup.c  | 30 ---
 src/include/postmaster/startup.h  |  2 ++
 3 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 2a5352f879..dbe9394762 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -385,6 +385,7 @@ static bool recoveryStopAfter;
 /* prototypes for local functions */
 static void ApplyWalRecord(XLogReaderState *xlogreader, XLogRecord *record, TimeLineID *replayTLI);
 
+static void EnableStandbyMode(void);
 static void readRecoverySignalFile(void);
 static void validateRecoveryParameters(void);
 static bool read_backup_label(XLogRecPtr *checkPointLoc,
@@ -469,6 +470,24 @@ XLogRecoveryShmemInit(void)
 	ConditionVariableInit(>recoveryNotPausedCV);
 }
 
+/*
+ * A thin wrapper to enable StandbyMode and do other preparatory work as
+ * needed.
+ */
+static void
+EnableStandbyMode(void)
+{
+	StandbyMode = true;
+
+	/*
+	 * To avoid server log bloat, we don't report recovery progress in a
+	 * standby as it will always be in recovery unless promoted. We disable
+	 * startup progress timeout in standby mode to avoid calling
+	 * startup_progress_timeout_handler() unnecessarily.
+	 */
+	disable_startup_progress_timeout();
+}
+
 /*
  * Prepare the system for WAL recovery, if needed.
  *
@@ -602,7 +621,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		 */
 		InArchiveRecovery = true;
 		if (StandbyModeRequested)
-			StandbyMode = true;
+			EnableStandbyMode();
 
 		/*
 		 * When a backup_label file is present, we want to roll forward from
@@ -739,7 +758,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		{
 			InArchiveRecovery = true;
 			if (StandbyModeRequested)
-StandbyMode = true;
+EnableStandbyMode();
 		}
 
 		/* Get the last valid checkpoint record. */
@@ -3117,7 +3136,7 @@ ReadRecord(XLogPrefetcher *xlogprefetcher, int emode,
 		(errmsg_internal("reached end of WAL in pg_wal, entering archive recovery")));
 InArchiveRecovery = true;
 if (StandbyModeRequested)
-	StandbyMode = true;
+	EnableStandbyMode();
 
 SwitchIntoArchiveRecovery(xlogreader->EndRecPtr, replayTLI);
 minRecoveryPoint = xlogreader->EndRecPtr;
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index bcd23542f1..efc2580536 100644
--- 

Re: when the startup process doesn't (logging startup delays)

2023-02-02 Thread Robert Haas
On Tue, Nov 22, 2022 at 6:05 AM Bharath Rupireddy
 wrote:
> If we place just the Assert(!StandbyMode); in
> enable_startup_progress_timeout(), it fails for
> begin_startup_progress_phase() in ResetUnloggedRelations() because the
> InitWalRecovery(), that sets StandbyMode to true, is called before
> ResetUnloggedRelations() . However, with the if (StandbyMode) {
> return; }, we fail to report progress of ResetUnloggedRelations() in a
> standby, which isn't a good idea at all because we only want to
> disable the timeout during the recovery's main loop.

Ugh. Well, in that case, I guess my vote is to forget about this whole
Assert business and just commit what you had in v4. Does that work for
you?

Protecting against specifically the situation where we're in the
standby's main redo apply loop is not really what I had in mind here,
but this is already sort of weirdly complicated-looking, and making it
more weirdly complicated-looking to achieve the kind of protection
that I had in mind doesn't really seem worth it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2022-11-22 Thread Bharath Rupireddy
On Mon, Nov 21, 2022 at 10:37 PM Robert Haas  wrote:
>
> On Sun, Nov 20, 2022 at 9:20 PM Kyotaro Horiguchi
>  wrote:
> > I prefer Robert's approach as it is more robust for future changes and
> > simple. I prefer to avoid this kind of piggy-backing and it doesn't
> > seem to be needed in this case. XLogShutdownWalRcv() looks like a
> > similar case to me and honestly I don't like it in the sense of
> > robustness but it is simpler than checking walreceiver status at every
> > site that refers to the flag.
>
> I don't understand what you want changed. Can you be more specific
> about what you mean by "Robert's approach"?
>
> I don't agree with Bharath's logic for preferring an if-test to an
> Assert. There are some cases where we think we've written the code
> correctly but also recognize that the logic is complicated enough that
> something might slip through the cracks. Then, a runtime check makes
> sense, because otherwise something real bad might happen on a
> production instance. But here, I don't think that's the main risk. I
> think the main risk is that some future patch tries to add code that
> should print startup log messages later on. That would probably be a
> coding mistake, and Assert would alert the patch author about that,
> whereas amending the if-test would just make the code do something
> differently then the author intended.
>
> But I don't feel super-strongly about this, which is why I mentioned
> both options in my previous email. I'm not clear on whether you are
> expressing an opinion on this point in particular or something more
> general.

If we place just the Assert(!StandbyMode); in
enable_startup_progress_timeout(), it fails for
begin_startup_progress_phase() in ResetUnloggedRelations() because the
InitWalRecovery(), that sets StandbyMode to true, is called before
ResetUnloggedRelations() . However, with the if (StandbyMode) {
return; }, we fail to report progress of ResetUnloggedRelations() in a
standby, which isn't a good idea at all because we only want to
disable the timeout during the recovery's main loop.

I introduced an assert-only function returning true when we're in
standby's main redo apply loop and modified the assertion to be
Assert(!InStandbyMainRedoApplyLoop()); works better but it complicates
the code a bit. FWIW, I'm attaching the v6 patch with this change.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v6-0001-Disable-STARTUP_PROGRESS_TIMEOUT-in-standby-mode.patch
Description: Binary data


Re: when the startup process doesn't (logging startup delays)

2022-11-21 Thread Robert Haas
On Sun, Nov 20, 2022 at 9:20 PM Kyotaro Horiguchi
 wrote:
> I prefer Robert's approach as it is more robust for future changes and
> simple. I prefer to avoid this kind of piggy-backing and it doesn't
> seem to be needed in this case. XLogShutdownWalRcv() looks like a
> similar case to me and honestly I don't like it in the sense of
> robustness but it is simpler than checking walreceiver status at every
> site that refers to the flag.

I don't understand what you want changed. Can you be more specific
about what you mean by "Robert's approach"?

I don't agree with Bharath's logic for preferring an if-test to an
Assert. There are some cases where we think we've written the code
correctly but also recognize that the logic is complicated enough that
something might slip through the cracks. Then, a runtime check makes
sense, because otherwise something real bad might happen on a
production instance. But here, I don't think that's the main risk. I
think the main risk is that some future patch tries to add code that
should print startup log messages later on. That would probably be a
coding mistake, and Assert would alert the patch author about that,
whereas amending the if-test would just make the code do something
differently then the author intended.

But I don't feel super-strongly about this, which is why I mentioned
both options in my previous email. I'm not clear on whether you are
expressing an opinion on this point in particular or something more
general.

Thanks,

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2022-11-20 Thread Kyotaro Horiguchi
At Fri, 18 Nov 2022 15:55:00 +0530, Bharath Rupireddy 
 wrote in 
> On Fri, Nov 18, 2022 at 12:42 AM Robert Haas  wrote:
> >
> > On Thu, Nov 17, 2022 at 2:22 AM Bharath Rupireddy
> >  wrote:
> > > Duplication is a problem that I agree with and I have an idea here -
> > > how about introducing a new function, say EnableStandbyMode() that
> > > sets StandbyMode to true and disables the startup progress timeout,
> > > something like the attached?
> >
> > That works for me, more or less. But I think that
> > enable_startup_progress_timeout should be amended to either say if
> > (log_startup_progress_interval == 0 || StandbyMode) return; or else it
> > should at least Assert(!StandbyMode), so that we can't accidentally
> > re-enable the timer after we shut it off.
> 
> Hm, an assertion may not help in typical production servers running on
> non-assert builds. I've modified the if condition, please see the
> attached v5 patch.

I prefer Robert's approach as it is more robust for future changes and
simple. I prefer to avoid this kind of piggy-backing and it doesn't
seem to be needed in this case. XLogShutdownWalRcv() looks like a
similar case to me and honestly I don't like it in the sense of
robustness but it is simpler than checking walreceiver status at every
site that refers to the flag.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: when the startup process doesn't (logging startup delays)

2022-11-18 Thread Bharath Rupireddy
On Fri, Nov 18, 2022 at 12:42 AM Robert Haas  wrote:
>
> On Thu, Nov 17, 2022 at 2:22 AM Bharath Rupireddy
>  wrote:
> > Duplication is a problem that I agree with and I have an idea here -
> > how about introducing a new function, say EnableStandbyMode() that
> > sets StandbyMode to true and disables the startup progress timeout,
> > something like the attached?
>
> That works for me, more or less. But I think that
> enable_startup_progress_timeout should be amended to either say if
> (log_startup_progress_interval == 0 || StandbyMode) return; or else it
> should at least Assert(!StandbyMode), so that we can't accidentally
> re-enable the timer after we shut it off.

Hm, an assertion may not help in typical production servers running on
non-assert builds. I've modified the if condition, please see the
attached v5 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 758d4cc81b1c7ed087969df07a958e92a374ee0f Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Fri, 18 Nov 2022 09:53:15 +
Subject: [PATCH v5] Disable STARTUP_PROGRESS_TIMEOUT in standby mode

In standby mode, we actually don't report progress of recovery to
not bloat server logs as the standby is always in recovery unless
promoted. However, startup_progress_timeout_handler() gets called
every log_startup_progress_interval seconds, which is unnecessary.

Therefore, we disable the timeout in standby mode.

Reported-by: Thomas Munro
Authors: Bharath Rupireddy, Simon Riggs
Reviewed-by: Robert Haas
Backpatch-through: 15
Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKCHSffAj8zZJKJvNX7ygnQFxVD6wm1d-2j3fVw%2BMafPQ%40mail.gmail.com
---
 src/backend/access/transam/xlogrecovery.c | 21 ---
 src/backend/postmaster/startup.c  | 33 +++
 src/include/postmaster/startup.h  |  2 ++
 3 files changed, 48 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index cb07694aea..2feae1ebd3 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -386,6 +386,7 @@ static bool recoveryStopAfter;
 /* prototypes for local functions */
 static void ApplyWalRecord(XLogReaderState *xlogreader, XLogRecord *record, TimeLineID *replayTLI);
 
+static void EnableStandbyMode(void);
 static void readRecoverySignalFile(void);
 static void validateRecoveryParameters(void);
 static bool read_backup_label(XLogRecPtr *checkPointLoc,
@@ -470,6 +471,20 @@ XLogRecoveryShmemInit(void)
 	ConditionVariableInit(>recoveryNotPausedCV);
 }
 
+static void
+EnableStandbyMode(void)
+{
+	StandbyMode = true;
+
+	/*
+	 * To avoid server log bloat, we don't report recovery progress in a
+	 * standby as it will always be in recovery unless promoted. We disable
+	 * startup progress timeout in standby mode to avoid calling
+	 * startup_progress_timeout_handler() unnecessarily.
+	 */
+	disable_startup_progress_timeout();
+}
+
 /*
  * Prepare the system for WAL recovery, if needed.
  *
@@ -603,7 +618,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		 */
 		InArchiveRecovery = true;
 		if (StandbyModeRequested)
-			StandbyMode = true;
+			EnableStandbyMode();
 
 		/*
 		 * When a backup_label file is present, we want to roll forward from
@@ -740,7 +755,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		{
 			InArchiveRecovery = true;
 			if (StandbyModeRequested)
-StandbyMode = true;
+EnableStandbyMode();
 		}
 
 		/* Get the last valid checkpoint record. */
@@ -3115,7 +3130,7 @@ ReadRecord(XLogPrefetcher *xlogprefetcher, int emode,
 		(errmsg_internal("reached end of WAL in pg_wal, entering archive recovery")));
 InArchiveRecovery = true;
 if (StandbyModeRequested)
-	StandbyMode = true;
+	EnableStandbyMode();
 
 SwitchIntoArchiveRecovery(xlogreader->EndRecPtr, replayTLI);
 minRecoveryPoint = xlogreader->EndRecPtr;
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index f99186eab7..4a46d2070c 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -314,20 +314,29 @@ startup_progress_timeout_handler(void)
 	startup_progress_timer_expired = true;
 }
 
+void
+disable_startup_progress_timeout(void)
+{
+	/* Quick exit if feature is disabled. */
+	if (log_startup_progress_interval == 0)
+		return;
+
+	disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);
+	startup_progress_timer_expired = false;
+}
+
 /*
  * Set the start timestamp of the current operation and enable the timeout.
  */
 void
-begin_startup_progress_phase(void)
+enable_startup_progress_timeout(void)
 {
 	TimestampTz fin_time;
 
-	/* Feature is disabled. */
-	if (log_startup_progress_interval == 0)
+	/* Quick exit if feature is disabled or we're in standby mode */
+	if (log_startup_progress_interval == 0 || StandbyMode)
 		return;
 
-	

Re: when the startup process doesn't (logging startup delays)

2022-11-17 Thread Robert Haas
On Thu, Nov 17, 2022 at 2:22 AM Bharath Rupireddy
 wrote:
> Duplication is a problem that I agree with and I have an idea here -
> how about introducing a new function, say EnableStandbyMode() that
> sets StandbyMode to true and disables the startup progress timeout,
> something like the attached?

That works for me, more or less. But I think that
enable_startup_progress_timeout should be amended to either say if
(log_startup_progress_interval == 0 || StandbyMode) return; or else it
should at least Assert(!StandbyMode), so that we can't accidentally
re-enable the timer after we shut it off.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2022-11-16 Thread Bharath Rupireddy
On Thu, Nov 17, 2022 at 12:21 AM Robert Haas  wrote:
>
> On Wed, Nov 16, 2022 at 1:47 AM Bharath Rupireddy
>  wrote:
> > That can be done, only if we can disable the timeout in another place
> > when the StandbyMode is set to true in ReadRecord(), that is, after
> > the standby server finishes crash recovery and enters standby mode.
>
> Oh, interesting. I didn't realize that we would need to worry about that case.
>
> > I'm attaching the v3 patch for further review. Please find the CF
> > entry here - https://commitfest.postgresql.org/41/4012/.
>
> I kind of dislike having to have logic for this in two places. Seems
> like it could create future bugs.

Duplication is a problem that I agree with and I have an idea here -
how about introducing a new function, say EnableStandbyMode() that
sets StandbyMode to true and disables the startup progress timeout,
something like the attached?

> How about the attached approach, instead? This way, the first time the
> timer expires after we reach standby mode, we reactively disable it.

Hm. I'm not really sure if it's a good idea. While it simplifies the
code, the has_startup_progress_timeout_expired() gets called for every
WAL record in standby mode. Isn't this an unnecessary thing?
Currently, the if (!StandbyMode) condition blocks the function calls.
And I'm also a little concerned that we move the StandbyMode variable
to startup.c which so far tiled to xlogrecovery.c. Maybe these are not
really concerns at all. Maybe others are okay with this approach.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From b25838215534488c53855a5cd5009fba09bd358a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Thu, 17 Nov 2022 07:13:27 +
Subject: [PATCH v4] Disable STARTUP_PROGRESS_TIMEOUT in standby mode

In standby mode, we actually don't report progress of recovery to
not bloat server logs as the standby is always in recovery unless
promoted. However, startup_progress_timeout_handler() gets called
every log_startup_progress_interval seconds, which is unnecessary.

Therefore, we disable the timeout in standby mode.

Reported-by: Thomas Munro
Authors: Bharath Rupireddy, Simon Riggs
Reviewed-by: Robert Haas
Backpatch-through: 15
Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKCHSffAj8zZJKJvNX7ygnQFxVD6wm1d-2j3fVw%2BMafPQ%40mail.gmail.com
---
 src/backend/access/transam/xlogrecovery.c | 21 +---
 src/backend/postmaster/startup.c  | 29 ---
 src/include/postmaster/startup.h  |  2 ++
 3 files changed, 46 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index cb07694aea..2feae1ebd3 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -386,6 +386,7 @@ static bool recoveryStopAfter;
 /* prototypes for local functions */
 static void ApplyWalRecord(XLogReaderState *xlogreader, XLogRecord *record, TimeLineID *replayTLI);
 
+static void EnableStandbyMode(void);
 static void readRecoverySignalFile(void);
 static void validateRecoveryParameters(void);
 static bool read_backup_label(XLogRecPtr *checkPointLoc,
@@ -470,6 +471,20 @@ XLogRecoveryShmemInit(void)
 	ConditionVariableInit(>recoveryNotPausedCV);
 }
 
+static void
+EnableStandbyMode(void)
+{
+	StandbyMode = true;
+
+	/*
+	 * To avoid server log bloat, we don't report recovery progress in a
+	 * standby as it will always be in recovery unless promoted. We disable
+	 * startup progress timeout in standby mode to avoid calling
+	 * startup_progress_timeout_handler() unnecessarily.
+	 */
+	disable_startup_progress_timeout();
+}
+
 /*
  * Prepare the system for WAL recovery, if needed.
  *
@@ -603,7 +618,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		 */
 		InArchiveRecovery = true;
 		if (StandbyModeRequested)
-			StandbyMode = true;
+			EnableStandbyMode();
 
 		/*
 		 * When a backup_label file is present, we want to roll forward from
@@ -740,7 +755,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		{
 			InArchiveRecovery = true;
 			if (StandbyModeRequested)
-StandbyMode = true;
+EnableStandbyMode();
 		}
 
 		/* Get the last valid checkpoint record. */
@@ -3115,7 +3130,7 @@ ReadRecord(XLogPrefetcher *xlogprefetcher, int emode,
 		(errmsg_internal("reached end of WAL in pg_wal, entering archive recovery")));
 InArchiveRecovery = true;
 if (StandbyModeRequested)
-	StandbyMode = true;
+	EnableStandbyMode();
 
 SwitchIntoArchiveRecovery(xlogreader->EndRecPtr, replayTLI);
 minRecoveryPoint = xlogreader->EndRecPtr;
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index f99186eab7..2705e425e8 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -314,11 +314,22 @@ startup_progress_timeout_handler(void)
 	

Re: when the startup process doesn't (logging startup delays)

2022-11-16 Thread Thomas Munro
On Thu, Nov 17, 2022 at 7:51 AM Robert Haas  wrote:
+ * up, since standby mode is a state that is intendeded to persist

typo

Otherwise LGTM.




Re: when the startup process doesn't (logging startup delays)

2022-11-16 Thread Robert Haas
On Wed, Nov 16, 2022 at 1:47 AM Bharath Rupireddy
 wrote:
> That can be done, only if we can disable the timeout in another place
> when the StandbyMode is set to true in ReadRecord(), that is, after
> the standby server finishes crash recovery and enters standby mode.

Oh, interesting. I didn't realize that we would need to worry about that case.

> I'm attaching the v3 patch for further review. Please find the CF
> entry here - https://commitfest.postgresql.org/41/4012/.

I kind of dislike having to have logic for this in two places. Seems
like it could create future bugs.

How about the attached approach, instead? This way, the first time the
timer expires after we reach standby mode, we reactively disable it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


another-way.patch
Description: Binary data


Re: when the startup process doesn't (logging startup delays)

2022-11-16 Thread Bharath Rupireddy
On Wed, Nov 16, 2022 at 2:28 PM Simon Riggs
 wrote:
>
> On Wed, 16 Nov 2022 at 06:47, Bharath Rupireddy
>  wrote:
> >
> > On Tue, Nov 15, 2022 at 10:55 PM Robert Haas  wrote:
> > >
> > > On Tue, Nov 15, 2022 at 8:33 AM Bharath Rupireddy
> > >  wrote:
> > > > Please review the v2 patch.
> > >
> > > It seems to me that this will call disable_startup_progress_timeout
> > > once per WAL record, which seems like an unnecessary expense. How
> > > about leaving the code inside the loop just as we have it, and putting
> > > if (StandbyMode) disable_startup_progress_timeout() before entering
> > > the loop?
> >
> > That can be done, only if we can disable the timeout in another place
> > when the StandbyMode is set to true in ReadRecord(), that is, after
> > the standby server finishes crash recovery and enters standby mode.
> >
> > I'm attaching the v3 patch for further review. Please find the CF
> > entry here - https://commitfest.postgresql.org/41/4012/.
>
> begin_startup_progress_phase() checks to see if feature is disabled
> twice, so I think you can skip the check and just rely on the check in
> enable().

Yes, I did that intentionally to avoid begin_startup_progress_phase()
calling disable and enable functions when the feature is disabled.
I'll leave it to the committer whether to retain it or delete it.

> Otherwise, all good.

Thanks.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: when the startup process doesn't (logging startup delays)

2022-11-16 Thread Simon Riggs
On Wed, 16 Nov 2022 at 06:47, Bharath Rupireddy
 wrote:
>
> On Tue, Nov 15, 2022 at 10:55 PM Robert Haas  wrote:
> >
> > On Tue, Nov 15, 2022 at 8:33 AM Bharath Rupireddy
> >  wrote:
> > > Please review the v2 patch.
> >
> > It seems to me that this will call disable_startup_progress_timeout
> > once per WAL record, which seems like an unnecessary expense. How
> > about leaving the code inside the loop just as we have it, and putting
> > if (StandbyMode) disable_startup_progress_timeout() before entering
> > the loop?
>
> That can be done, only if we can disable the timeout in another place
> when the StandbyMode is set to true in ReadRecord(), that is, after
> the standby server finishes crash recovery and enters standby mode.
>
> I'm attaching the v3 patch for further review. Please find the CF
> entry here - https://commitfest.postgresql.org/41/4012/.

begin_startup_progress_phase() checks to see if feature is disabled
twice, so I think you can skip the check and just rely on the check in
enable().

Otherwise, all good.

-- 
Simon Riggshttp://www.EnterpriseDB.com/




Re: when the startup process doesn't (logging startup delays)

2022-11-15 Thread Bharath Rupireddy
On Tue, Nov 15, 2022 at 10:55 PM Robert Haas  wrote:
>
> On Tue, Nov 15, 2022 at 8:33 AM Bharath Rupireddy
>  wrote:
> > Please review the v2 patch.
>
> It seems to me that this will call disable_startup_progress_timeout
> once per WAL record, which seems like an unnecessary expense. How
> about leaving the code inside the loop just as we have it, and putting
> if (StandbyMode) disable_startup_progress_timeout() before entering
> the loop?

That can be done, only if we can disable the timeout in another place
when the StandbyMode is set to true in ReadRecord(), that is, after
the standby server finishes crash recovery and enters standby mode.

I'm attaching the v3 patch for further review. Please find the CF
entry here - https://commitfest.postgresql.org/41/4012/.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 612f17d5e81181b69c3d711524de840c88cb3b4a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Wed, 16 Nov 2022 06:20:15 +
Subject: [PATCH v3] Disable STARTUP_PROGRESS_TIMEOUT in standby mode

In standby mode, we actually don't report progress of recovery to
not bloat server logs as the standby is always in recovery unless
promoted. However, startup_progress_timeout_handler() gets called
every log_startup_progress_interval seconds, which is unnecessary.

Therefore, we disable the timeout in standby mode.

Reported-by: Thomas Munro
Authors: Bharath Rupireddy, Simon Riggs
Reviewed-by: Robert Haas
Backpatch-through: 15
Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKCHSffAj8zZJKJvNX7ygnQFxVD6wm1d-2j3fVw%2BMafPQ%40mail.gmail.com
---
 src/backend/access/transam/xlogrecovery.c | 30 +--
 src/backend/postmaster/startup.c  | 29 +++---
 src/include/postmaster/startup.h  |  2 ++
 3 files changed, 56 insertions(+), 5 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index cb07694aea..0d57ea17ca 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -1644,9 +1644,24 @@ PerformWalRecovery(void)
 (errmsg("redo starts at %X/%X",
 		LSN_FORMAT_ARGS(xlogreader->ReadRecPtr;
 
-		/* Prepare to report progress of the redo phase. */
-		if (!StandbyMode)
+		if (StandbyMode)
+		{
+			/*
+			 * To avoid server log bloat, we don't report recovery progress in
+			 * a standby as it will always be in recovery unless promoted. We
+			 * disable startup progress timeout in standby mode to avoid
+			 * calling startup_progress_timeout_handler() unnecessarily.
+			 */
+			disable_startup_progress_timeout();
+		}
+		else
+		{
+			/*
+			 * When not in standby mode, prepare to report progress of the redo
+			 * phase.
+			 */
 			begin_startup_progress_phase();
+		}
 
 		/*
 		 * main redo apply loop
@@ -3115,8 +3130,19 @@ ReadRecord(XLogPrefetcher *xlogprefetcher, int emode,
 		(errmsg_internal("reached end of WAL in pg_wal, entering archive recovery")));
 InArchiveRecovery = true;
 if (StandbyModeRequested)
+{
 	StandbyMode = true;
 
+	/*
+	 * To avoid server log bloat, we don't report
+	 * recovery progress in a standby as it will always be in
+	 * recovery unless promoted. We disable startup progress
+	 * timeout in standby mode to avoid calling
+	 * startup_progress_timeout_handler() unnecessarily.
+	 */
+	disable_startup_progress_timeout();
+}
+
 SwitchIntoArchiveRecovery(xlogreader->EndRecPtr, replayTLI);
 minRecoveryPoint = xlogreader->EndRecPtr;
 minRecoveryPointTLI = replayTLI;
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index f99186eab7..2705e425e8 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -314,11 +314,22 @@ startup_progress_timeout_handler(void)
 	startup_progress_timer_expired = true;
 }
 
+void
+disable_startup_progress_timeout(void)
+{
+	/* Feature is disabled. */
+	if (log_startup_progress_interval == 0)
+		return;
+
+	disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);
+	startup_progress_timer_expired = false;
+}
+
 /*
  * Set the start timestamp of the current operation and enable the timeout.
  */
 void
-begin_startup_progress_phase(void)
+enable_startup_progress_timeout(void)
 {
 	TimestampTz fin_time;
 
@@ -326,8 +337,6 @@ begin_startup_progress_phase(void)
 	if (log_startup_progress_interval == 0)
 		return;
 
-	disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);
-	startup_progress_timer_expired = false;
 	startup_progress_phase_start_time = GetCurrentTimestamp();
 	fin_time = TimestampTzPlusMilliseconds(startup_progress_phase_start_time,
 		   log_startup_progress_interval);
@@ -335,6 +344,20 @@ begin_startup_progress_phase(void)
 		 log_startup_progress_interval);
 }
 
+/*
+ * A thin wrapper to first disable and then enable the startup progress timeout.
+ */
+void

Re: when the startup process doesn't (logging startup delays)

2022-11-15 Thread Robert Haas
On Tue, Nov 15, 2022 at 8:33 AM Bharath Rupireddy
 wrote:
> Please review the v2 patch.

It seems to me that this will call disable_startup_progress_timeout
once per WAL record, which seems like an unnecessary expense. How
about leaving the code inside the loop just as we have it, and putting
if (StandbyMode) disable_startup_progress_timeout() before entering
the loop?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2022-11-15 Thread Simon Riggs
On Tue, 15 Nov 2022 at 13:33, Bharath Rupireddy
 wrote:
>
> On Mon, Nov 14, 2022 at 9:31 PM Robert Haas  wrote:
> >
> > On Mon, Nov 14, 2022 at 7:37 AM Simon Riggs
> >  wrote:
> > > > Whilte at it, I noticed that we report redo progress for PITR, but we
> > > > don't report when standby enters archive recovery mode, say due to a
> > > > failure in the connection to primary or after the promote signal is
> > > > found. Isn't it useful to report in this case as well to know the
> > > > recovery progress?
> > >
> > > I think your patch disables progress too early, effectively turning
> > > off the standby progress feature. The purpose was to report on things
> > > that take long periods during recovery, not just prior to recovery.
> > >
> > > I would advocate that we disable progress only while waiting, as I've 
> > > done here:
> > > https://www.postgresql.org/message-id/CANbhV-GcWjZ2cmj0uCbZDWQUHnneMi_4EfY3dVWq0-yD5o7Ccg%40mail.gmail.com
> >
> > Maybe I'm confused here, but I think that, on a standby, startup
> > progress messages are only printed until the main redo loop is
> > reached. Otherwise, we would print a message on a standby every 10s
> > forever, which seems like a thing that most users would not like. So I
> > think that Bharath has the right idea here.
>
> Yes, the idea is to disable the timeout on standby completely since we
> actually don't report any recovery progress. Keeping it enabled,
> unnecessarily calls startup_progress_timeout_handler() every
> log_startup_progress_interval seconds i.e. 10 seconds. That's the
> intention of the patch.

As long as we don't get the SIGALRMs that Thomas identified, then I'm happy.

-- 
Simon Riggshttp://www.EnterpriseDB.com/




Re: when the startup process doesn't (logging startup delays)

2022-11-15 Thread Bharath Rupireddy
On Mon, Nov 14, 2022 at 9:31 PM Robert Haas  wrote:
>
> On Mon, Nov 14, 2022 at 7:37 AM Simon Riggs
>  wrote:
> > > Whilte at it, I noticed that we report redo progress for PITR, but we
> > > don't report when standby enters archive recovery mode, say due to a
> > > failure in the connection to primary or after the promote signal is
> > > found. Isn't it useful to report in this case as well to know the
> > > recovery progress?
> >
> > I think your patch disables progress too early, effectively turning
> > off the standby progress feature. The purpose was to report on things
> > that take long periods during recovery, not just prior to recovery.
> >
> > I would advocate that we disable progress only while waiting, as I've done 
> > here:
> > https://www.postgresql.org/message-id/CANbhV-GcWjZ2cmj0uCbZDWQUHnneMi_4EfY3dVWq0-yD5o7Ccg%40mail.gmail.com
>
> Maybe I'm confused here, but I think that, on a standby, startup
> progress messages are only printed until the main redo loop is
> reached. Otherwise, we would print a message on a standby every 10s
> forever, which seems like a thing that most users would not like. So I
> think that Bharath has the right idea here.

Yes, the idea is to disable the timeout on standby completely since we
actually don't report any recovery progress. Keeping it enabled,
unnecessarily calls startup_progress_timeout_handler() every
log_startup_progress_interval seconds i.e. 10 seconds. That's the
intention of the patch.

> I don't think that his patch is right in detail, though. I don't think
> the call to disable_timeout() needs to be conditional,

Yes, disable_timeout() returns if the timeout was previously disabled
i.e. all_timeouts[STARTUP_PROGRESS_TIMEOUT].active is false. I changed
it in the v2 patch.

> and I don't
> think the Assert is correct.

You're right. My intention there was to check if the timeout is
enabled while ereport_startup_progress() is called. In the v2 patch,
when we actually disable the timeout startup_progress_timer_expired
gets set to false and has_startup_progress_timeout_expired() just
returns in such a case.

> Also, I think that your patch has the
> right idea in encapsulating the disable_timeout() call inside a new
> function disable_startup_progress_timeout(), rather than having the
> details known directly by xlogrecovery.c.

Yes, I too like Simon's idea of {enable,
disable}_startup_progress_timeout functions, I utilized them in the v2
patch here.

I actually want to get rid of begin_startup_progress_phase() which now
becomes a thin wrapper calling disable and enable functions and ensure
the callers do follow enable()-report_progress()-disable() way to use
the feature, however I didn't code for that as it needs changes across
many files. If okay, I can code for that too. Thoughts?

Please review the v2 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 0edba6df408bbde80bc2a955990021995f3ec07e Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Tue, 15 Nov 2022 13:01:33 +
Subject: [PATCH v2] Disable STARTUP_PROGRESS_TIMEOUT in standby mode

In standby mode, we actually don't report progress of recovery to
not bloat server logs as the standby is always in recovery unless
promoted. However, startup_progress_timeout_handler() gets called
every log_startup_progress_interval seconds, which is unnecessary.

Therefore, we disable the timeout in standby mode.

Reported-by: Thomas Munro
Authors: Bharath Rupireddy, Simon Riggs
Reviewed-by: Robert Haas
Backpatch-through: 15
Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKCHSffAj8zZJKJvNX7ygnQFxVD6wm1d-2j3fVw%2BMafPQ%40mail.gmail.com
---
 src/backend/access/transam/xlogrecovery.c |  9 ++-
 src/backend/postmaster/startup.c  | 29 ---
 src/include/postmaster/startup.h  |  2 ++
 3 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index cb07694aea..20c0dfd9d1 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -1653,7 +1653,14 @@ PerformWalRecovery(void)
 		 */
 		do
 		{
-			if (!StandbyMode)
+			/*
+			 * To avoid server log bloat, we don't report recovery progress in
+			 * a standby as it will always be in recovery unless promoted. We
+			 * also disable the timeout as we don't need it anymore.
+			 */
+			if (StandbyMode)
+disable_startup_progress_timeout();
+			else
 ereport_startup_progress("redo in progress, elapsed time: %ld.%02d s, current LSN: %X/%X",
 		 LSN_FORMAT_ARGS(xlogreader->ReadRecPtr));
 
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index f99186eab7..2705e425e8 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -314,11 +314,22 @@ startup_progress_timeout_handler(void)
 	startup_progress_timer_expired = true;
 }
 

Re: when the startup process doesn't (logging startup delays)

2022-11-14 Thread Robert Haas
On Mon, Nov 14, 2022 at 7:37 AM Simon Riggs
 wrote:
> > Whilte at it, I noticed that we report redo progress for PITR, but we
> > don't report when standby enters archive recovery mode, say due to a
> > failure in the connection to primary or after the promote signal is
> > found. Isn't it useful to report in this case as well to know the
> > recovery progress?
>
> I think your patch disables progress too early, effectively turning
> off the standby progress feature. The purpose was to report on things
> that take long periods during recovery, not just prior to recovery.
>
> I would advocate that we disable progress only while waiting, as I've done 
> here:
> https://www.postgresql.org/message-id/CANbhV-GcWjZ2cmj0uCbZDWQUHnneMi_4EfY3dVWq0-yD5o7Ccg%40mail.gmail.com

Maybe I'm confused here, but I think that, on a standby, startup
progress messages are only printed until the main redo loop is
reached. Otherwise, we would print a message on a standby every 10s
forever, which seems like a thing that most users would not like. So I
think that Bharath has the right idea here.

I don't think that his patch is right in detail, though. I don't think
the call to disable_timeout() needs to be conditional, and I don't
think the Assert is correct. Also, I think that your patch has the
right idea in encapsulating the disable_timeout() call inside a new
function disable_startup_progress_timeout(), rather than having the
details known directly by xlogrecovery.c.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2022-11-14 Thread Simon Riggs
On Tue, 8 Nov 2022 at 12:33, Bharath Rupireddy
 wrote:
>
> On Tue, Nov 8, 2022 at 4:35 PM Thomas Munro  wrote:
> >
> > On Sat, Oct 30, 2021 at 7:44 AM Robert Haas  wrote:
> > > Committed.
> >
> > Is it expected that an otherwise idle standby's recovery process
> > receives SIGALRM every N seconds, or should the timer be canceled at
> > that point, as there is no further progress to report?
>
> Nice catch. Yeah, that seems unnecessary, see the below standby logs.
> I think we need to disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);,
> something like the attached? I think there'll be no issue with the
> patch since the StandbyMode gets reset only at the end of recovery (in
> FinishWalRecovery()) but it can very well be set during recovery (in
> ReadRecord()). Note that I also added an assertion in
> has_startup_progress_timeout_expired(), just in case.
>
> 2022-11-08 11:28:23.563 UTC [980909] LOG:  SIGALRM handle_sig_alarm received
> 2022-11-08 11:28:23.563 UTC [980909] LOG:
> startup_progress_timeout_handler called
> 2022-11-08 11:28:33.563 UTC [980909] LOG:  SIGALRM handle_sig_alarm received
> 2022-11-08 11:28:33.563 UTC [980909] LOG:
> startup_progress_timeout_handler called
> 2022-11-08 11:28:43.563 UTC [980909] LOG:  SIGALRM handle_sig_alarm received
> 2022-11-08 11:28:43.563 UTC [980909] LOG:
> startup_progress_timeout_handler called
> 2022-11-08 11:28:53.563 UTC [980909] LOG:  SIGALRM handle_sig_alarm received
> 2022-11-08 11:28:53.563 UTC [980909] LOG:
> startup_progress_timeout_handler called
>
> Whilte at it, I noticed that we report redo progress for PITR, but we
> don't report when standby enters archive recovery mode, say due to a
> failure in the connection to primary or after the promote signal is
> found. Isn't it useful to report in this case as well to know the
> recovery progress?

I think your patch disables progress too early, effectively turning
off the standby progress feature. The purpose was to report on things
that take long periods during recovery, not just prior to recovery.

I would advocate that we disable progress only while waiting, as I've done here:
https://www.postgresql.org/message-id/CANbhV-GcWjZ2cmj0uCbZDWQUHnneMi_4EfY3dVWq0-yD5o7Ccg%40mail.gmail.com

--
Simon Riggshttp://www.EnterpriseDB.com/




Re: when the startup process doesn't (logging startup delays)

2022-11-08 Thread Bharath Rupireddy
On Tue, Nov 8, 2022 at 4:35 PM Thomas Munro  wrote:
>
> On Sat, Oct 30, 2021 at 7:44 AM Robert Haas  wrote:
> > Committed.
>
> Is it expected that an otherwise idle standby's recovery process
> receives SIGALRM every N seconds, or should the timer be canceled at
> that point, as there is no further progress to report?

Nice catch. Yeah, that seems unnecessary, see the below standby logs.
I think we need to disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);,
something like the attached? I think there'll be no issue with the
patch since the StandbyMode gets reset only at the end of recovery (in
FinishWalRecovery()) but it can very well be set during recovery (in
ReadRecord()). Note that I also added an assertion in
has_startup_progress_timeout_expired(), just in case.

2022-11-08 11:28:23.563 UTC [980909] LOG:  SIGALRM handle_sig_alarm received
2022-11-08 11:28:23.563 UTC [980909] LOG:
startup_progress_timeout_handler called
2022-11-08 11:28:33.563 UTC [980909] LOG:  SIGALRM handle_sig_alarm received
2022-11-08 11:28:33.563 UTC [980909] LOG:
startup_progress_timeout_handler called
2022-11-08 11:28:43.563 UTC [980909] LOG:  SIGALRM handle_sig_alarm received
2022-11-08 11:28:43.563 UTC [980909] LOG:
startup_progress_timeout_handler called
2022-11-08 11:28:53.563 UTC [980909] LOG:  SIGALRM handle_sig_alarm received
2022-11-08 11:28:53.563 UTC [980909] LOG:
startup_progress_timeout_handler called

Whilte at it, I noticed that we report redo progress for PITR, but we
don't report when standby enters archive recovery mode, say due to a
failure in the connection to primary or after the promote signal is
found. Isn't it useful to report in this case as well to know the
recovery progress?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 40ac498a76f38903bbee37108812907a76bb1a78 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Tue, 8 Nov 2022 12:18:11 +
Subject: [PATCH v1] Disable STARTUP_PROGRESS_TIMEOUT in standby mode

---
 src/backend/access/transam/xlogrecovery.c | 13 -
 src/backend/postmaster/startup.c  |  2 ++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index cb07694aea..5326a98633 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -63,6 +63,7 @@
 #include "utils/pg_lsn.h"
 #include "utils/ps_status.h"
 #include "utils/pg_rusage.h"
+#include "utils/timeout.h"
 
 /* Unsupported old recovery command file names (relative to $PGDATA) */
 #define RECOVERY_COMMAND_FILE	"recovery.conf"
@@ -1653,7 +1654,17 @@ PerformWalRecovery(void)
 		 */
 		do
 		{
-			if (!StandbyMode)
+			/*
+			 * To avoid server log bloat, we don't report recovery progress in
+			 * a standby as it will always be in recovery unless promoted. We
+			 * also disable the timeout as we don't need it anymore.
+			 */
+			if (StandbyMode)
+			{
+if (get_timeout_active(STARTUP_PROGRESS_TIMEOUT))
+	disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);
+			}
+			else
 ereport_startup_progress("redo in progress, elapsed time: %ld.%02d s, current LSN: %X/%X",
 		 LSN_FORMAT_ARGS(xlogreader->ReadRecPtr));
 
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index f99186eab7..1456e3ad3a 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -347,6 +347,8 @@ has_startup_progress_timeout_expired(long *secs, int *usecs)
 	int			useconds;
 	TimestampTz now;
 
+	Assert(get_timeout_active(STARTUP_PROGRESS_TIMEOUT) == true);
+
 	/* No timeout has occurred. */
 	if (!startup_progress_timer_expired)
 		return false;
-- 
2.34.1



Re: when the startup process doesn't (logging startup delays)

2022-11-08 Thread Thomas Munro
On Sat, Oct 30, 2021 at 7:44 AM Robert Haas  wrote:
> Committed.

Is it expected that an otherwise idle standby's recovery process
receives SIGALRM every N seconds, or should the timer be canceled at
that point, as there is no further progress to report?




Re: when the startup process doesn't (logging startup delays)

2021-10-29 Thread Robert Haas
On Fri, Oct 29, 2021 at 9:10 AM Nitin Jadhav
 wrote:
> > I think you're wrong. If we did that, the previous timer could fire
> > right after we set startup_progress_timer_expired = false, and before
> > we reschedule the timeout. It seems annoying to have to disable the
> > timeout and immediately turn around and re-enable it, but I don't see
> > how to avoid the race condition otherwise.
>
> Right. There is a possibility of race conditions. In that case the
> above changes look good to me.

Committed.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-10-29 Thread Nitin Jadhav
> I think you're wrong. If we did that, the previous timer could fire
> right after we set startup_progress_timer_expired = false, and before
> we reschedule the timeout. It seems annoying to have to disable the
> timeout and immediately turn around and re-enable it, but I don't see
> how to avoid the race condition otherwise.

Right. There is a possibility of race conditions. In that case the
above changes look good to me.

Thanks & Regards,
Nitin Jadhav

On Fri, Oct 29, 2021 at 6:10 PM Robert Haas  wrote:
>
> On Fri, Oct 29, 2021 at 7:37 AM Nitin Jadhav
>  wrote:
> > ereport_startup_progress() logs a message. So I feel just setting
> > 'startup_progress_timer_expired' to false in
> > begin_startup_progress_phase() would work. Please correct me if I am
> > wrong.
>
> I think you're wrong. If we did that, the previous timer could fire
> right after we set startup_progress_timer_expired = false, and before
> we reschedule the timeout. It seems annoying to have to disable the
> timeout and immediately turn around and re-enable it, but I don't see
> how to avoid the race condition otherwise.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com
>
>




Re: when the startup process doesn't (logging startup delays)

2021-10-29 Thread Robert Haas
On Fri, Oct 29, 2021 at 7:37 AM Nitin Jadhav
 wrote:
> ereport_startup_progress() logs a message. So I feel just setting
> 'startup_progress_timer_expired' to false in
> begin_startup_progress_phase() would work. Please correct me if I am
> wrong.

I think you're wrong. If we did that, the previous timer could fire
right after we set startup_progress_timer_expired = false, and before
we reschedule the timeout. It seems annoying to have to disable the
timeout and immediately turn around and re-enable it, but I don't see
how to avoid the race condition otherwise.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-10-29 Thread Nitin Jadhav
> I was fooling around with a test setup today, working on an unrelated
> problem, and this happened:
>
> 2021-10-28 14:21:23.145 EDT [92010] LOG:  resetting unlogged relations
> (init), elapsed time: 0.00 s, current path: base/13020

Nice catch and interesting case.

> That's not supposed to happen. I assume the problem is that the
> timeout for the previous phase fired just as we were beginning a new
> one, and the code got confused. I think we probably need to do
> something like this to make sure that the timeout from one operation
> can't trigger a log message for the next:
>
> diff --git a/src/backend/postmaster/startup.c 
> b/src/backend/postmaster/startup.c
> index 28e68dd871..47ec737888 100644
> --- a/src/backend/postmaster/startup.c
> +++ b/src/backend/postmaster/startup.c
> @@ -320,6 +320,8 @@ begin_startup_progress_phase(void)
> if (log_startup_progress_interval == 0)
> return;
>
> +disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);
> +startup_progress_timer_expired = false;
> startup_progress_phase_start_time = GetCurrentTimestamp();
> fin_time = TimestampTzPlusMilliseconds(startup_progress_phase_start_time,
>log_startup_progress_interval);
>
> Thoughts?

Yes. I agree that this is not an expected behaviour and I do agree
that, probably the timeout for the previous phase fired just as we
were beginning a new one. For each operation, we call
begin_startup_progress_phase() before starting the operation and
one/multiple calls to ereport_startup_progress() to report the
progress and intentionally we have not added any functionality to
disable the timer at the end of the operation. The timer remains
active and there may be multiple calls to
startup_progress_timeout_handler() which sets a flag to true. So
whenever we call a begin_startup_progress_phase() for the next
operation, we do enable the timer (In my understanding it will
internally disable the old timer and schedule a new one) but the flag
is already set to true by the previous timer. Hence the next call to
ereport_startup_progress() logs a message. So I feel just setting
'startup_progress_timer_expired' to false in
begin_startup_progress_phase() would work. Please correct me if I am
wrong.

Thanks & Regards,
Nitin Jadhav

On Thu, Oct 28, 2021 at 11:59 PM Robert Haas  wrote:
>
> On Mon, Oct 25, 2021 at 11:56 AM Robert Haas  wrote:
> > This version looks fine, so I have committed it (and my
> > enable_timeout_every patch also, as a necessary prerequisite).
>
> I was fooling around with a test setup today, working on an unrelated
> problem, and this happened:
>
> 2021-10-28 14:21:23.145 EDT [92010] LOG:  resetting unlogged relations
> (init), elapsed time: 0.00 s, current path: base/13020
>
> That's not supposed to happen. I assume the problem is that the
> timeout for the previous phase fired just as we were beginning a new
> one, and the code got confused. I think we probably need to do
> something like this to make sure that the timeout from one operation
> can't trigger a log message for the next:
>
> diff --git a/src/backend/postmaster/startup.c 
> b/src/backend/postmaster/startup.c
> index 28e68dd871..47ec737888 100644
> --- a/src/backend/postmaster/startup.c
> +++ b/src/backend/postmaster/startup.c
> @@ -320,6 +320,8 @@ begin_startup_progress_phase(void)
>  if (log_startup_progress_interval == 0)
>  return;
>
> +disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);
> +startup_progress_timer_expired = false;
>  startup_progress_phase_start_time = GetCurrentTimestamp();
>  fin_time = TimestampTzPlusMilliseconds(startup_progress_phase_start_time,
> log_startup_progress_interval);
>
> Thoughts?
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-10-28 Thread Robert Haas
On Mon, Oct 25, 2021 at 11:56 AM Robert Haas  wrote:
> This version looks fine, so I have committed it (and my
> enable_timeout_every patch also, as a necessary prerequisite).

I was fooling around with a test setup today, working on an unrelated
problem, and this happened:

2021-10-28 14:21:23.145 EDT [92010] LOG:  resetting unlogged relations
(init), elapsed time: 0.00 s, current path: base/13020

That's not supposed to happen. I assume the problem is that the
timeout for the previous phase fired just as we were beginning a new
one, and the code got confused. I think we probably need to do
something like this to make sure that the timeout from one operation
can't trigger a log message for the next:

diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index 28e68dd871..47ec737888 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -320,6 +320,8 @@ begin_startup_progress_phase(void)
 if (log_startup_progress_interval == 0)
 return;

+disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);
+startup_progress_timer_expired = false;
 startup_progress_phase_start_time = GetCurrentTimestamp();
 fin_time = TimestampTzPlusMilliseconds(startup_progress_phase_start_time,
log_startup_progress_interval);

Thoughts?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-10-26 Thread Robert Haas
On Tue, Oct 26, 2021 at 4:19 AM Bharath Rupireddy
 wrote:
> Can we also log the total time the startup process took to recover,
> and also the total time each stage of the recovery/redo processing
> took: 1) into a file or 2) emitting that info via a new hook 3) into a
> system catalog table (assuming at the end of the recovery the database
> is in a consistent state, but I'm not sure if we ever update any
> catalog tables in/after the startup/recovery phase).

#3 would be hard to do because there could be any number of databases
and it is unclear which one we ought to update. Also, we'd have to
launch a background worker to connect to that database in order to do
it, and be prepared for what happens if that worker fails to get the
work done for whatever reason. Also, it is unclear why we should only
log this specific thing to a system catalog and not anything else.

#1 and #2 could certainly be done, but I'm not sure what the use case
is, and I think it's a separate proposal from what we did here. So I
think a new thread would be appropriate if you want to make a new
proposal.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-10-26 Thread Bharath Rupireddy
On Mon, Oct 25, 2021 at 9:26 PM Robert Haas  wrote:
>
> On Tue, Oct 19, 2021 at 9:06 AM Nitin Jadhav
>  wrote:
> > Thanks for sharing the patch. Overall approach looks good to me. But
> > just one concern about using enable_timeout_every() functionality. As
> > per my understanding the caller should calculate the first scheduled
> > timeout (now + interval) and pass it as the second argument but this
> > is not the same in 'v2-0002-Quick-testing-hack.patch'. Anyways I have
> > done the changes as I have mentioned (like now + interval). Kindly
> > correct me if I am wrong. I am attaching 2 patches here.
> > 'v19-0001-Add-enable_timeout_every-to-fire-the-same-timeout.patch' is
> > the same as Robert's v2 patch. I have rebased my patch on top of this
> > and it is 'v19-0002-startup-progress.patch'.
>
> This version looks fine, so I have committed it (and my
> enable_timeout_every patch also, as a necessary prerequisite).

Thanks for getting this in.

I have few more thoughts:

Can we also log the total time the startup process took to recover,
and also the total time each stage of the recovery/redo processing
took: 1) into a file or 2) emitting that info via a new hook 3) into a
system catalog table (assuming at the end of the recovery the database
is in a consistent state, but I'm not sure if we ever update any
catalog tables in/after the startup/recovery phase).

This will help the users/admins/developers for summarizing, analytical
and debugging purposes. This information can easily help us to
understand the recovery patterns.

Thoughts?

If okay, I can spend some more time and start a separate thread to discuss.

Regards,
Bharath Rupireddy.




Re: when the startup process doesn't (logging startup delays)

2021-10-25 Thread Robert Haas
On Tue, Oct 19, 2021 at 9:06 AM Nitin Jadhav
 wrote:
> Thanks for sharing the patch. Overall approach looks good to me. But
> just one concern about using enable_timeout_every() functionality. As
> per my understanding the caller should calculate the first scheduled
> timeout (now + interval) and pass it as the second argument but this
> is not the same in 'v2-0002-Quick-testing-hack.patch'. Anyways I have
> done the changes as I have mentioned (like now + interval). Kindly
> correct me if I am wrong. I am attaching 2 patches here.
> 'v19-0001-Add-enable_timeout_every-to-fire-the-same-timeout.patch' is
> the same as Robert's v2 patch. I have rebased my patch on top of this
> and it is 'v19-0002-startup-progress.patch'.

This version looks fine, so I have committed it (and my
enable_timeout_every patch also, as a necessary prerequisite).

Thanks,

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-10-19 Thread Nitin Jadhav
> Apparently not, but here's a v2 anyway. In this version I made
> enable_timeout_every() a three-argument function, so that the caller
> can specify both the first time at which the timeout routine should be
> called and the interval between them, instead of only the latter. That
> seems to be more convenient for this use case, and is more powerful in
> general.

Thanks for sharing the patch. Overall approach looks good to me. But
just one concern about using enable_timeout_every() functionality. As
per my understanding the caller should calculate the first scheduled
timeout (now + interval) and pass it as the second argument but this
is not the same in 'v2-0002-Quick-testing-hack.patch'. Anyways I have
done the changes as I have mentioned (like now + interval). Kindly
correct me if I am wrong. I am attaching 2 patches here.
'v19-0001-Add-enable_timeout_every-to-fire-the-same-timeout.patch' is
the same as Robert's v2 patch. I have rebased my patch on top of this
and it is 'v19-0002-startup-progress.patch'.

> I just noticed something else which I realize is the indirect result
> of my own suggestion but which doesn't actually look all that nice.
> You've now got a call to RegisterTimeout(STARTUP_PROGRESS_TIMEOUT,
> ...) in InitPostgres, guarded by ! IsPostmasterEnvironment, and then
> another one in StartupProcessMain(). I think that's so that the
> feature works in single-user mode, but that means that technically,
> we're not reporting on the progress of the startup process. We're
> reporting progress on the startup operations that are normally
> performed by the startup process. But that means that the
> documentation isn't quite accurate (because it mentions the startup
> process specifically) and that the placement of the code in startup.c
> is suspect (because that's specifically for the startup process) and
> that basically every instance of the word "process" in the patch is
> technically a little bit wrong. I'm not sure if that's a big enough
> problem to be worth worrying about or exactly what we ought to do
> about it, but it doesn't seem fantastic. At a minimum, I think we
> should probably try to eliminate as many references to the startup
> process as we can, and talk about startup progress or startup
> operations or something like that.

Yeah right. I have modified the comments accordingly and also fixed
the other review comments related to the code comments.

I have modified the code to include a call to RegisterTimeout() in
only one place than the two calls previously. Initially I thought to
call this in begin_startup_progress_phase(). I feel this is not a
better choice since begin_startup_progress_phase function is getting
called in many places. So it calls RegisterTimeout() many times which
is not required. I feel StartupXLOG() is a better place for this since
StartupXLOG() gets called during the startup process, bootstrap
process or standalone backend. As per earlier discussion we need
support for this in the case of startup process and standalone
backend. Hence guarded this with '!IsBootstrapProcessingMode()'. Also
verified the behaviour in both of the cases. Please correct me if I am
wrong.

Thanks & Regards,
Nitin Jadhav

On Mon, Oct 18, 2021 at 9:15 PM Robert Haas  wrote:
>
> On Thu, Sep 30, 2021 at 5:08 PM Robert Haas  wrote:
> > Any thoughts on the patch I attached?
>
> Apparently not, but here's a v2 anyway. In this version I made
> enable_timeout_every() a three-argument function, so that the caller
> can specify both the first time at which the timeout routine should be
> called and the interval between them, instead of only the latter. That
> seems to be more convenient for this use case, and is more powerful in
> general.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com


v19-0001-Add-enable_timeout_every-to-fire-the-same-timeout.patch
Description: Binary data


v19-0002-startup-progress.patch
Description: Binary data


Re: when the startup process doesn't (logging startup delays)

2021-10-18 Thread Robert Haas
On Thu, Sep 30, 2021 at 5:08 PM Robert Haas  wrote:
> Any thoughts on the patch I attached?

Apparently not, but here's a v2 anyway. In this version I made
enable_timeout_every() a three-argument function, so that the caller
can specify both the first time at which the timeout routine should be
called and the interval between them, instead of only the latter. That
seems to be more convenient for this use case, and is more powerful in
general.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v2-0001-Add-enable_timeout_every-to-fire-the-same-timeout.patch
Description: Binary data


v2-0002-Quick-testing-hack.patch
Description: Binary data


Re: when the startup process doesn't (logging startup delays)

2021-10-13 Thread Robert Haas
On Wed, Oct 13, 2021 at 9:06 AM Amul Sul  wrote:
> I think the "elapsed time" part can be implicitly added to the error
> message inside ereport_startup_progress() which is common to all
> calls.

Not if it means having to call psprintf there!

If there's some way we could do it with macro tricks, it might be
worth considering, but I'm not sure there is, or that it would be less
confusing.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-10-13 Thread Amul Sul
On Wed, Sep 29, 2021 at 11:10 PM Robert Haas  wrote:
>
> On Wed, Sep 29, 2021 at 10:08 AM Nitin Jadhav
>  wrote:
> > Sorry. There was a misunderstanding about this and for the patch
> > shared on September 27th, I had tested for the value '0' and observed
> > that no progress messages were getting logged, probably the time at
> > which 'enable_timeout_at' is getting called is past the 'next_timeout'
> > value. This behaviour is completely dependent on the system. Now added
> > an extra condition to disable the feature in case of '0' setting.
>
> Oh, interesting. I failed to consider that the behavior might vary
> from one system to another.
>
> I just noticed something else which I realize is the indirect result
> of my own suggestion but which doesn't actually look all that nice.
> You've now got a call to RegisterTimeout(STARTUP_PROGRESS_TIMEOUT,
> ...) in InitPostgres, guarded by ! IsPostmasterEnvironment, and then
> another one in StartupProcessMain(). I think that's so that the
> feature works in single-user mode, but that means that technically,
> we're not reporting on the progress of the startup process. We're
> reporting progress on the startup operations that are normally
> performed by the startup process. But that means that the
> documentation isn't quite accurate (because it mentions the startup
> process specifically) and that the placement of the code in startup.c
> is suspect (because that's specifically for the startup process) and
> that basically every instance of the word "process" in the patch is
> technically a little bit wrong. I'm not sure if that's a big enough
> problem to be worth worrying about or exactly what we ought to do
> about it, but it doesn't seem fantastic. At a minimum, I think we
> should probably try to eliminate as many references to the startup
> process as we can, and talk about startup progress or startup
> operations or something like that.
>
> + * Start timestamp of the operation that occur during startup process.
>
> This is not correct grammar - it would need to be "operations that
> occur" or "operation that occurs". But neither seems particularly
> clear about what the variable actually does. How about "Time at which
> the most recent startup operation started"?
>
> + * Indicates the timestamp at which the timer was supposed to expire.
>
> "Indicates" can be deleted, but also I think it would be better to
> state the purpose of the timer i.e. "Timestamp at which the next
> startup progress message should be logged."
>
> + enable_timeout_at(STARTUP_PROGRESS_TIMEOUT, next_timeout);
> + scheduled_startup_progress_timeout = next_timeout;
> + startup_progress_timer_expired = false;
>
> I think you should set startup_progress_timer_expired to false before
> calling enable_timeout_at. Otherwise there's a race condition. It's
> unlikely that the timer could expire and the interrupt handler fire
> before we reach startup_progress_timer_expired = false, but it seems
> like there's no reason to take a chance.
>
> + * Calculates the timestamp at which the next timer should expire and enables
>
> So in some places you have verbs with an "s" on the end, like here,
> and in other places without, like in the next example. In "telegraph
> style" comments like this, this implicit subject is "it" or "this,"
> but you don't write that. However you write the rest of the sentence
> as if it were there. So this should say "calculate" and "enable"
> rather than "calculates" and "enables".
>
> + * Schedule a wakeup call for logging the progress of startup process.
>
> This isn't really an accurate description, I think. It's not
> scheduling anything, and I don't know what a "wakeup call" is anyway.
> "Set a flag indicating that it's time to log a progress report" might
> be better.
>
> + * Sets the start timestamp of the current operation and also enables the
>
> Set. enable.
>
> + * timeout for logging the progress of startup process.
>
> I think you could delete "for logging the progress of startup process"
> here; that seems clear enough, and this reads a bit awkwardly. If you
> want to keep something like this I wrote write "...enable the timeout
> so that the progress of the startup progress will be logged."
>
> + * the timer if it did, otheriwse return false.
>
> otherwise
>
> + * Begin the startup progress phase to report the progress of syncing
> + * data directory (syncfs).
>
> All of the comments that start with "Begin the startup progress phase"
> should instead start with "Begin startup progress phase".
>
> I think this could be condensed down to "Prepare to report progress
> syncing the data directory via syncfs", and likewise "Prepare to
> report progress of the pre-fsync phase", "Prepare to report progress
> resetting unlogged relations," etc.
>
> + gettext_noop("Sets the time interval between each progress update "
> + "of the startup process."),
>
> This is actually inaccurate. It's sort of the same problem I was
> worried about with respect to the 

Re: when the startup process doesn't (logging startup delays)

2021-10-01 Thread Michael Paquier
On Thu, Sep 30, 2021 at 05:08:17PM -0400, Robert Haas wrote:
> It's certainly less of an issue than it used to be back in my day.
> 
> Any thoughts on the patch I attached?

I don't know.  Anyway, this is actively worked on, so I have taken the
liberty to move that to the next CF.
--
Michael


signature.asc
Description: PGP signature


Re: when the startup process doesn't (logging startup delays)

2021-09-30 Thread Robert Haas
On Thu, Sep 30, 2021 at 3:10 PM Tom Lane  wrote:
> That would be lovely, certainly.  But aren't you moving the goalposts
> rather far?  I don't think we make any promises about such things
> today, so why has the issue suddenly gotten more pressing?

Yeah, perhaps it's best to not to worry about it. I dislike failure to
worry about that case on general principle, but I agree with you that
it seems to be moving the goalposts a disproportionate distance.

> In particular,
> why do you think Nitin's patch is proof against this?  Seems to me it's
> probably got *more* failure cases, not fewer, if the system clock is
> acting funny.

You might be right. I sort of assumed that timeout.c had some defense
against this, but since that seems not to be the case, I suppose no
facility that depends on it can hope to stay out of trouble either.

> On the whole, in these days of NTP, I'm not sure I care to spend
> large amounts of effort on dealing with a bogus system clock.

It's certainly less of an issue than it used to be back in my day.

Any thoughts on the patch I attached?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-09-30 Thread Tom Lane
Robert Haas  writes:
> ...  When I say I want my handler to be
> fired in 3 s, I don't mean that I want it to be fired when the system
> time is 3 seconds greater than it is right now. I mean I want it to be
> fired in 3 actual seconds, regardless of what dumb thing the system
> clock may choose to do.

That would be lovely, certainly.  But aren't you moving the goalposts
rather far?  I don't think we make any promises about such things
today, so why has the issue suddenly gotten more pressing?  In particular,
why do you think Nitin's patch is proof against this?  Seems to me it's
probably got *more* failure cases, not fewer, if the system clock is
acting funny.

BTW, one could imagine addressing this concern by having timeout.c work
with CLOCK_MONOTONIC instead of the regular wall clock.  But I fear
we'd have to drop enable_timeout_at(), for lack of ability to translate
between CLOCK_MONOTONIC timestamps and those used by anybody else.
Also get_timeout_start_time/get_timeout_finish_time would become
problematic.  Maybe we only really care about deltas, so the more
restrictive API would be workable, but it seems like a nontrivial
amount of work.

On the whole, in these days of NTP, I'm not sure I care to spend
large amounts of effort on dealing with a bogus system clock.

regards, tom lane




Re: when the startup process doesn't (logging startup delays)

2021-09-30 Thread Robert Haas
On Wed, Sep 29, 2021 at 5:12 PM Tom Lane  wrote:
> I didn't claim there are any other places that could use the feature
> *today*.  But once we've got one, it seems like there could be more
> tomorrow.  In any case, I dislike keeping timeout state data outside
> timeout.c, because it's so likely to get out-of-sync that way.

Well, I had a quick go at implementing this (attached).

It seems to do a satisfactory job preventing drift over time, but it
doesn't behave nicely if you set the system clock backward. With a bit
of extra debugging output:

2021-09-30 14:23:50.291 EDT [2279] LOG:  scheduling wakeup in 2 secs,
998727 usecs
2021-09-30 14:23:53.291 EDT [2279] LOG:  scheduling wakeup in 2 secs,
998730 usecs
2021-09-30 14:23:56.291 EDT [2279] LOG:  scheduling wakeup in 2 secs,
998728 usecs
2021-09-30 14:20:01.154 EDT [2279] LOG:  scheduling wakeup in 238
secs, 135532 usecs
2021-09-30 14:23:59.294 EDT [2279] LOG:  scheduling wakeup in 2 secs, 995922 use

The issue here is that fin_time is really the driving force behind
everything timeout.c does. In particular, it determines the order of
active_timeouts[]. And that's not really correct either for
enable_timeout_after(), or for the new function I added in this draft
patch, enable_timeout_every(). When I say I want my handler to be
fired in 3 s, I don't mean that I want it to be fired when the system
time is 3 seconds greater than it is right now. I mean I want it to be
fired in 3 actual seconds, regardless of what dumb thing the system
clock may choose to do. I don't really think that precise behavior can
be implemented, but ideally if a timeout that was supposed to happen
after 3 s is now scheduled for a time that is more than 3 seconds
beyond the current value of the system clock, we'd move the firing
time backwards to 3 seconds beyond the current system clock. That way,
if you set the system time backward by 4 minutes, you might see a few
seconds of delay before the next firing, but you wouldn't go into the
tank for 4 minutes.

I don't see an obvious way of making timeout.c behave like that
without major surgery, though. If nobody else does either, then we
could (1) stick with something closer to Nitin's current patch, which
tries to handle this concern outside of timeout.c, (2) adopt something
like the attached 0001 and leave the question of improved behavior in
case of backwards system clock adjustments for another day, or (3)
undertake to rewrite timeout.c as a precondition of being able to log
messages about why startup is slow. I'm not a huge fan of (3) but I'm
open to suggestions.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


0002-Quick-testing-hack.patch
Description: Binary data


0001-Add-enable_timeout_every-to-fire-the-same-timeout-re.patch
Description: Binary data


Re: when the startup process doesn't (logging startup delays)

2021-09-29 Thread Tom Lane
Robert Haas  writes:
> On Wed, Sep 29, 2021 at 2:06 PM Tom Lane  wrote:
>> The real comment I'd have here, though, is that writing one-off
>> code for this purpose is bad.  If we have a need for a repetitive
>> timeout, it'd be better to add the feature to timeout.c explicitly.
>> That would probably also remove the need for extra copies of the
>> timeout time.

> I'm not sure that really helps very much, honestly.

I didn't claim there are any other places that could use the feature
*today*.  But once we've got one, it seems like there could be more
tomorrow.  In any case, I dislike keeping timeout state data outside
timeout.c, because it's so likely to get out-of-sync that way.

regards, tom lane




Re: when the startup process doesn't (logging startup delays)

2021-09-29 Thread Robert Haas
On Wed, Sep 29, 2021 at 2:06 PM Tom Lane  wrote:
> The real comment I'd have here, though, is that writing one-off
> code for this purpose is bad.  If we have a need for a repetitive
> timeout, it'd be better to add the feature to timeout.c explicitly.
> That would probably also remove the need for extra copies of the
> timeout time.

I'm not sure that really helps very much, honestly.  I mean it would
be useful in this particular case, but there are other cases where we
have logic like this already, and this wouldn't do anything about
those. For example, consider autoprewarm_main(). Like this code, that
code thinks (perhaps just because I'm the one who reviewed it) that
the next time should be measured from the last time ... but an
enhancement to the timeout machinery wouldn't help it at all. I
suspect there are other cases like this elsewhere, because this is
what I personally tend to think is the right behavior and I feel like
it comes up in patch reviews from time to time, but I'm not finding
any at the moment. Even if I'm right that they exist, I'm not sure
they look much like each other or can easily reuse any code.

And then again on the other hand, BackgroundWriterMain() thinks that
the next time should be measured from the time we got around to doing
it, not the scheduled time. I guess we don't really have any
consistent practice here.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-09-29 Thread Robert Haas
On Wed, Sep 29, 2021 at 1:52 PM Alvaro Herrera  wrote:
> I think one person casting an opinion on one aspect does not set that
> aspect in stone.

Of course not. I was just explaining that how the patch ended up like it did.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-09-29 Thread Tom Lane
Alvaro Herrera  writes:
> On 2021-Sep-29, Robert Haas wrote:
>> Well, this was my suggestion, because if you don't do this, you get
>> drift, which I think looks weird. Like the timestamps will be:
>> 
>> 13:41:05.012456
>> 13:41:15.072484
>> 13:41:25.149632
>> 
>> ...and it gets further and further off as it goes on.'

> Right ... I actually *expect* this drift to occur.  Maybe people
> generally don't like this, it just seems natural to me.  Are there other
> opinions on this aspect?

FWIW, I agree with Robert that it's nicer if the timeout doesn't drift.
There's a limit to how much complexity I'm willing to tolerate for that,
but it doesn't seem like this exceeds it.

The real comment I'd have here, though, is that writing one-off
code for this purpose is bad.  If we have a need for a repetitive
timeout, it'd be better to add the feature to timeout.c explicitly.
That would probably also remove the need for extra copies of the
timeout time.

regards, tom lane




Re: when the startup process doesn't (logging startup delays)

2021-09-29 Thread Alvaro Herrera
On 2021-Sep-29, Justin Pryzby wrote:

> Robert requested the current behavior here.
> https://www.postgresql.org/message-id/CA%2BTgmoYkS1ZeWdSMFMBecMNxWonHk6J5eoX4FEQrpKtvEbXqGQ%40mail.gmail.com
> 
> It's confusing (at least) to get these kind of requests to change the behavior
> back and forth.

Well, I did scan the thread to see if this had been discussed, and I
overlooked that message.  But there was no reply to that message, so
it's not clear whether this was just Robert's opinion or consensus; in
fact we now have exactly two votes on it (mine and Robert's).

I think one person casting an opinion on one aspect does not set that
aspect in stone.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"It takes less than 2 seconds to get to 78% complete; that's a good sign.
A few seconds later it's at 90%, but it seems to have stuck there.  Did
somebody make percentages logarithmic while I wasn't looking?"
http://smylers.hates-software.com/2005/09/08/1995c749.html




Re: when the startup process doesn't (logging startup delays)

2021-09-29 Thread Alvaro Herrera
On 2021-Sep-29, Robert Haas wrote:

> Well, this was my suggestion, because if you don't do this, you get
> drift, which I think looks weird. Like the timestamps will be:
> 
> 13:41:05.012456
> 13:41:15.072484
> 13:41:25.149632
> 
> ...and it gets further and further off as it goes on.'

Right ... I actually *expect* this drift to occur.  Maybe people
generally don't like this, it just seems natural to me.  Are there other
opinions on this aspect?

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"Nadie está tan esclavizado como el que se cree libre no siéndolo" (Goethe)




Re: when the startup process doesn't (logging startup delays)

2021-09-29 Thread Justin Pryzby
On Wed, Sep 29, 2021 at 02:36:14PM -0300, Alvaro Herrera wrote:
> Why is it that we set the next timeout to fire not at "now + interval"
> but at "when-it-should-have-fired-but-didn't + interval"?  As a user, if
> I request a message to be logged every N milliseconds, and one
> of them is a little bit delayed, then (assuming I set it to 10s) I still
> expect the next one to occur at now+10s.  I don't expect the next at
> "now+5s" if one is delayed 5s.
> 
> In other words, I think this function should just be
>   enable_timeout_after(STARTUP_PROGRESS_TIMEOUT, 
> log_startup_progress_interval);
> 
> This means you can remove the scheduled_startup_progress_timeout
> variable.

Robert requested the current behavior here.
https://www.postgresql.org/message-id/CA%2BTgmoYkS1ZeWdSMFMBecMNxWonHk6J5eoX4FEQrpKtvEbXqGQ%40mail.gmail.com

It's confusing (at least) to get these kind of requests to change the behavior
back and forth.

-- 
Justin




Re: when the startup process doesn't (logging startup delays)

2021-09-29 Thread Robert Haas
On Wed, Sep 29, 2021 at 1:36 PM Alvaro Herrera  wrote:
> Why is it that we set the next timeout to fire not at "now + interval"
> but at "when-it-should-have-fired-but-didn't + interval"?  As a user, if
> I request a message to be logged every N milliseconds, and one
> of them is a little bit delayed, then (assuming I set it to 10s) I still
> expect the next one to occur at now+10s.  I don't expect the next at
> "now+5s" if one is delayed 5s.

Well, this was my suggestion, because if you don't do this, you get
drift, which I think looks weird. Like the timestamps will be:

13:41:05.012456
13:41:15.072484
13:41:25.149632

...and it gets further and further off as it goes on.'

I guess my expectation is different from yours: I expect that if I ask
for a message every 10 seconds, the time between messages is going to
be 10s, at least on average, not 10s + however much latency the system
has.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-09-29 Thread Robert Haas
On Wed, Sep 29, 2021 at 10:08 AM Nitin Jadhav
 wrote:
> Sorry. There was a misunderstanding about this and for the patch
> shared on September 27th, I had tested for the value '0' and observed
> that no progress messages were getting logged, probably the time at
> which 'enable_timeout_at' is getting called is past the 'next_timeout'
> value. This behaviour is completely dependent on the system. Now added
> an extra condition to disable the feature in case of '0' setting.

Oh, interesting. I failed to consider that the behavior might vary
from one system to another.

I just noticed something else which I realize is the indirect result
of my own suggestion but which doesn't actually look all that nice.
You've now got a call to RegisterTimeout(STARTUP_PROGRESS_TIMEOUT,
...) in InitPostgres, guarded by ! IsPostmasterEnvironment, and then
another one in StartupProcessMain(). I think that's so that the
feature works in single-user mode, but that means that technically,
we're not reporting on the progress of the startup process. We're
reporting progress on the startup operations that are normally
performed by the startup process. But that means that the
documentation isn't quite accurate (because it mentions the startup
process specifically) and that the placement of the code in startup.c
is suspect (because that's specifically for the startup process) and
that basically every instance of the word "process" in the patch is
technically a little bit wrong. I'm not sure if that's a big enough
problem to be worth worrying about or exactly what we ought to do
about it, but it doesn't seem fantastic. At a minimum, I think we
should probably try to eliminate as many references to the startup
process as we can, and talk about startup progress or startup
operations or something like that.

+ * Start timestamp of the operation that occur during startup process.

This is not correct grammar - it would need to be "operations that
occur" or "operation that occurs". But neither seems particularly
clear about what the variable actually does. How about "Time at which
the most recent startup operation started"?

+ * Indicates the timestamp at which the timer was supposed to expire.

"Indicates" can be deleted, but also I think it would be better to
state the purpose of the timer i.e. "Timestamp at which the next
startup progress message should be logged."

+ enable_timeout_at(STARTUP_PROGRESS_TIMEOUT, next_timeout);
+ scheduled_startup_progress_timeout = next_timeout;
+ startup_progress_timer_expired = false;

I think you should set startup_progress_timer_expired to false before
calling enable_timeout_at. Otherwise there's a race condition. It's
unlikely that the timer could expire and the interrupt handler fire
before we reach startup_progress_timer_expired = false, but it seems
like there's no reason to take a chance.

+ * Calculates the timestamp at which the next timer should expire and enables

So in some places you have verbs with an "s" on the end, like here,
and in other places without, like in the next example. In "telegraph
style" comments like this, this implicit subject is "it" or "this,"
but you don't write that. However you write the rest of the sentence
as if it were there. So this should say "calculate" and "enable"
rather than "calculates" and "enables".

+ * Schedule a wakeup call for logging the progress of startup process.

This isn't really an accurate description, I think. It's not
scheduling anything, and I don't know what a "wakeup call" is anyway.
"Set a flag indicating that it's time to log a progress report" might
be better.

+ * Sets the start timestamp of the current operation and also enables the

Set. enable.

+ * timeout for logging the progress of startup process.

I think you could delete "for logging the progress of startup process"
here; that seems clear enough, and this reads a bit awkwardly. If you
want to keep something like this I wrote write "...enable the timeout
so that the progress of the startup progress will be logged."

+ * the timer if it did, otheriwse return false.

otherwise

+ * Begin the startup progress phase to report the progress of syncing
+ * data directory (syncfs).

All of the comments that start with "Begin the startup progress phase"
should instead start with "Begin startup progress phase".

I think this could be condensed down to "Prepare to report progress
syncing the data directory via syncfs", and likewise "Prepare to
report progress of the pre-fsync phase", "Prepare to report progress
resetting unlogged relations," etc.

+ gettext_noop("Sets the time interval between each progress update "
+ "of the startup process."),

This is actually inaccurate. It's sort of the same problem I was
worried about with respect to the documentation: it's NOT the interval
between progress updates, because it applies separately to each
operation. We need to say something that makes that clear, or at least
that doesn't get overtly the opposite impression. It's hard to do that

Re: when the startup process doesn't (logging startup delays)

2021-09-29 Thread Alvaro Herrera
So, I've wondered about this part all along:

> +/*
> + * Calculates the timestamp at which the next timer should expire and enables
> + * the timer accordingly.
> + */
> +static void
> +reset_startup_progress_timeout(TimestampTz now)
> +{
> + TimestampTz next_timeout;
> +
> + next_timeout = 
> TimestampTzPlusMilliseconds(scheduled_startup_progress_timeout,
> + 
>log_startup_progress_interval);
> + if (next_timeout < now)
> + {
> + /*
> +  * Either the timeout was processed so late that we missed an
> +  * entire cycle or system clock was set backwards.
> +  */
> + next_timeout = TimestampTzPlusMilliseconds(now, 
> log_startup_progress_interval);
> + }
> +
> + enable_timeout_at(STARTUP_PROGRESS_TIMEOUT, next_timeout);

Why is it that we set the next timeout to fire not at "now + interval"
but at "when-it-should-have-fired-but-didn't + interval"?  As a user, if
I request a message to be logged every N milliseconds, and one
of them is a little bit delayed, then (assuming I set it to 10s) I still
expect the next one to occur at now+10s.  I don't expect the next at
"now+5s" if one is delayed 5s.

In other words, I think this function should just be
  enable_timeout_after(STARTUP_PROGRESS_TIMEOUT, log_startup_progress_interval);

This means you can remove the scheduled_startup_progress_timeout
variable.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"No hay ausente sin culpa ni presente sin disculpa" (Prov. francés)




Re: when the startup process doesn't (logging startup delays)

2021-09-29 Thread Nitin Jadhav
> It is common to mention what the default is as part of the
> documentation of a GUC. I don't see why this one should be an
> exception, especially since not mentioning it reduces the length of
> the documentation by exactly one word.
>
> I don't mind the sentence you added at the end to clarify the default
> units, but the way you've rewritten the first sentence makes it, in my
> opinion, much less clear.

Ok. I have kept your documentation as it is and added the sentence at
the end to clarify the default units.

> v9 was posted on August 3rd. I told you that it wasn't working on
> September 23rd. You posted a new version that still does not work on
> September 27th. I think you should have tested each version of your
> patch before posting it, and especially after any major refactorings.
> And if for whatever reason you didn't, then certainly after I told you
> on September 23rd that it didn't work, I think you should have fixed
> it before posting a new version.

Sorry. There was a misunderstanding about this and for the patch
shared on September 27th, I had tested for the value '0' and observed
that no progress messages were getting logged, probably the time at
which 'enable_timeout_at' is getting called is past the 'next_timeout'
value. This behaviour is completely dependent on the system. Now added
an extra condition to disable the feature in case of '0' setting.

> I think this comment can be worded better.  It says it "decides", but it
> doesn't actually decide on any action to take -- it just reports whether
> the timer expired or not, to allow its caller to make the decision.  In
> such situations we just say something like "Report whether startup
> progress has caused a timeout, return true and rearm the timer if it
> did, or just return false otherwise"; and we don't indicate what the
> value is going to be used *for*.  Then the caller can use the boolean
> return value to make a decision, such as whether something is going to
> be logged.  This function can be oblivious to details such as this:
>
> here we can just say "No timeout has occurred" and make no inference
> about what's going to happen or not happen.

Modified the comment.

> Also, for functions that do things like this we typically use English
> sentence structure with the function name starting with the verb --
> perhaps has_startup_progress_timeout_expired().  Sometimes we are lax
> about this if we have some sort of poor-man's modularisation by using a
> common prefix for several functions doing related things, which perhaps
> could be "startup_progress_*" in your case, but your other functions are
> already not doing that (such as begin_startup_progress_phase) so it's
> not clear why you would not use the most natural name for this one.

I agree that has_startup_progress_timeout_expired() is better than the
existing function name. So I changed the function name accordingly.

> Please make sure to add ereport_startup_progress() as a translation
> trigger in src/backend/nls.mk.

I have added ereport_startup_progress() under the section
GETTEXT_TRIGGERS and GETTEXT_FLAGS in src/backend/nls.mk. Also
verified the messages in src/backend/po/postgres.pot file.

Kindly let me know if I have missed anything.

Thanks & Regards,
Nitin Jadhav

On Tue, Sep 28, 2021 at 8:29 PM Robert Haas  wrote:
>
> On Tue, Sep 28, 2021 at 8:06 AM Nitin Jadhav
>  wrote:
> > I thought mentioning the unit in milliseconds and the example in
> > seconds would confuse the user, so I changed the example to
> > milliseconds.The message behind the above description looks good to me
> > however I feel some sentences can be explained in less words. The
> > information related to the units is missing and I feel it should be
> > mentioned in the document. The example says, if the setting has the
> > default value of 10 seconds, then it explains the behaviour. I feel it
> > may not be the default value, it can be any value set by the user. So
> > mentioning 'default' in the example does not look good to me. I feel
> > we just have to mention "if this setting is set to the value of 10
> > seconds". Below is the modified version of the above information.
>
> It is common to mention what the default is as part of the
> documentation of a GUC. I don't see why this one should be an
> exception, especially since not mentioning it reduces the length of
> the documentation by exactly one word.
>
> I don't mind the sentence you added at the end to clarify the default
> units, but the way you've rewritten the first sentence makes it, in my
> opinion, much less clear.
>
> > I had added additional code to check the value of the
> > 'log_startup_progress_interval' variable and  disable the feature in
> > case of -1 in the earlier versions of the patch (Specifically
> > v9.patch). There was a review comment for v9 patch and it resulted in
> > major refactoring of the patch.
> ...
> > The answer for the question of "I don't understand why you posted the
> > previous version of the 

Re: when the startup process doesn't (logging startup delays)

2021-09-28 Thread Robert Haas
On Tue, Sep 28, 2021 at 8:06 AM Nitin Jadhav
 wrote:
> I thought mentioning the unit in milliseconds and the example in
> seconds would confuse the user, so I changed the example to
> milliseconds.The message behind the above description looks good to me
> however I feel some sentences can be explained in less words. The
> information related to the units is missing and I feel it should be
> mentioned in the document. The example says, if the setting has the
> default value of 10 seconds, then it explains the behaviour. I feel it
> may not be the default value, it can be any value set by the user. So
> mentioning 'default' in the example does not look good to me. I feel
> we just have to mention "if this setting is set to the value of 10
> seconds". Below is the modified version of the above information.

It is common to mention what the default is as part of the
documentation of a GUC. I don't see why this one should be an
exception, especially since not mentioning it reduces the length of
the documentation by exactly one word.

I don't mind the sentence you added at the end to clarify the default
units, but the way you've rewritten the first sentence makes it, in my
opinion, much less clear.

> I had added additional code to check the value of the
> 'log_startup_progress_interval' variable and  disable the feature in
> case of -1 in the earlier versions of the patch (Specifically
> v9.patch). There was a review comment for v9 patch and it resulted in
> major refactoring of the patch.
...
> The answer for the question of "I don't understand why you posted the
> previous version of the patch without testing that it works" is, for
> the value of -1, the above description was my understanding and for
> the value of 0, the older versions of the patch was behaving as
> documented. But with the later versions the behaviour got changed and
> I missed to modify the documentation. So I modified it in the last
> version.

v9 was posted on August 3rd. I told you that it wasn't working on
September 23rd. You posted a new version that still does not work on
September 27th. I think you should have tested each version of your
patch before posting it, and especially after any major refactorings.
And if for whatever reason you didn't, then certainly after I told you
on September 23rd that it didn't work, I think you should have fixed
it before posting a new version.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-09-28 Thread Nitin Jadhav
> That's really not what I'm complaining about. I think if we're going
> to give an example at all, 10ms is a better example than 100ms,
> because 10s is a value that people are more likely to find useful. But
> I'm not sure that it's necessary to mention a specific value, and if
> it is, I think it needs to be phrased in a less confusing way.
>
> > >>It also looks pretty silly to say that if you set the value to 10s, 
> > >>something
> > >>will happen every 10s. What else would anyone expect to happen?
> >
> > @Robert: that's consistent with existing documentation, even though it might
> > seem obvious and silly to us.
> >
> > | For example, if you set this to 250ms then all automatic vacuums and 
> > analyzes that run 250ms or longer will be logged
> > | For example, if you set it to 250ms then all SQL statements that run 
> > 250ms or longer will be logged
>
> Fair enough, but I still don't like it much. I tried my hand at
> rewriting this and came up with the attached:
>
> + Sets the amount of time after which the startup process will log
> + a message about a long-running operation that is still in progress,
> + as well as the interval between further progress messages for that
> + operation. This setting is applied separately to each operation.
> + For example, if syncing the data directory takes 25 seconds and
> + thereafter resetting unlogged relations takes 8 seconds, and if this
> + setting has the default value of 10 seconds, then a messages will be
> + logged for syncing the data directory after it has been in progress
> + for 10 seconds and again after it has been in progress for 20 
> seconds,
> + but nothing will be logged for resetting unlogged operations.
> + A setting of 0 disables the feature.
>
> I prefer this to Nitin's version because I think it could be unclear
> to someone that the value applies separately to each operation,
> whereas I don't think we need to document that we can't guarantee that
> the messages will be perfectly on time - that's true of every kind of
> scheduled event in pretty much every computer system - or what happens
> if the system clock goes backwards. Those are things we should try to
> get right, as well as we can anyway, but we don't need to tell the
> user that we got them right.

I thought mentioning the unit in milliseconds and the example in
seconds would confuse the user, so I changed the example to
milliseconds.The message behind the above description looks good to me
however I feel some sentences can be explained in less words. The
information related to the units is missing and I feel it should be
mentioned in the document. The example says, if the setting has the
default value of 10 seconds, then it explains the behaviour. I feel it
may not be the default value, it can be any value set by the user. So
mentioning 'default' in the example does not look good to me. I feel
we just have to mention "if this setting is set to the value of 10
seconds". Below is the modified version of the above information.

+ Sets the amount of time after every such interval the startup process
+ will log a message about a long-running operation that is still in
+ progress. This setting is applied separately to each operation.
+ For example, if syncing the data directory takes 25 seconds and
+ thereafter resetting unlogged relations takes 8 seconds, and if this
+ setting is set to the value of 10 seconds, then a messages will be
+ logged for syncing the data directory after it has been in progress
+ for 10 seconds and again after it has been in progress for 20 seconds,
+ but nothing will be logged for resetting unlogged operations.
+ A setting of 0 disables the feature. If this value
+ is specified without units, it is taken as milliseconds.

> Well, I see that -1 is now disallowed, and that's good as far as it
> goes, but 0 still does not actually disable the feature. I don't
> understand why you posted the previous version of the patch without
> testing that it works, and I even less understand why you are posting
> another version without fixing the bug that I pointed out to you in
> the last version.

I had added additional code to check the value of the
'log_startup_progress_interval' variable and  disable the feature in
case of -1 in the earlier versions of the patch (Specifically
v9.patch). There was a review comment for v9 patch and it resulted in
major refactoring of the patch. The comment was

> With these changes you'd have only 1 place in the code that needs to
> care about log_startup_progress_interval, as opposed to 3 as you have
> it currently, and only one place that enables the timeout, as opposed
> to 2 as you have it currently. I think that would be tidier.

Based on the above comment and the idea behind enabling the timer, it
does not log anything if the value is set to -1. So 

Re: when the startup process doesn't (logging startup delays)

2021-09-27 Thread Alvaro Herrera
> +/*
> + * Decides whether to log the startup progress or not based on whether the
> + * timer is expired or not. Returns FALSE if the timer is not expired, 
> otherwise
> + * calculates the elapsed time and sets the respective out parameters secs 
> and
> + * usecs. Enables the timer for the next log message and returns TRUE.
> + */
> +bool
> +startup_progress_timeout_has_expired(long *secs, int *usecs)

I think this comment can be worded better.  It says it "decides", but it
doesn't actually decide on any action to take -- it just reports whether
the timer expired or not, to allow its caller to make the decision.  In
such situations we just say something like "Report whether startup
progress has caused a timeout, return true and rearm the timer if it
did, or just return false otherwise"; and we don't indicate what the
value is going to be used *for*.  Then the caller can use the boolean
return value to make a decision, such as whether something is going to
be logged.  This function can be oblivious to details such as this:

> + /* If the timeout has not occurred, then no need to log the details. */
> + if (!startup_progress_timer_expired)
> + return false;

here we can just say "No timeout has occurred" and make no inference
about what's going to happen or not happen.

Also, for functions that do things like this we typically use English
sentence structure with the function name starting with the verb --
perhaps has_startup_progress_timeout_expired().  Sometimes we are lax
about this if we have some sort of poor-man's modularisation by using a
common prefix for several functions doing related things, which perhaps
could be "startup_progress_*" in your case, but your other functions are
already not doing that (such as begin_startup_progress_phase) so it's
not clear why you would not use the most natural name for this one.

> + ereport_startup_progress("syncing data directory (syncfs), elapsed 
> time: %ld.%02d s, current path: %s",
> +  path);

Please make sure to add ereport_startup_progress() as a translation
trigger in src/backend/nls.mk.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/




Re: when the startup process doesn't (logging startup delays)

2021-09-27 Thread Robert Haas
On Mon, Sep 27, 2021 at 9:32 AM Justin Pryzby  wrote:
> >>It also looks pretty silly to say that if you set the value to 10s, 
> >>something
> >>will happen every 10s. What else would anyone expect to happen?
>
> @Robert: that's consistent with existing documentation, even though it might
> seem obvious and silly to us.
>
> | For example, if you set this to 250ms then all automatic vacuums and 
> analyzes that run 250ms or longer will be logged
> | For example, if you set it to 250ms then all SQL statements that run 250ms 
> or longer will be logged

Fair enough, but I still don't like it much. I tried my hand at
rewriting this and came up with the attached:

+ Sets the amount of time after which the startup process will log
+ a message about a long-running operation that is still in progress,
+ as well as the interval between further progress messages for that
+ operation. This setting is applied separately to each operation.
+ For example, if syncing the data directory takes 25 seconds and
+ thereafter resetting unlogged relations takes 8 seconds, and if this
+ setting has the default value of 10 seconds, then a messages will be
+ logged for syncing the data directory after it has been in progress
+ for 10 seconds and again after it has been in progress for 20 seconds,
+ but nothing will be logged for resetting unlogged operations.
+ A setting of 0 disables the feature.

I prefer this to Nitin's version because I think it could be unclear
to someone that the value applies separately to each operation,
whereas I don't think we need to document that we can't guarantee that
the messages will be perfectly on time - that's true of every kind of
scheduled event in pretty much every computer system - or what happens
if the system clock goes backwards. Those are things we should try to
get right, as well as we can anyway, but we don't need to tell the
user that we got them right.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-09-27 Thread Robert Haas
On Mon, Sep 27, 2021 at 7:26 AM Nitin Jadhav
 wrote:
> > I really don't know what to say about this. You say that the time is
> > measured in milliseconds, and then immediately turn around and say
> > "For example, if you set it to 10s". Now we do expect that most people
> > will set it to intervals that are measured in seconds rather than
> > milliseconds, but saying that setting it to a value measured in
> > seconds is an example of setting it in milliseconds is not logical.
>
> Based on the statement "I suggest making the GUC GUC_UNIT_MS rather
> than GUC_UNIT_S, but expressing the default in postgresl.conf.sample
> as 10s rather than 1ms", I have used the default value in the
> postgresl.conf.sample as 10s rather than 1ms. So I just used the
> same value in the example too in config.sgml. If it is really getting
> confusing, I will change it to 100ms in config.sgml.

That's really not what I'm complaining about. I think if we're going
to give an example at all, 10ms is a better example than 100ms,
because 10s is a value that people are more likely to find useful. But
I'm not sure that it's necessary to mention a specific value, and if
it is, I think it needs to be phrased in a less confusing way.

> Made changes which indicate 0 mean disabled, > 0 mean interval in
> millisecond and removed -1.

Well, I see that -1 is now disallowed, and that's good as far as it
goes, but 0 still does not actually disable the feature. I don't
understand why you posted the previous version of the patch without
testing that it works, and I even less understand why you are posting
another version without fixing the bug that I pointed out to you in
the last version.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-09-27 Thread Justin Pryzby
On Mon, Sep 27, 2021 at 04:57:20PM +0530, Nitin Jadhav wrote:
> > It is also not logical to define 0 as meaning that "all messages for
> > such operations are logged". What does that even mean? It makes sense
> > for something like log_autovacuum_min_duration, because there we are
> > talking about logging one message per operation, and we could log
> > messages for all operations or just some of them. Here we are talking
> > about the time between one message and the next, so talking about "all
> > messages" does not really seem to make a lot of sense. Experimentally,
> > what 0 actually does is cause the system to spam log lines in a tight
> > loop, which cannot be what anyone wants. I think you should make 0
> > mean disabled, and a positive value mean log at that interval, and
> > disallow -1 altogether.
> 
> Made changes which indicate 0 mean disabled, > 0 mean interval in
> millisecond and removed -1.
> 
> Please find the patch attached.

I think you misunderstood - Robert was saying that interval=0 doesn't work, not
suggesting that you write more documentation about it.

Also, I agree with Robert that the documentation is too verbose.  I don't think
you need to talk about what happens if the clock goes backwards (It just needs
to behave conveniently).

Look at the other _duration statements for what they say about units.
"If this value is specified without units, it is taken as milliseconds."
https://www.postgresql.org/docs/14/runtime-config-logging.html
 log_autovacuum_min_duration
 log_min_duration_statement

>>It also looks pretty silly to say that if you set the value to 10s, something
>>will happen every 10s. What else would anyone expect to happen?

@Robert: that's consistent with existing documentation, even though it might
seem obvious and silly to us.

| For example, if you set this to 250ms then all automatic vacuums and analyzes 
that run 250ms or longer will be logged
| For example, if you set it to 250ms then all SQL statements that run 250ms or 
longer will be logged

-- 
Justin




Re: when the startup process doesn't (logging startup delays)

2021-09-27 Thread Nitin Jadhav
> I really don't know what to say about this. You say that the time is
> measured in milliseconds, and then immediately turn around and say
> "For example, if you set it to 10s". Now we do expect that most people
> will set it to intervals that are measured in seconds rather than
> milliseconds, but saying that setting it to a value measured in
> seconds is an example of setting it in milliseconds is not logical.

Based on the statement "I suggest making the GUC GUC_UNIT_MS rather
than GUC_UNIT_S, but expressing the default in postgresl.conf.sample
as 10s rather than 1ms", I have used the default value in the
postgresl.conf.sample as 10s rather than 1ms. So I just used the
same value in the example too in config.sgml. If it is really getting
confusing, I will change it to 100ms in config.sgml.

> It also looks pretty silly to say that if you set the value to 10s,
> something will happen every 10s. What else would anyone expect to
> happen? You really need to give some thought to how to explain the
> behavior in a way that is clear and logical but not overly wordy.

Added a few lines about that. "For example, if you set it to 1000ms,
then it tries to emit a log every 1000ms. If the log message is not
available at every 100th millisecond, then there is a possibility of
delay in emitting the log. If the delay is more than a cycle or if the
system clock gets set backwards then the next attempt is done based on
the last logging time, otherwise the delay gets adjusted in the next
attempt."

Please correct the explanation if it does not meet your expectations.

> Also, please note that you've got spaces around the literals, which
> does not match the surrounding style and does not render properly in
> HTML.

Fixed.

> It is also not logical to define 0 as meaning that "all messages for
> such operations are logged". What does that even mean? It makes sense
> for something like log_autovacuum_min_duration, because there we are
> talking about logging one message per operation, and we could log
> messages for all operations or just some of them. Here we are talking
> about the time between one message and the next, so talking about "all
> messages" does not really seem to make a lot of sense. Experimentally,
> what 0 actually does is cause the system to spam log lines in a tight
> loop, which cannot be what anyone wants. I think you should make 0
> mean disabled, and a positive value mean log at that interval, and
> disallow -1 altogether.

Made changes which indicate 0 mean disabled, > 0 mean interval in
millisecond and removed -1.

Please find the patch attached.



On Thu, Sep 23, 2021 at 9:44 PM Robert Haas  wrote:
>
> On Wed, Sep 22, 2021 at 10:28 AM Nitin Jadhav
>  wrote:
> > Thanks Justin for the detailed explanation.  Done the necessary changes.
>
> Not really. The documentation here does not make a ton of sense:
>
> + Sets the time interval between each progress update of the 
> operations
> + performed by the startup process. This produces the log messages for
> + those operations which take longer than the specified
> duration. The unit
> + used to specify the value is milliseconds. For example, if
> you set it to
> +  10s , then every  10s
> , a log is
> + emitted indicating which operation is ongoing, and the
> elapsed time from
> + the beginning of the operation. If the value is set to
>  0 ,
> + then all messages for such operations are logged. 
> -1 
> + disables the feature. The default value is  10s 
>
> I really don't know what to say about this. You say that the time is
> measured in milliseconds, and then immediately turn around and say
> "For example, if you set it to 10s". Now we do expect that most people
> will set it to intervals that are measured in seconds rather than
> milliseconds, but saying that setting it to a value measured in
> seconds is an example of setting it in milliseconds is not logical. It
> also looks pretty silly to say that if you set the value to 10s,
> something will happen every 10s. What else would anyone expect to
> happen? You really need to give some thought to how to explain the
> behavior in a way that is clear and logical but not overly wordy.
> Also, please note that you've got spaces around the literals, which
> does not match the surrounding style and does not render properly in
> HTML.
>
> It is also not logical to define 0 as meaning that "all messages for
> such operations are logged". What does that even mean? It makes sense
> for something like log_autovacuum_min_duration, because there we are
> talking about logging one message per operation, and we could log
> messages for all operations or just some of them. Here we are talking
> about the time between one message and the next, so talking about "all
> messages" does not really seem to make a lot of sense. Experimentally,
> what 0 actually does is cause the system to spam log lines in a tight
> loop, which cannot be what 

Re: when the startup process doesn't (logging startup delays)

2021-09-23 Thread Robert Haas
On Wed, Sep 22, 2021 at 10:28 AM Nitin Jadhav
 wrote:
> Thanks Justin for the detailed explanation.  Done the necessary changes.

Not really. The documentation here does not make a ton of sense:

+ Sets the time interval between each progress update of the operations
+ performed by the startup process. This produces the log messages for
+ those operations which take longer than the specified
duration. The unit
+ used to specify the value is milliseconds. For example, if
you set it to
+  10s , then every  10s
, a log is
+ emitted indicating which operation is ongoing, and the
elapsed time from
+ the beginning of the operation. If the value is set to
 0 ,
+ then all messages for such operations are logged. 
-1 
+ disables the feature. The default value is  10s 

I really don't know what to say about this. You say that the time is
measured in milliseconds, and then immediately turn around and say
"For example, if you set it to 10s". Now we do expect that most people
will set it to intervals that are measured in seconds rather than
milliseconds, but saying that setting it to a value measured in
seconds is an example of setting it in milliseconds is not logical. It
also looks pretty silly to say that if you set the value to 10s,
something will happen every 10s. What else would anyone expect to
happen? You really need to give some thought to how to explain the
behavior in a way that is clear and logical but not overly wordy.
Also, please note that you've got spaces around the literals, which
does not match the surrounding style and does not render properly in
HTML.

It is also not logical to define 0 as meaning that "all messages for
such operations are logged". What does that even mean? It makes sense
for something like log_autovacuum_min_duration, because there we are
talking about logging one message per operation, and we could log
messages for all operations or just some of them. Here we are talking
about the time between one message and the next, so talking about "all
messages" does not really seem to make a lot of sense. Experimentally,
what 0 actually does is cause the system to spam log lines in a tight
loop, which cannot be what anyone wants. I think you should make 0
mean disabled, and a positive value mean log at that interval, and
disallow -1 altogether.

And on that note, I tested your patch with
log_startup_progress_interval=-1 and found that -1 behaves just like
0. In other words, contrary to what the documentation says, -1 does
not disable the feature. It instead behaves just like 0. It's really
confusing to me how you write documentation that says -1 has a certain
behavior without thinking about the fact that you haven't written any
code that would make -1 behave that way. And apparently you didn't
test it, either. It took me approximately 1 minute of testing to find
that this is broken, which really is not very much.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-09-22 Thread Nitin Jadhav
> Michael is right.  You updated some of the units based on Robert's suggestion
> to use MS, but didn't update all of the corresponding parts of the patch.
> guc.c says that the units are in MS, which means that unqualified values are
> interpretted as such.  But postgresql.conf.sample still says "seconds", and
> guc.c says the default value is "10", and you still do:
>
> +   interval_in_ms = log_startup_progress_interval * 1000;
>
> I checked that this currently does not interpret the value as ms:
> |./tmp_install/usr/local/pgsql/bin/postgres -D 
> src/test/regress/tmp_check/data/ -c log_startup_progress_interval=1
> |2021-09-07 06:28:58.694 CDT startup[18865] LOG:  redo in progress, elapsed 
> time: 1.00 s, current LSN: 0/E94ED88
> |2021-09-07 06:28:59.694 CDT startup[18865] LOG:  redo in progress, elapsed 
> time: 2.00 s, current LSN: 0/10808EE0
> |2021-09-07 06:29:00.694 CDT startup[18865] LOG:  redo in progress, elapsed 
> time: 3.00 s, current LSN: 0/126B8C80
>
> (Also, the GUC value is in the range 0..INT_MAX, so multiplying and storing to
> another int could overflow.)
>
> I think the convention is to for GUC global vars to be initialized with the
> same default as in guc.c, so both should be 1, like:
>
> +int log_startup_progress_interval = 10 * 1000 /* 10sec */

Thanks Justin for the detailed explanation.  Done the necessary changes.

Please find the updated patch attached.


On Mon, Sep 13, 2021 at 8:32 PM Nitin Jadhav
 wrote:
>
> > I think you're confusing discussions.
> >
> > Robert was talking about initdb/bootstrap/single, and I separately and
> > independently asked about hot standbys.  It seems like Robert and I agreed
> > about the desired behavior and there was no further discussion.
>
> Sorry for including 2 separate points into one.
>
> > Looking over this version, I realized something I (or you) should have
> > noticed sooner: you've added the RegisterTimeout call to
> > InitPostgres(), but that's for things that are used by all backends,
> > and this is only used by the startup process. So it seems to me that
> > the right place is StartupProcessMain. That would have the further
> > advantage of allowing startup_progress_timeout_handler to be made
> > static. begin_startup_progress_phase() and
> > startup_progress_timeout_has_expired() are the actual API functions
> > though so they will need to remain extern.
>
> The earlier discussion wrt this point is as follows.
>
> > > I also agree that this is the better place to do it. Hence modified
> > > the patch accordingly. The condition "!AmStartupProcess()" is added to
> > > differentiate whether the call is done from a startup process or some
> > > other process. Actually StartupXLOG() gets called in 2 places. one in
> > > StartupProcessMain() and the other in InitPostgres(). As the logging
> > > of the startup progress is required only during the startup process
> > > and not in the other cases,
> >
> > The InitPostgres() case occurs when the server is started in bootstrap
> > mode (during initdb) or in single-user mode (postgres --single). I do
> > not see any reason why we shouldn't produce progress messages in at
> > least the latter case. I suspect that someone who is in the rather
> > desperate scenario of having to use single-user mode would really like
> > to know how long the server is going to take to start up.
> >
> > Perhaps during initdb we don't want messages, but I'm not sure that we
> > need to do anything about that here. None of the messages that the
> > server normally produces show up when you run initdb, so I guess they
> > are getting redirected to /dev/null or something.
> >
> > So I don't think that using AmStartupProcess() for this purpose is right.
>
> This point is really confusing. As per the earlier discussion we
> concluded to include RegisterTimeout() call even in case of
> InitPostgres() to support logging in case of single-user mode. Now if
> we remove the RegisterTimeout() call from InitPostgres(), we are not
> going to support that anymore. Is this what you're trying to convey?
> or we should add some checks and disable the code to RegisterTimeout()
> if it is other than single-user mode. I have added a check if
> (!IsPostmasterEnvironment) in the attached patch for this scenario.
> Kindly confirm my understanding.
>
> > > Should this feature distinguish between crash recovery and archive 
> > > recovery on
> > > a hot standby ?  Otherwise the standby will display this all the time.
> > >
> > >2021-08-14 16:13:33.139 CDT startup[11741] LOG:  redo in progress, elapsed 
> > >time: 124.42 s, current LSN: 0/EEE2100
> > >
> > >If so, I think maybe you'd check !InArchiveRecovery (but until Robert 
> > >finishes
> > > cleanup of xlog.c variables, I can't say that with much confidence).
> >
> > Hmm. My inclination is to think that on an actual standby, you
> > wouldn't want to get messages like this, but if you were doing a
> > point-in-time-recovery, then you would. So I think maybe
> > 

Re: when the startup process doesn't (logging startup delays)

2021-09-13 Thread Nitin Jadhav
> I think you're confusing discussions.
>
> Robert was talking about initdb/bootstrap/single, and I separately and
> independently asked about hot standbys.  It seems like Robert and I agreed
> about the desired behavior and there was no further discussion.

Sorry for including 2 separate points into one.

> Looking over this version, I realized something I (or you) should have
> noticed sooner: you've added the RegisterTimeout call to
> InitPostgres(), but that's for things that are used by all backends,
> and this is only used by the startup process. So it seems to me that
> the right place is StartupProcessMain. That would have the further
> advantage of allowing startup_progress_timeout_handler to be made
> static. begin_startup_progress_phase() and
> startup_progress_timeout_has_expired() are the actual API functions
> though so they will need to remain extern.

The earlier discussion wrt this point is as follows.

> > I also agree that this is the better place to do it. Hence modified
> > the patch accordingly. The condition "!AmStartupProcess()" is added to
> > differentiate whether the call is done from a startup process or some
> > other process. Actually StartupXLOG() gets called in 2 places. one in
> > StartupProcessMain() and the other in InitPostgres(). As the logging
> > of the startup progress is required only during the startup process
> > and not in the other cases,
>
> The InitPostgres() case occurs when the server is started in bootstrap
> mode (during initdb) or in single-user mode (postgres --single). I do
> not see any reason why we shouldn't produce progress messages in at
> least the latter case. I suspect that someone who is in the rather
> desperate scenario of having to use single-user mode would really like
> to know how long the server is going to take to start up.
>
> Perhaps during initdb we don't want messages, but I'm not sure that we
> need to do anything about that here. None of the messages that the
> server normally produces show up when you run initdb, so I guess they
> are getting redirected to /dev/null or something.
>
> So I don't think that using AmStartupProcess() for this purpose is right.

This point is really confusing. As per the earlier discussion we
concluded to include RegisterTimeout() call even in case of
InitPostgres() to support logging in case of single-user mode. Now if
we remove the RegisterTimeout() call from InitPostgres(), we are not
going to support that anymore. Is this what you're trying to convey?
or we should add some checks and disable the code to RegisterTimeout()
if it is other than single-user mode. I have added a check if
(!IsPostmasterEnvironment) in the attached patch for this scenario.
Kindly confirm my understanding.

> > Should this feature distinguish between crash recovery and archive recovery 
> > on
> > a hot standby ?  Otherwise the standby will display this all the time.
> >
> >2021-08-14 16:13:33.139 CDT startup[11741] LOG:  redo in progress, elapsed 
> >time: 124.42 s, current LSN: 0/EEE2100
> >
> >If so, I think maybe you'd check !InArchiveRecovery (but until Robert 
> >finishes
> > cleanup of xlog.c variables, I can't say that with much confidence).
>
> Hmm. My inclination is to think that on an actual standby, you
> wouldn't want to get messages like this, but if you were doing a
> point-in-time-recovery, then you would. So I think maybe
> InArchiveRecovery is not the right thing. Perhaps StandbyMode?

I also feel that the log messages should be recorded in case of
point-in-time-recovery. So I have added a check if (!StandbyMode) and
verified the replication and point-in-time-recovery scenario.

> > Some doc comments:
>
> Thanks for the suggestions. I will take care in the next patch.

Fixed.

> Michael is right.  You updated some of the units based on Robert's suggestion
> to use MS, but didn't update all of the corresponding parts of the patch.
> guc.c says that the units are in MS, which means that unqualified values are
> interpretted as such.  But postgresql.conf.sample still says "seconds", and
> guc.c says the default value is "10", and you still do:
>
> +   interval_in_ms = log_startup_progress_interval * 1000;
>
> I checked that this currently does not interpret the value as ms:
> |./tmp_install/usr/local/pgsql/bin/postgres -D 
> src/test/regress/tmp_check/data/ -c log_startup_progress_interval=1
> |2021-09-07 06:28:58.694 CDT startup[18865] LOG:  redo in progress, elapsed 
> time: 1.00 s, current LSN: 0/E94ED88
> |2021-09-07 06:28:59.694 CDT startup[18865] LOG:  redo in progress, elapsed 
> time: 2.00 s, current LSN: 0/10808EE0
> |2021-09-07 06:29:00.694 CDT startup[18865] LOG:  redo in progress, elapsed 
> time: 3.00 s, current LSN: 0/126B8C80
>
> (Also, the GUC value is in the range 0..INT_MAX, so multiplying and storing to
> another int could overflow.)
>
> I think the convention is to for GUC global vars to be initialized with the
> same default as in guc.c, so both should be 1, like:
>
> +int 

Re: when the startup process doesn't (logging startup delays)

2021-09-07 Thread Justin Pryzby
On Tue, Sep 07, 2021 at 03:07:15PM +0530, Nitin Jadhav wrote:
> > Looking over this version, I realized something I (or you) should have
> > noticed sooner: you've added the RegisterTimeout call to
> > InitPostgres(), but that's for things that are used by all backends,
> > and this is only used by the startup process. So it seems to me that
> > the right place is StartupProcessMain. That would have the further
> > advantage of allowing startup_progress_timeout_handler to be made
> > static. begin_startup_progress_phase() and
> > startup_progress_timeout_has_expired() are the actual API functions
> > though so they will need to remain extern.
> >
> > I agree with Robert that a standby server should not continuously show 
> > timing
> > messages for WAL replay.
> 
> The earlier discussion wrt this point is as follows.

I think you're confusing discussions.

Robert was talking about initdb/bootstrap/single, and I separately and
independently asked about hot standbys.  It seems like Robert and I agreed
about the desired behavior and there was no further discussion.

> > Honestly, I don't see why we should have
> > a GUC for what's proposed here.  A value too low would bloat the logs
> > with entries that are not that meaningful.  A value too large would
> > just prevent access to some information wanted.  Wouldn't it be better
> > to just pick up a value like 10s or 20s?

I don't think bloating logs is a issue for values > 10sec.

You agreed that it's important to choose the "right" value, but I think that
will vary between users.  Some installations with large checkpoint_timeout
might anticipate taking 15+min to perform recovery, but others might even have
a strict requirement that recovery must not take more than (say) 10sec; someone
might want to use this to verify that, or to optimize the slow parts of
recovery, with an interval that someone else might not care about.

> > +   {"log_startup_progress_interval", PGC_SIGHUP, LOGGING_WHEN,
> > +   gettext_noop("Sets the time interval between each progress 
> > update "
> > +"of the startup process."),
> > +   gettext_noop("0 logs all messages. -1 turns this feature off."),
> > +   GUC_UNIT_MS,
|+   10, -1, INT_MAX,
> > The unit is incorrect here, as that would default to 10ms, contrary to
> > what the documentation says about 10s.
> 
> Kindly refer the previous few discussions wrt this point.

You copied and pasted unrelated emails, which isn't helpful.

Michael is right.  You updated some of the units based on Robert's suggestion
to use MS, but didn't update all of the corresponding parts of the patch.
guc.c says that the units are in MS, which means that unqualified values are
interpretted as such.  But postgresql.conf.sample still says "seconds", and
guc.c says the default value is "10", and you still do:

+   interval_in_ms = log_startup_progress_interval * 1000;

I checked that this currently does not interpret the value as ms:
|./tmp_install/usr/local/pgsql/bin/postgres -D src/test/regress/tmp_check/data/ 
-c log_startup_progress_interval=1
|2021-09-07 06:28:58.694 CDT startup[18865] LOG:  redo in progress, elapsed 
time: 1.00 s, current LSN: 0/E94ED88
|2021-09-07 06:28:59.694 CDT startup[18865] LOG:  redo in progress, elapsed 
time: 2.00 s, current LSN: 0/10808EE0
|2021-09-07 06:29:00.694 CDT startup[18865] LOG:  redo in progress, elapsed 
time: 3.00 s, current LSN: 0/126B8C80

(Also, the GUC value is in the range 0..INT_MAX, so multiplying and storing to
another int could overflow.)

I think the convention is to for GUC global vars to be initialized with the
same default as in guc.c, so both should be 1, like:

+int log_startup_progress_interval = 10 * 1000 /* 10sec */

-- 
Justin




Re: when the startup process doesn't (logging startup delays)

2021-09-07 Thread Nitin Jadhav
> Looking over this version, I realized something I (or you) should have
> noticed sooner: you've added the RegisterTimeout call to
> InitPostgres(), but that's for things that are used by all backends,
> and this is only used by the startup process. So it seems to me that
> the right place is StartupProcessMain. That would have the further
> advantage of allowing startup_progress_timeout_handler to be made
> static. begin_startup_progress_phase() and
> startup_progress_timeout_has_expired() are the actual API functions
> though so they will need to remain extern.
>
> I agree with Robert that a standby server should not continuously show timing
> messages for WAL replay.

The earlier discussion wrt this point is as follows.

> > I also agree that this is the better place to do it. Hence modified
> > the patch accordingly. The condition "!AmStartupProcess()" is added to
> > differentiate whether the call is done from a startup process or some
> > other process. Actually StartupXLOG() gets called in 2 places. one in
> > StartupProcessMain() and the other in InitPostgres(). As the logging
> > of the startup progress is required only during the startup process
> > and not in the other cases,
>
> The InitPostgres() case occurs when the server is started in bootstrap
> mode (during initdb) or in single-user mode (postgres --single). I do
> not see any reason why we shouldn't produce progress messages in at
> least the latter case. I suspect that someone who is in the rather
> desperate scenario of having to use single-user mode would really like
> to know how long the server is going to take to start up.
>
> Perhaps during initdb we don't want messages, but I'm not sure that we
> need to do anything about that here. None of the messages that the
> server normally produces show up when you run initdb, so I guess they
> are getting redirected to /dev/null or something.
>
> So I don't think that using AmStartupProcess() for this purpose is right.

So as per the recent discussion, RegisterTimeout call should be
removed from InitPostgres() and the condition "!AmStartupProcess()" is
to be added in begin_startup_progress_phase() and
ereport_startup_progress() to differentiate whether the call is from a
startup process or some other process. Kindly correct me if I am
wrong.

> Some doc comments:

Thanks for the suggestions. I will take care in the next patch.

> Clearly.  This should be limited to crash recovery, and maybe there
> could be some checks to make sure that nothing is logged once a
> consistent point is reached.

The purpose here is to show the progress of the operation if it is
taking longer than the interval set by the user until it completes the
operation. Users should know what operation is happening in the
background and to show the progress, displaying the elapsed time. So
according to me the consistent point is nothing but the end of the
operation. Kindly let me know if you have something in mind and that
could be the better consistent point.

> Honestly, I don't see why we should have
> a GUC for what's proposed here.  A value too low would bloat the logs
> with entries that are not that meaningful.  A value too large would
> just prevent access to some information wanted.  Wouldn't it be better
> to just pick up a value like 10s or 20s?

It is difficult to finalise the value and use that value without
providing an option to change. If we choose one value (say 10s), it
may be too less for some users or too large for some other users. So I
feel it is better to provide an option to users so that they can
choose the value according to their need. Anyway the default value set
for this setting is 10s.

> +   {"log_startup_progress_interval", PGC_SIGHUP, LOGGING_WHEN,
> +   gettext_noop("Sets the time interval between each progress update 
> "
> +"of the startup process."),
> +   gettext_noop("0 logs all messages. -1 turns this feature off."),
> +   GUC_UNIT_MS,
> The unit is incorrect here, as that would default to 10ms, contrary to
> what the documentation says about 10s.

Kindly refer the previous few discussions wrt this point.

On Tue, Sep 7, 2021 at 10:58 AM Michael Paquier  wrote:
>
> On Fri, Sep 03, 2021 at 09:23:27PM -0500, Justin Pryzby wrote:
> > On Fri, Sep 03, 2021 at 01:23:56PM +0530, Nitin Jadhav wrote:
> > > Please find the updated patch attached.
> >
> > Please check 
> > CA+TgmoZtbqxaOLdpNkBcDbz=41tWALA8kpH4M=rwtpyhc7-...@mail.gmail.com
> >
> > I agree with Robert that a standby server should not continuously show 
> > timing
> > messages for WAL replay.
>
> Clearly.  This should be limited to crash recovery, and maybe there
> could be some checks to make sure that nothing is logged once a
> consistent point is reached.  Honestly, I don't see why we should have
> a GUC for what's proposed here.  A value too low would bloat the logs
> with entries that are not that meaningful.  A value too large would
> just prevent access to some 

Re: when the startup process doesn't (logging startup delays)

2021-09-07 Thread Nitin Jadhav
> I also agree that this is the better place to do it. Hence modified
> the patch accordingly. The condition "!AmStartupProcess()" is added to
> differentiate whether the call is done from a startup process or some
> other process. Actually StartupXLOG() gets called in 2 places. one in
> StartupProcessMain() and the other in InitPostgres(). As the logging
> of the startup progress is required only during the startup process
> and not in the other cases,

The InitPostgres() case occurs when the server is started in bootstrap
mode (during initdb) or in single-user mode (postgres --single). I do
not see any reason why we shouldn't produce progress messages in at
least the latter case. I suspect that someone who is in the rather
desperate scenario of having to use single-user mode would really like
to know how long the server is going to take to start up.

Perhaps during initdb we don't want messages, but I'm not sure that we
need to do anything about that here. None of the messages that the
server normally produces show up when you run initdb, so I guess they
are getting redirected to /dev/null or something.

So I don't think that using AmStartupProcess() for this purpose is right.

On Tue, Sep 7, 2021 at 10:58 AM Michael Paquier  wrote:
>
> On Fri, Sep 03, 2021 at 09:23:27PM -0500, Justin Pryzby wrote:
> > On Fri, Sep 03, 2021 at 01:23:56PM +0530, Nitin Jadhav wrote:
> > > Please find the updated patch attached.
> >
> > Please check 
> > CA+TgmoZtbqxaOLdpNkBcDbz=41tWALA8kpH4M=rwtpyhc7-...@mail.gmail.com
> >
> > I agree with Robert that a standby server should not continuously show 
> > timing
> > messages for WAL replay.
>
> Clearly.  This should be limited to crash recovery, and maybe there
> could be some checks to make sure that nothing is logged once a
> consistent point is reached.  Honestly, I don't see why we should have
> a GUC for what's proposed here.  A value too low would bloat the logs
> with entries that are not that meaningful.  A value too large would
> just prevent access to some information wanted.  Wouldn't it be better
> to just pick up a value like 10s or 20s?
>
> Looking at v13..
>
> +   {"log_startup_progress_interval", PGC_SIGHUP, LOGGING_WHEN,
> +   gettext_noop("Sets the time interval between each progress update 
> "
> +"of the startup process."),
> +   gettext_noop("0 logs all messages. -1 turns this feature off."),
> +   GUC_UNIT_MS,
> The unit is incorrect here, as that would default to 10ms, contrary to
> what the documentation says about 10s.
> --
> Michael




Re: when the startup process doesn't (logging startup delays)

2021-09-06 Thread Michael Paquier
On Fri, Sep 03, 2021 at 09:23:27PM -0500, Justin Pryzby wrote:
> On Fri, Sep 03, 2021 at 01:23:56PM +0530, Nitin Jadhav wrote:
> > Please find the updated patch attached.
> 
> Please check 
> CA+TgmoZtbqxaOLdpNkBcDbz=41tWALA8kpH4M=rwtpyhc7-...@mail.gmail.com
> 
> I agree with Robert that a standby server should not continuously show timing
> messages for WAL replay.

Clearly.  This should be limited to crash recovery, and maybe there
could be some checks to make sure that nothing is logged once a
consistent point is reached.  Honestly, I don't see why we should have
a GUC for what's proposed here.  A value too low would bloat the logs
with entries that are not that meaningful.  A value too large would
just prevent access to some information wanted.  Wouldn't it be better
to just pick up a value like 10s or 20s?

Looking at v13..

+   {"log_startup_progress_interval", PGC_SIGHUP, LOGGING_WHEN,
+   gettext_noop("Sets the time interval between each progress update "
+"of the startup process."),
+   gettext_noop("0 logs all messages. -1 turns this feature off."),
+   GUC_UNIT_MS,
The unit is incorrect here, as that would default to 10ms, contrary to
what the documentation says about 10s.
--
Michael


signature.asc
Description: PGP signature


Re: when the startup process doesn't (logging startup delays)

2021-09-03 Thread Justin Pryzby
On Fri, Sep 03, 2021 at 01:23:56PM +0530, Nitin Jadhav wrote:
> Please find the updated patch attached.

Please check CA+TgmoZtbqxaOLdpNkBcDbz=41tWALA8kpH4M=rwtpyhc7-...@mail.gmail.com

I agree with Robert that a standby server should not continuously show timing
messages for WAL replay.

Some doc comments:

+ Sets the time interval between each progress update of the operations
+ performed during startup process. This produces the log message after

Either say "performed by the startup process" or "performed during startup".

s/the/a/

+ every interval of time for the operations that take longer time. The 
unit

..for those operations which take longer than the specified duration.

+ used to specify the value is seconds. For example, if you set it to
+  10s , then after every  10s  
there

remove "after"

+ is a log message indicating which operation is going on and what is 
the

say "..every 10s, a log is emitted indicating which operation is ongoing, and
the elapsed time from the beginning of the operation.."

+ elapsed time from beginning. If the value is set to  0 
,
+ then it logs all the available messages for such operations. 
 -1

"..then all messages for such operations are logged."

+  disables the feature. The default value is set to 
 10s
+ 

"The default value is >10s<."




Re: when the startup process doesn't (logging startup delays)

2021-09-03 Thread Nitin Jadhav
> > Anything that's not used in other files should be declared static in
> > the file itself, and not present in the header. Once you fix this for
> > reset_startup_progress_timeout, the header won't need to include
> > datatype/timestamp.h any more, which is good, because we don't want
> > header files to depend on more other header files than necessary.
>
> Thanks for identifying this. I will take care in the next patch.

Fixed.

> > This hunk should be removed.
>
> I will remove it in the next patch.

Removed.

Please find the updated patch attached.

On Wed, Aug 18, 2021 at 12:23 PM Nitin Jadhav
 wrote:
>
> > Anything that's not used in other files should be declared static in
> > the file itself, and not present in the header. Once you fix this for
> > reset_startup_progress_timeout, the header won't need to include
> > datatype/timestamp.h any more, which is good, because we don't want
> > header files to depend on more other header files than necessary.
>
> Thanks for identifying this. I will take care in the next patch.
>
> > Looking over this version, I realized something I (or you) should have
> > noticed sooner: you've added the RegisterTimeout call to
> > InitPostgres(), but that's for things that are used by all backends,
> > and this is only used by the startup process. So it seems to me that
> > the right place is StartupProcessMain. That would have the further
> > advantage of allowing startup_progress_timeout_handler to be made
> > static. begin_startup_progress_phase() and
> > startup_progress_timeout_has_expired() are the actual API functions
> > though so they will need to remain extern.
>
> Yes. I had noticed this earlier and the RegisterTimeout() call was
> only present in StartupProcessMain() and not in InitPostgres() in the
> earlier versions (v7) of the patch. Since StartupXLOG() gets called in
> the 2 places, I had restricted the InitPostgres() flow by checking for
> the !AmStartupProcess() in the newly added functions. But later we had
> discussion and concluded to add the RegisterTimeout() call even in
> case of InitPostgres(). Kindly refer to the discussion just after the
> v7 patch in this thread and let me know your thoughts.
>
> > This hunk should be removed.
>
> I will remove it in the next patch.
>
> On Tue, Aug 17, 2021 at 1:08 AM Robert Haas  wrote:
> >
> > On Sat, Aug 14, 2021 at 5:47 PM Justin Pryzby  wrote:
> > > Should this feature distinguish between crash recovery and archive 
> > > recovery on
> > > a hot standby ?  Otherwise the standby will display this all the time.
> > >
> > > 2021-08-14 16:13:33.139 CDT startup[11741] LOG:  redo in progress, 
> > > elapsed time: 124.42 s, current LSN: 0/EEE2100
> > >
> > > If so, I think maybe you'd check !InArchiveRecovery (but until Robert 
> > > finishes
> > > cleanup of xlog.c variables, I can't say that with much confidence).
> >
> > Hmm. My inclination is to think that on an actual standby, you
> > wouldn't want to get messages like this, but if you were doing a
> > point-in-time-recovery, then you would. So I think maybe
> > InArchiveRecovery is not the right thing. Perhaps StandbyMode?
> >
> > --
> > Robert Haas
> > EDB: http://www.enterprisedb.com


v13-0001-startup-process-progress.patch
Description: Binary data


Re: when the startup process doesn't (logging startup delays)

2021-08-18 Thread Nitin Jadhav
> Anything that's not used in other files should be declared static in
> the file itself, and not present in the header. Once you fix this for
> reset_startup_progress_timeout, the header won't need to include
> datatype/timestamp.h any more, which is good, because we don't want
> header files to depend on more other header files than necessary.

Thanks for identifying this. I will take care in the next patch.

> Looking over this version, I realized something I (or you) should have
> noticed sooner: you've added the RegisterTimeout call to
> InitPostgres(), but that's for things that are used by all backends,
> and this is only used by the startup process. So it seems to me that
> the right place is StartupProcessMain. That would have the further
> advantage of allowing startup_progress_timeout_handler to be made
> static. begin_startup_progress_phase() and
> startup_progress_timeout_has_expired() are the actual API functions
> though so they will need to remain extern.

Yes. I had noticed this earlier and the RegisterTimeout() call was
only present in StartupProcessMain() and not in InitPostgres() in the
earlier versions (v7) of the patch. Since StartupXLOG() gets called in
the 2 places, I had restricted the InitPostgres() flow by checking for
the !AmStartupProcess() in the newly added functions. But later we had
discussion and concluded to add the RegisterTimeout() call even in
case of InitPostgres(). Kindly refer to the discussion just after the
v7 patch in this thread and let me know your thoughts.

> This hunk should be removed.

I will remove it in the next patch.

On Tue, Aug 17, 2021 at 1:08 AM Robert Haas  wrote:
>
> On Sat, Aug 14, 2021 at 5:47 PM Justin Pryzby  wrote:
> > Should this feature distinguish between crash recovery and archive recovery 
> > on
> > a hot standby ?  Otherwise the standby will display this all the time.
> >
> > 2021-08-14 16:13:33.139 CDT startup[11741] LOG:  redo in progress, elapsed 
> > time: 124.42 s, current LSN: 0/EEE2100
> >
> > If so, I think maybe you'd check !InArchiveRecovery (but until Robert 
> > finishes
> > cleanup of xlog.c variables, I can't say that with much confidence).
>
> Hmm. My inclination is to think that on an actual standby, you
> wouldn't want to get messages like this, but if you were doing a
> point-in-time-recovery, then you would. So I think maybe
> InArchiveRecovery is not the right thing. Perhaps StandbyMode?
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-08-16 Thread Robert Haas
On Sat, Aug 14, 2021 at 5:47 PM Justin Pryzby  wrote:
> Should this feature distinguish between crash recovery and archive recovery on
> a hot standby ?  Otherwise the standby will display this all the time.
>
> 2021-08-14 16:13:33.139 CDT startup[11741] LOG:  redo in progress, elapsed 
> time: 124.42 s, current LSN: 0/EEE2100
>
> If so, I think maybe you'd check !InArchiveRecovery (but until Robert finishes
> cleanup of xlog.c variables, I can't say that with much confidence).

Hmm. My inclination is to think that on an actual standby, you
wouldn't want to get messages like this, but if you were doing a
point-in-time-recovery, then you would. So I think maybe
InArchiveRecovery is not the right thing. Perhaps StandbyMode?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-08-14 Thread Justin Pryzby
Should this feature distinguish between crash recovery and archive recovery on
a hot standby ?  Otherwise the standby will display this all the time.

2021-08-14 16:13:33.139 CDT startup[11741] LOG:  redo in progress, elapsed 
time: 124.42 s, current LSN: 0/EEE2100

If so, I think maybe you'd check !InArchiveRecovery (but until Robert finishes
cleanup of xlog.c variables, I can't say that with much confidence).




Re: when the startup process doesn't (logging startup delays)

2021-08-12 Thread Robert Haas
On Thu, Aug 12, 2021 at 7:40 AM Nitin Jadhav
 wrote:
> > I suggest making the GUC GUC_UNIT_MS rather than GUC_UNIT_S, but
> > expressing the default in postgresl.conf.sample as 10s rather than
> > 1ms. I tried values measured in milliseconds just for testing
> > purposes and didn't initially understand why it wasn't working. I
> > don't think there's any reason it can't work.
>
> As suggested, I have changed it to GUC_UNIT_MS and kept the default
> value to 10s. I would like to know the reason why it can't be
> GUC_UNIT_S as we are expressing the values in terms of seconds.

I mean, it *could* be. There's just no advantage. Values in seconds
will work correctly either way. But values in milliseconds will only
work if it's GUC_UNIT_MS. It seems to me that it's better to make more
things work rather than fewer.

> > There's no precedent in the tree for the use of ##__VA_ARGS__. On my
> > system it seems to work fine if I just leave out the ##. Any reason
> > not to do that?
>
> I had added this to support if no variable argument are passed to the
> macro. For example, in the previous patches we used to log the
> progress at the end of the operation like
> "ereport_startup_progress(true, "data directory sync (syncfs) complete
> after %ld.%02d s");". Since these calls are removed now, ## is not
> required. Hence removed in the attached patch.

Hmm, OK. That's actually a pretty good reason for using ## there. It
just made me nervous because we have no similar uses of ## in the
backend code. We rely on it elsewhere for concatenation, but not for
comma removal. Let's try leaving it out for now unless somebody else
shows up with a different opinion.

> I had not added extern since those function were not used in the other
> files. Now added to match the project style.

Anything that's not used in other files should be declared static in
the file itself, and not present in the header. Once you fix this for
reset_startup_progress_timeout, the header won't need to include
datatype/timestamp.h any more, which is good, because we don't want
header files to depend on more other header files than necessary.

Looking over this version, I realized something I (or you) should have
noticed sooner: you've added the RegisterTimeout call to
InitPostgres(), but that's for things that are used by all backends,
and this is only used by the startup process. So it seems to me that
the right place is StartupProcessMain. That would have the further
advantage of allowing startup_progress_timeout_handler to be made
static. begin_startup_progress_phase() and
startup_progress_timeout_has_expired() are the actual API functions
though so they will need to remain extern.

@@ -679,7 +680,6 @@ static char *recovery_target_lsn_string;
 /* should be static, but commands/variable.c needs to get at this */
 char  *role_string;

-
 /*
  * Displayable names for context types (enum GucContext)
  *

This hunk should be removed.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-08-12 Thread Nitin Jadhav
> startup_progress_timer_expired should be declared as sig_atomic_t like
> we do in other cases (see interrupt.c).

Fixed.

> The default value of the new GUC is 10s in postgresql.conf.sample, but
> -1 in guc.c. They should both be 10s, and the one in
> postgresql.conf.sample should be commented out.

Fixed.

> I suggest making the GUC GUC_UNIT_MS rather than GUC_UNIT_S, but
> expressing the default in postgresl.conf.sample as 10s rather than
> 1ms. I tried values measured in milliseconds just for testing
> purposes and didn't initially understand why it wasn't working. I
> don't think there's any reason it can't work.

As suggested, I have changed it to GUC_UNIT_MS and kept the default
value to 10s. I would like to know the reason why it can't be
GUC_UNIT_S as we are expressing the values in terms of seconds.

> I would prefer to see log_startup_progress_interval declared and
> defined in startup.c/startup.h rather than guc.c/guc.h.

Fixed.

> There's no precedent in the tree for the use of ##__VA_ARGS__. On my
> system it seems to work fine if I just leave out the ##. Any reason
> not to do that?

I had added this to support if no variable argument are passed to the
macro. For example, in the previous patches we used to log the
progress at the end of the operation like
"ereport_startup_progress(true, "data directory sync (syncfs) complete
after %ld.%02d s");". Since these calls are removed now, ## is not
required. Hence removed in the attached patch.

> Two of the declarations in startup.h forgot the leading "extern",
> while the other two that are right next to them have it, matching
> project style.

I had not added extern since those function were not used in the other
files. Now added to match the project style.

> I'm reasonably happy with most of the identifier names now, but I
> think init_startup_progress() is confusing. The reason I think that is
> that we call it more than once, which is not really what people think
> about when they think of an "init" function, I think. It's not
> initializing the startup progress facility in general; it's preparing
> for the next phase of startup progress reporting. How about renaming
> it to begin_startup_progress_phase()? And then
> startup_process_op_start_time could be
> startup_process_phase_start_time to match.

Yes begin_startup_progress_phase() looks more appropriate. Instead of
startup_process_phase_start_time, startup_progress_phase_start_time
looks more appropriate. Changed these in the attached patch.

> SyncDataDirectory() potentially walks over the data directory three
> times: once to call do_syncfs(), once to call pre_sync_fname(), and
> once to call datadir_fsync_fname(). With this patch, the first and
> third will emit progress messages if the operation runs long, but the
> second will not. I think they should all be treated the same. Also,
> the locations where you've inserted calls to init_startup_progress()
> don't really make it clear with what code that's associated. I'd put
> them *immediately* before the call to do_syncfs() or walkdir().

Fixed.

> Remember that PostgreSQL comments are typically written "telegraph
> style," so function comments should say "Does whatever" not "It does
> whatever". Most of them are correct, but there's one sentence you need
> to fix.

Fixed in the function comments of
startup_progress_timeout_has_expired(). Please let me now if this is
not the one you wanted me to correct.

> You mentioned previously that you would add documentation, but I do
> not see it here.

Sorry. I missed this. I have added the documentation in the attached patch.
On Tue, Aug 10, 2021 at 8:56 PM Robert Haas  wrote:
>
> On Tue, Aug 10, 2021 at 9:28 AM Nitin Jadhav
>  wrote:
> > Please find the updated patch attached.
>
> I think this is getting close. The output looks nice. However, I still
> see a bunch of issues.
>
> You mentioned previously that you would add documentation, but I do
> not see it here.
>
> startup_progress_timer_expired should be declared as sig_atomic_t like
> we do in other cases (see interrupt.c).
>
> The default value of the new GUC is 10s in postgresql.conf.sample, but
> -1 in guc.c. They should both be 10s, and the one in
> postgresql.conf.sample should be commented out.
>
> I suggest making the GUC GUC_UNIT_MS rather than GUC_UNIT_S, but
> expressing the default in postgresl.conf.sample as 10s rather than
> 1ms. I tried values measured in milliseconds just for testing
> purposes and didn't initially understand why it wasn't working. I
> don't think there's any reason it can't work.
>
> I would prefer to see log_startup_progress_interval declared and
> defined in startup.c/startup.h rather than guc.c/guc.h.
>
> There's no precedent in the tree for the use of ##__VA_ARGS__. On my
> system it seems to work fine if I just leave out the ##. Any reason
> not to do that?
>
> Two of the declarations in startup.h forgot the leading "extern",
> while the other two that are right next to them have it, 

Re: when the startup process doesn't (logging startup delays)

2021-08-10 Thread Robert Haas
On Tue, Aug 10, 2021 at 9:28 AM Nitin Jadhav
 wrote:
> Please find the updated patch attached.

I think this is getting close. The output looks nice. However, I still
see a bunch of issues.

You mentioned previously that you would add documentation, but I do
not see it here.

startup_progress_timer_expired should be declared as sig_atomic_t like
we do in other cases (see interrupt.c).

The default value of the new GUC is 10s in postgresql.conf.sample, but
-1 in guc.c. They should both be 10s, and the one in
postgresql.conf.sample should be commented out.

I suggest making the GUC GUC_UNIT_MS rather than GUC_UNIT_S, but
expressing the default in postgresl.conf.sample as 10s rather than
1ms. I tried values measured in milliseconds just for testing
purposes and didn't initially understand why it wasn't working. I
don't think there's any reason it can't work.

I would prefer to see log_startup_progress_interval declared and
defined in startup.c/startup.h rather than guc.c/guc.h.

There's no precedent in the tree for the use of ##__VA_ARGS__. On my
system it seems to work fine if I just leave out the ##. Any reason
not to do that?

Two of the declarations in startup.h forgot the leading "extern",
while the other two that are right next to them have it, matching
project style.

I'm reasonably happy with most of the identifier names now, but I
think init_startup_progress() is confusing. The reason I think that is
that we call it more than once, which is not really what people think
about when they think of an "init" function, I think. It's not
initializing the startup progress facility in general; it's preparing
for the next phase of startup progress reporting. How about renaming
it to begin_startup_progress_phase()? And then
startup_process_op_start_time could be
startup_process_phase_start_time to match.

SyncDataDirectory() potentially walks over the data directory three
times: once to call do_syncfs(), once to call pre_sync_fname(), and
once to call datadir_fsync_fname(). With this patch, the first and
third will emit progress messages if the operation runs long, but the
second will not. I think they should all be treated the same. Also,
the locations where you've inserted calls to init_startup_progress()
don't really make it clear with what code that's associated. I'd put
them *immediately* before the call to do_syncfs() or walkdir().

Remember that PostgreSQL comments are typically written "telegraph
style," so function comments should say "Does whatever" not "It does
whatever". Most of them are correct, but there's one sentence you need
to fix.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-08-10 Thread Nitin Jadhav
> I'd really like to avoid this. I don't see why it's necessary. You say
> it causes a problem, but you don't explain what problem it causes.
> enable_timeout_at() will disable the timer if not already done. I
> think all we need to do is set scheduled_startup_progress_timeout = 0
> before calling reset_startup_progress_timeout() in the "init" case and
> don't do that for the non-init case. If that's not quite right, maybe
> you can work out something that does work. But adding an is_init flag
> to a function and having no common code between the is_init = true
> case and the is_init = false case is exactly the kind of thing that I
> don't want here. I want as much common code as possible.

Setting set scheduled_startup_progress_timeout = 0 in the "init" case
solves the problem. The problem was if we call init_start_progress()
continuously then the first call to reset_startup_progress_timeout()
sets the value of scheduled_startup_progress_timeout to "now +
interval". Later call to reset_startup_progress_timeout() uses the
previously set value of scheduled_startup_progress_timeout which was
not correct and it was not behaving as expected. I could see that the
first log gets printed far later than the expected interval.

> To some degree, we tend to use names_like_this() for low-level
> functions and NamesLikeThis() for higher-level functions, but that is
> not a very consistent practice.

Ok. Thanks for the information.

> But this strategy cannot be used if the drift is larger than the
> interval. If we are trying to log a message every 1000ms and the timer
> doesn't fire for 14ms, we can wait only 986ms the next time. If it
> doesn't fire for 140ms, we can wait only 860ms the next time. But if
> the timer doesn't fire for 1400ms, we cannot wait for -400ms the next
> time. So what should we do? My proposal is to just wait for the
> configured interval, 1000ms, essentially giving up on drift
> correction. Now you could argue that we ought to just wait for 600ms
> in the hopes of making it 2 * 1000ms after the previous status
> message, but I'm not sure that really has any value, and it doesn't
> seem especially likely to work. The only way timer interrupts are
> likely to be that badly delayed is if the system is horrifically
> overloaded, and if that's the case the next timer interrupt isn't
> likely to fire on schedule anyway. Trying to correct for drift in such
> a situation seems more likely to be confusing than to produce any
> helpful result.

This is what I was trying to convey in case-2. I agree that it is
better to consider "now + interval" in such a case instead of trying
to adjust the drift.

Please find the updated patch attached.

On Tue, Aug 10, 2021 at 1:06 AM Robert Haas  wrote:
>
> On Mon, Aug 9, 2021 at 11:20 AM Nitin Jadhav
>  wrote:
> > Modified the reset_startup_progress_timeout() to take the second
> > parameter which indicates whether it is called for initialization or
> > for resetting. Without this parameter there is a problem if we call
> > init_startup_progress() more than one time if there is no call to
> > ereport_startup_progress() in between as the code related to disabling
> > the timer has been removed.
>
> I'd really like to avoid this. I don't see why it's necessary. You say
> it causes a problem, but you don't explain what problem it causes.
> enable_timeout_at() will disable the timer if not already done. I
> think all we need to do is set scheduled_startup_progress_timeout = 0
> before calling reset_startup_progress_timeout() in the "init" case and
> don't do that for the non-init case. If that's not quite right, maybe
> you can work out something that does work. But adding an is_init flag
> to a function and having no common code between the is_init = true
> case and the is_init = false case is exactly the kind of thing that I
> don't want here. I want as much common code as possible.
>
> > > This makes sense, but I think I'd like to have all the functions in
> > > this patch use names_like_this() rather than NamesLikeThis().
> >
> > I have changed all the function names accordingly. But I would like to
> > know why it should be names_like_this() as there are many functions
> > with the format NamesLikeThis(). I would like to understand when to
> > use what, just to incorporate in the future patches.
>
> There is, unfortunately, no hard-and-fast rule, but we want to
> maintain as much consistency with the existing style as we can. I
> wasn't initially sure what would work best for this particular patch,
> but after we committed to a name like ereport_startup_progress() that
> to me was a strong hint in favor of using names_like_this()
> throughout. It seems impossible to imagine punctuating it as
> EreportStartupProgress() or something since that would be wildly
> inconsistent with the existing function name, and there seems to be no
> good reason why this patch can't be internally consistent.
>
> To some degree, we tend to use names_like_this() for low-level
> 

Re: when the startup process doesn't (logging startup delays)

2021-08-09 Thread Robert Haas
On Mon, Aug 9, 2021 at 11:20 AM Nitin Jadhav
 wrote:
> Modified the reset_startup_progress_timeout() to take the second
> parameter which indicates whether it is called for initialization or
> for resetting. Without this parameter there is a problem if we call
> init_startup_progress() more than one time if there is no call to
> ereport_startup_progress() in between as the code related to disabling
> the timer has been removed.

I'd really like to avoid this. I don't see why it's necessary. You say
it causes a problem, but you don't explain what problem it causes.
enable_timeout_at() will disable the timer if not already done. I
think all we need to do is set scheduled_startup_progress_timeout = 0
before calling reset_startup_progress_timeout() in the "init" case and
don't do that for the non-init case. If that's not quite right, maybe
you can work out something that does work. But adding an is_init flag
to a function and having no common code between the is_init = true
case and the is_init = false case is exactly the kind of thing that I
don't want here. I want as much common code as possible.

> > This makes sense, but I think I'd like to have all the functions in
> > this patch use names_like_this() rather than NamesLikeThis().
>
> I have changed all the function names accordingly. But I would like to
> know why it should be names_like_this() as there are many functions
> with the format NamesLikeThis(). I would like to understand when to
> use what, just to incorporate in the future patches.

There is, unfortunately, no hard-and-fast rule, but we want to
maintain as much consistency with the existing style as we can. I
wasn't initially sure what would work best for this particular patch,
but after we committed to a name like ereport_startup_progress() that
to me was a strong hint in favor of using names_like_this()
throughout. It seems impossible to imagine punctuating it as
EreportStartupProgress() or something since that would be wildly
inconsistent with the existing function name, and there seems to be no
good reason why this patch can't be internally consistent.

To some degree, we tend to use names_like_this() for low-level
functions and NamesLikeThis() for higher-level functions, but that is
not a very consistent practice.

> > reset_startup_progress_timeout(TimeStampTz now)
> > {
> >   next_timeout = last_startup_progress_timeout + interval;
> >   if (next_timeout < now)
> >   {
> >  // Either the timeout was processed so late that we missed an entire 
> > cycle,
> >  // or the system clock was set backwards.
> >  next_timeout = now + interval;
> >   }
> >   enable_timeout_at(next_timeout);
> >   last_startup_progress_timeout = next_timeout;
> > }
>
> As per the above logic, I would like to discuss 2 cases.
>
> Case-1:
> First scheduled timeout is after 1 sec. But the time out occurred
> after 1.5 sec. So the log msg prints after 1.5 sec. Next timer is
> scheduled after 2 sec (scheduled_startup_progress_timeout + interval).
> The condition (next_timeout < now) gets evaluated to false and
> everything goes smooth.
>
> Case-2:
> First scheduled timeout is after 1 sec. But the timeout occurred after
> 2.5 sec. So the log msg prints after 2.5 sec. Now next time is
> scheduled after 2 sec (scheduled_startup_progress_timeout + interval).
> So the condition (next_timeout < now) will fail and the next_timeout
> will get reset to 3.5 sec (2.5 + 1) and it continues. Is this
> behaviour ok or should we set the next_timeout to 3 sec. Please share
> your thoughts on this.

I can't quite follow this, because it seems like you are sometimes
viewing the interval as 1 second and sometimes as 2 seconds. Maybe you
could clarify that, and perhaps show example output?

My feeling is that the timer will almost always be slightly late, but
it will very rarely be extremely late, and it will also very rarely be
early (only if someone resets the system clock). So let's consider
those two cases separately. If the timer is a little bit late each
time, we want to avoid drift, so we want to shorten the next sleep
time by the amount that the previous interrupt was late. If the
interval is 1000ms and the interrupt fires 1ms late then we should
sleep 999ms the next time; if 2ms late, 998ms. That way, although
there will be some variation in which the messages are logged, the
drift won't accumulate over time and even after many minutes of
recovery the messages will be printed at ABOUT the same number of
milliseconds after the second every time, instead of drifting further
and further off course.

But this strategy cannot be used if the drift is larger than the
interval. If we are trying to log a message every 1000ms and the timer
doesn't fire for 14ms, we can wait only 986ms the next time. If it
doesn't fire for 140ms, we can wait only 860ms the next time. But if
the timer doesn't fire for 1400ms, we cannot wait for -400ms the next
time. So what should we do? My proposal is to just wait for the
configured interval, 

Re: when the startup process doesn't (logging startup delays)

2021-08-09 Thread Nitin Jadhav
Modified the reset_startup_progress_timeout() to take the second
parameter which indicates whether it is called for initialization or
for resetting. Without this parameter there is a problem if we call
init_startup_progress() more than one time if there is no call to
ereport_startup_progress() in between as the code related to disabling
the timer has been removed.

reset_startup_progress_timeout(TimeStampTz now, bool is_init)
{
   if (is_init)
  next_timeout = now + interval;
   else
   {
 next_timeout = scheduled_startup_progress_timeout + interval;
 if (next_timeout < now)
 {
// Either the timeout was processed so late that we missed an
entire cycle,
// or the system clock was set backwards.
next_timeout = now + interval;
 }
 enable_timeout_at(next_timeout);
 scheduled_startup_progress_timeout = next_timeout;
   }
}

I have incorporated this in the attached patch. Please correct me if I am wrong.

> This makes sense, but I think I'd like to have all the functions in
> this patch use names_like_this() rather than NamesLikeThis().

I have changed all the function names accordingly. But I would like to
know why it should be names_like_this() as there are many functions
with the format NamesLikeThis(). I would like to understand when to
use what, just to incorporate in the future patches.

> Hmm, yeah, that seems good, but given this change, maybe the variables
> need a little renaming. Like change last_startup_progress_timeout to
> scheduled_startup_progress_timeout, perhaps.

Right. Changed the variable name.

> I guess this one needs to return a Boolean, actually.

Yes.

> reset_startup_progress_timeout(TimeStampTz now)
> {
>   next_timeout = last_startup_progress_timeout + interval;
>   if (next_timeout < now)
>   {
>  // Either the timeout was processed so late that we missed an entire 
> cycle,
>  // or the system clock was set backwards.
>  next_timeout = now + interval;
>   }
>   enable_timeout_at(next_timeout);
>   last_startup_progress_timeout = next_timeout;
> }

As per the above logic, I would like to discuss 2 cases.

Case-1:
First scheduled timeout is after 1 sec. But the time out occurred
after 1.5 sec. So the log msg prints after 1.5 sec. Next timer is
scheduled after 2 sec (scheduled_startup_progress_timeout + interval).
The condition (next_timeout < now) gets evaluated to false and
everything goes smooth.

Case-2:
First scheduled timeout is after 1 sec. But the timeout occurred after
2.5 sec. So the log msg prints after 2.5 sec. Now next time is
scheduled after 2 sec (scheduled_startup_progress_timeout + interval).
So the condition (next_timeout < now) will fail and the next_timeout
will get reset to 3.5 sec (2.5 + 1) and it continues. Is this
behaviour ok or should we set the next_timeout to 3 sec. Please share
your thoughts on this.

Thanks & Regards,
Nitin Jadhav


On Thu, Aug 5, 2021 at 7:49 PM Robert Haas  wrote:
>
> On Thu, Aug 5, 2021 at 7:41 AM Nitin Jadhav
>  wrote:
> > This seems a little confusing. As we are setting
> > last_startup_progress_timeout = now and then calling
> > reset_startup_progress_timeout() which will calculate the next_time
> > based on the value of last_startup_progress_timeout initially and
> > checks whether next_timeout is less than now. It doesn't make sense to
> > me. I feel we should calculate the next_timeout based on the time that
> > it is supposed to fire next time. So we should set
> > last_startup_progress_timeout = next_timeout after enabling the timer.
> > Also I feel with the 2 functions mentioned above, we also need
> > InitStartupProgress() which sets the initial value to
> > startupProcessOpStartTime which is used to calculate the time
> > difference and display in the logs. I could see those functions like
> > below.
> >
> > InitStartupProgress(void)
> > {
> > startupProcessOpStartTime = GetCurrentTimestamp();
> > ResetStartupProgressTimeout(startupProcessOpStartTime);
> > }
>
> This makes sense, but I think I'd like to have all the functions in
> this patch use names_like_this() rather than NamesLikeThis().
>
> > reset_startup_progress_timeout(TimeStampTz now)
> > {
> >   next_timeout = last_startup_progress_timeout + interval;
> >   if (next_timeout < now)
> >   {
> >  // Either the timeout was processed so late that we missed an entire 
> > cycle,
> >  // or the system clock was set backwards.
> >  next_timeout = now + interval;
> >   }
> >   enable_timeout_at(next_timeout);
> >   last_startup_progress_timeout = next_timeout;
> > }
>
> Hmm, yeah, that seems good, but given this change, maybe the variables
> need a little renaming. Like change last_startup_progress_timeout to
> scheduled_startup_progress_timeout, perhaps.
>
> > startup_progress_timeout_has_expired()
> > {
> >if (!startup_progress_timer_expired)
> >  return;
> >   now = GetCurrentTimestamp();
> >   // compute timestamp difference based on startupProcessOpStartTime
> >   

Re: when the startup process doesn't (logging startup delays)

2021-08-05 Thread Robert Haas
On Thu, Aug 5, 2021 at 7:41 AM Nitin Jadhav
 wrote:
> This seems a little confusing. As we are setting
> last_startup_progress_timeout = now and then calling
> reset_startup_progress_timeout() which will calculate the next_time
> based on the value of last_startup_progress_timeout initially and
> checks whether next_timeout is less than now. It doesn't make sense to
> me. I feel we should calculate the next_timeout based on the time that
> it is supposed to fire next time. So we should set
> last_startup_progress_timeout = next_timeout after enabling the timer.
> Also I feel with the 2 functions mentioned above, we also need
> InitStartupProgress() which sets the initial value to
> startupProcessOpStartTime which is used to calculate the time
> difference and display in the logs. I could see those functions like
> below.
>
> InitStartupProgress(void)
> {
> startupProcessOpStartTime = GetCurrentTimestamp();
> ResetStartupProgressTimeout(startupProcessOpStartTime);
> }

This makes sense, but I think I'd like to have all the functions in
this patch use names_like_this() rather than NamesLikeThis().

> reset_startup_progress_timeout(TimeStampTz now)
> {
>   next_timeout = last_startup_progress_timeout + interval;
>   if (next_timeout < now)
>   {
>  // Either the timeout was processed so late that we missed an entire 
> cycle,
>  // or the system clock was set backwards.
>  next_timeout = now + interval;
>   }
>   enable_timeout_at(next_timeout);
>   last_startup_progress_timeout = next_timeout;
> }

Hmm, yeah, that seems good, but given this change, maybe the variables
need a little renaming. Like change last_startup_progress_timeout to
scheduled_startup_progress_timeout, perhaps.

> startup_progress_timeout_has_expired()
> {
>if (!startup_progress_timer_expired)
>  return;
>   now = GetCurrentTimestamp();
>   // compute timestamp difference based on startupProcessOpStartTime
>   reset_startup_progress_timeout(now);
> }

I guess this one needs to return a Boolean, actually.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-08-05 Thread Nitin Jadhav
Thanks for the detailed explanation.

> +elapsed_ms = (seconds * 1000) + (useconds / 1000);
> +interval_in_ms = log_startup_progress_interval * 1000;
> +enable_timeout_after(LOG_STARTUP_PROGRESS_TIMEOUT,
> + (interval_in_ms - (elapsed_ms % interval_in_ms)));
>
> This will work correctly only if elapsed_ms is equal to interval_in_ms
> or slightly greater than interval_ms. But if elapsed_ms is greater
> than two times interval_ms, then it will produce pretty much random
> results. If elapsed_ms is negative because the system clock gets set
> backward, a possibility I've already mentioned to you in a previous
> review, then it will also misbehave. I actually don't think
> enable_timeout_after() is very easy to use for this kind of thing. At
> least for me, it's way easier to think about calculating the timestamp
> at which I want the timer to expire. Maybe something along these
> lines:
>
> next_timeout = last_startup_progress_timeout + interval;
> if (next_timeout < now)
> {
>// Either the timeout was processed so late that we missed an entire cycle,
>// or the system clock was set backwards.
>next_timeout = now + interval;
> }
> enable_timeout_at(next_timeout);
>
> So I think we should take this out, which would permit simplifying a
> bunch of code.The four places where you call
> ereport_startup_progress(true, ...) would go away.
> ereport_startup_progress() would no longer need a Boolean argument,
> and neither would LogStartupProgressTimerExpired() /
> startup_progress_timer_has_expired(). Note that there's no real need
> to disable the timeout when we're done with it. It's fine if we do,
> but if we don't, it's also not a big deal; all that happens if we
> leave the timer scheduled and let it expire is that it will set a
> Boolean flag that nobody will care about. So what I'm thinking about
> is that we could just have, say, reset_startup_progress_timeout() and
> startup_progress_timeout_has_expired().
> reset_startup_progress_timeout() would just do exactly what I showed
> above to reset the timeout, and you'd call it at the beginning of each
> phase. And startup_progress_timeout_has_expired() would look roughly
> like this:
>
> if (!startup_progress_timer_expired)
>return;
> now = GetCurrentTimestamp();
> // compute timestamp difference
> last_startup_progress_timeout = now;
> reset_startup_progress_timeout();

This seems a little confusing. As we are setting
last_startup_progress_timeout = now and then calling
reset_startup_progress_timeout() which will calculate the next_time
based on the value of last_startup_progress_timeout initially and
checks whether next_timeout is less than now. It doesn't make sense to
me. I feel we should calculate the next_timeout based on the time that
it is supposed to fire next time. So we should set
last_startup_progress_timeout = next_timeout after enabling the timer.
Also I feel with the 2 functions mentioned above, we also need
InitStartupProgress() which sets the initial value to
startupProcessOpStartTime which is used to calculate the time
difference and display in the logs. I could see those functions like
below.

InitStartupProgress(void)
{
startupProcessOpStartTime = GetCurrentTimestamp();
ResetStartupProgressTimeout(startupProcessOpStartTime);
}

reset_startup_progress_timeout(TimeStampTz now)
{
  next_timeout = last_startup_progress_timeout + interval;
  if (next_timeout < now)
  {
 // Either the timeout was processed so late that we missed an entire cycle,
 // or the system clock was set backwards.
 next_timeout = now + interval;
  }
  enable_timeout_at(next_timeout);
  last_startup_progress_timeout = next_timeout;
}

startup_progress_timeout_has_expired()
{
   if (!startup_progress_timer_expired)
 return;
  now = GetCurrentTimestamp();
  // compute timestamp difference based on startupProcessOpStartTime
  reset_startup_progress_timeout(now);
}

Kindly share your thoughts and correct me if I am wrong.

> I think we also should consider where to put the new functions
> introduced by this patch, and the GUC. You put them in xlog.c which is
> reasonable since that is where StartupXLOG() lives. However, xlog.c is
> also a gigantic file, and xlog.h is included in a lot of places, most
> of which aren't going to care about the new things you're adding to
> that file at all. So I'm thinking it might make more sense to put the
> new code in src/backend/postmaster/startup.c. That is actually a
> better thematic fit given that this is really about the startup
> process specifically, not WAL-logging in general. Then reinit.c would
> include startup.h instead of xlog.h, which seems better, because I
> don't think we want any actual xlog operations to happen from within
> that file, so better not to be including xlog.h.
>
> The patch currently lacks documentation. It needs to update config.sgml.

I agree and I will take care in the next patch.

Thanks & Regards,
Nitin Jadhav



On Tue, Aug 3, 

Re: when the startup process doesn't (logging startup delays)

2021-08-03 Thread Robert Haas
On Tue, Aug 3, 2021 at 2:48 AM Nitin Jadhav
 wrote:
> Implemented the above approach and verified the patch. Kindly have a
> look and share your thoughts.

+LogStartupProgressTimerExpired(bool force, long *secs, int *usecs)

The name of this function begins with "log" but it does not log
anything, so that's probably a sign that you should rethink the name
of the function. I suggested startup_progress_timer_expired()
upthread, but now I think maybe we should call it
startup_progress_timer_has_expired() and then renaming the Boolean
you've called logStartupProgressTimeout to
startup_progress_timer_expired. Also, the argument "bool force"
doesn't really make sense for this function, which is why I suggested
above calling it "bool done" instead.

+elapsed_ms = (seconds * 1000) + (useconds / 1000);
+interval_in_ms = log_startup_progress_interval * 1000;
+enable_timeout_after(LOG_STARTUP_PROGRESS_TIMEOUT,
+ (interval_in_ms - (elapsed_ms % interval_in_ms)));

This will work correctly only if elapsed_ms is equal to interval_in_ms
or slightly greater than interval_ms. But if elapsed_ms is greater
than two times interval_ms, then it will produce pretty much random
results. If elapsed_ms is negative because the system clock gets set
backward, a possibility I've already mentioned to you in a previous
review, then it will also misbehave. I actually don't think
enable_timeout_after() is very easy to use for this kind of thing. At
least for me, it's way easier to think about calculating the timestamp
at which I want the timer to expire. Maybe something along these
lines:

next_timeout = last_startup_progress_timeout + interval;
if (next_timeout < now)
{
// Either the timeout was processed so late that we missed an entire cycle,
// or the system clock was set backwards.
next_timeout = now + interval;
}
enable_timeout_at(next_timeout);

Also, I said before that I thought it was OK that you were logging a
line at the end of every operation as well as after every N
milliseconds. But, the more I think about it, the less I like it. We
already have a 'redo done' line that shows up at the end of redo,
which the patch wisely does not duplicate. But it's not quite clear
that any of these other things are important enough to bother
mentioning in the log unless they take a long time. After an immediate
shutdown of an empty cluster, with this patch applied, I get 3 extra
log messages:

2021-08-03 10:17:49.630 EDT [17567] LOG:  data directory sync (fsync)
complete after 0.13 s
2021-08-03 10:17:49.633 EDT [17567] LOG:  resetting unlogged relations
(cleanup) complete after 0.00 s
2021-08-03 10:17:49.635 EDT [17567] LOG:  resetting unlogged relations
(init) complete after 0.00 s

That doesn't seem like information anyone really needs. If it had
taken a long time, it would have been worth logging, but in the normal
case where it doesn't, it's just clutter. Another funny thing is that,
as you've coded it, those additional log lines only appear when
log_startup_progress_interval != 0. That's strange. It seems
particularly strange because of the existing precedent where 'redo
done' appears regardless of any setting, but also because when I set,
say, a 10s interval, I guess I expect something to happen every 10s.
Making something happen once at the end is different.

So I think we should take this out, which would permit simplifying a
bunch of code.The four places where you call
ereport_startup_progress(true, ...) would go away.
ereport_startup_progress() would no longer need a Boolean argument,
and neither would LogStartupProgressTimerExpired() /
startup_progress_timer_has_expired(). Note that there's no real need
to disable the timeout when we're done with it. It's fine if we do,
but if we don't, it's also not a big deal; all that happens if we
leave the timer scheduled and let it expire is that it will set a
Boolean flag that nobody will care about. So what I'm thinking about
is that we could just have, say, reset_startup_progress_timeout() and
startup_progress_timeout_has_expired().
reset_startup_progress_timeout() would just do exactly what I showed
above to reset the timeout, and you'd call it at the beginning of each
phase. And startup_progress_timeout_has_expired() would look roughly
like this:

if (!startup_progress_timer_expired)
return;
now = GetCurrentTimestamp();
// compute timestamp difference
last_startup_progress_timeout = now;
reset_startup_progress_timeout();

With these changes you'd have only 1 place in the code that needs to
care about log_startup_progress_interval, as opposed to 3 as you have
it currently, and only one place that enables the timeout, as opposed
to 2 as you have it currently. I think that would be tidier.

I think we also should consider where to put the new functions
introduced by this patch, and the GUC. You put them in xlog.c which is
reasonable since that is where StartupXLOG() lives. However, xlog.c is
also a gigantic file, and xlog.h is included 

Re: when the startup process doesn't (logging startup delays)

2021-08-03 Thread Justin Pryzby
Two issues that I saw:

The first syncfs message is not output, because it's before
InitStartupProgress() is called:

2021-08-03 07:53:02.176 CDT startup[9717] LOG:  database system was 
interrupted; last known up at 2021-08-03 07:52:15 CDT
2021-08-03 07:53:02.733 CDT startup[9717] LOG:  data directory sync (syncfs) 
complete after 0.55 s
2021-08-03 07:53:02.734 CDT startup[9717] LOG:  database system was not 
properly shut down; automatic recovery in progress

FP exception when the GUC is set to 0:

2021-08-03 07:53:02.877 CDT postmaster[9715] LOG:  startup process (PID 9717) 
was terminated by signal 8: Floating point exception

Probably due to mod zero operation.
This prevents the process from starting.

+   enable_timeout_after(LOG_STARTUP_PROGRESS_TIMEOUT,
+(interval_in_ms - (elapsed_ms 
% interval_in_ms)));

-- 
Justin




Re: when the startup process doesn't (logging startup delays)

2021-08-03 Thread Nitin Jadhav
> Thanks. Can you please have a look at what I suggested down toward the
> bottom of 
> http://postgr.es/m/ca+tgmoap2wefsktmcgwt9lxuz7y99hnduyshpk7qnfuqb98...@mail.gmail.com
?

Implemented the above approach and verified the patch. Kindly have a
look and share your thoughts.

Thanks & Regards,
Nitin Jadhav

On Fri, Jul 30, 2021 at 10:40 AM Nitin Jadhav
 wrote:
>
> > Thanks. Can you please have a look at what I suggested down toward the
> > bottom of 
> > http://postgr.es/m/ca+tgmoap2wefsktmcgwt9lxuz7y99hnduyshpk7qnfuqb98...@mail.gmail.com
> > ?
> >
> > If we're going to go the route of combining the functions, I agree
> > that a Boolean is the way to go; I think we have some existing
> > precedent for 'bool finished' rather than 'bool done'.
> >
> > But I kind of wonder if we should have an enum in the first place. It
> > feels like we've got code in a bunch of places that just exists to
> > decide which enum value to use, and then code someplace else that
> > turns around and decides which message to produce. That would be
> > sensible if we were using the same enum values in lots of places, but
> > that's not the case here. So suppose we just moved the messages to the
> > places where we're now calling LogStartupProgress() or
> > LogStartupProgressComplete()? So something like this:
>
> Sorry. I thought it is related to the discussion of deciding whether
> LogStartupProgress() and LogStartupProgressComplete() should be
> combined or not. I feel it's a really nice design. With this we avoid
> a "action at a distance" issue and its easy to use. If we are
> reporting the same kind of msgs at multiple places then the current
> approach of using enum will be more suitable since we don't have to
> worry about matching the log msg string. But in the current scenario,
> we are not using the same kind of msgs at multiple places (I feel such
> scenario will not occur in future also. Even if there is similar
> operation, it can be distinguished like resetting unlogged relations
> is distinguished by init and clean. Kindly mention if you can oversee
> any such scenario), hence the approach you are suggesting will be a
> best suit.
>
> Thanks & Regards,
> Nitin Jadhav
>
> On Thu, Jul 29, 2021 at 9:48 PM Robert Haas  wrote:
> >
> > On Thu, Jul 29, 2021 at 4:56 AM Nitin Jadhav
> >  wrote:
> > > Thanks for sharing the information. I have done the necessary changes
> > > to show the logs during the latter case (postgres --single) and
> > > verified the log messages.
> >
> > Thanks. Can you please have a look at what I suggested down toward the
> > bottom of 
> > http://postgr.es/m/ca+tgmoap2wefsktmcgwt9lxuz7y99hnduyshpk7qnfuqb98...@mail.gmail.com
> > ?
> >
> > --
> > Robert Haas
> > EDB: http://www.enterprisedb.com


v9-0001-startup-process-progress.patch
Description: Binary data


Re: when the startup process doesn't (logging startup delays)

2021-07-29 Thread Nitin Jadhav
> Thanks. Can you please have a look at what I suggested down toward the
> bottom of 
> http://postgr.es/m/ca+tgmoap2wefsktmcgwt9lxuz7y99hnduyshpk7qnfuqb98...@mail.gmail.com
> ?
>
> If we're going to go the route of combining the functions, I agree
> that a Boolean is the way to go; I think we have some existing
> precedent for 'bool finished' rather than 'bool done'.
>
> But I kind of wonder if we should have an enum in the first place. It
> feels like we've got code in a bunch of places that just exists to
> decide which enum value to use, and then code someplace else that
> turns around and decides which message to produce. That would be
> sensible if we were using the same enum values in lots of places, but
> that's not the case here. So suppose we just moved the messages to the
> places where we're now calling LogStartupProgress() or
> LogStartupProgressComplete()? So something like this:

Sorry. I thought it is related to the discussion of deciding whether
LogStartupProgress() and LogStartupProgressComplete() should be
combined or not. I feel it's a really nice design. With this we avoid
a "action at a distance" issue and its easy to use. If we are
reporting the same kind of msgs at multiple places then the current
approach of using enum will be more suitable since we don't have to
worry about matching the log msg string. But in the current scenario,
we are not using the same kind of msgs at multiple places (I feel such
scenario will not occur in future also. Even if there is similar
operation, it can be distinguished like resetting unlogged relations
is distinguished by init and clean. Kindly mention if you can oversee
any such scenario), hence the approach you are suggesting will be a
best suit.

Thanks & Regards,
Nitin Jadhav

On Thu, Jul 29, 2021 at 9:48 PM Robert Haas  wrote:
>
> On Thu, Jul 29, 2021 at 4:56 AM Nitin Jadhav
>  wrote:
> > Thanks for sharing the information. I have done the necessary changes
> > to show the logs during the latter case (postgres --single) and
> > verified the log messages.
>
> Thanks. Can you please have a look at what I suggested down toward the
> bottom of 
> http://postgr.es/m/ca+tgmoap2wefsktmcgwt9lxuz7y99hnduyshpk7qnfuqb98...@mail.gmail.com
> ?
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-07-29 Thread Robert Haas
On Thu, Jul 29, 2021 at 4:56 AM Nitin Jadhav
 wrote:
> Thanks for sharing the information. I have done the necessary changes
> to show the logs during the latter case (postgres --single) and
> verified the log messages.

Thanks. Can you please have a look at what I suggested down toward the
bottom of 
http://postgr.es/m/ca+tgmoap2wefsktmcgwt9lxuz7y99hnduyshpk7qnfuqb98...@mail.gmail.com
?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-07-29 Thread Nitin Jadhav
> The InitPostgres() case occurs when the server is started in bootstrap
> mode (during initdb) or in single-user mode (postgres --single). I do
> not see any reason why we shouldn't produce progress messages in at
> least the latter case. I suspect that someone who is in the rather
> desperate scenario of having to use single-user mode would really like
> to know how long the server is going to take to start up.

Thanks for sharing the information. I have done the necessary changes
to show the logs during the latter case (postgres --single) and
verified the log messages.

> +1 to emit the same log messages in single-user mode and basically
> whoever calls StartupXLOG. Do we need to adjust the GUC parameter
> log_startup_progress_interval(a reasonable value) so that the logs are
> generated by default?

At present, this feature is enabled by default and the initial value
set for log_startup_progress_interval is 10 seconds.


On Wed, Jul 28, 2021 at 9:07 PM Robert Haas  wrote:
>
> On Wed, Jul 28, 2021 at 11:25 AM Bharath Rupireddy
>  wrote:
> > > Perhaps during initdb we don't want messages, but I'm not sure that we
> > > need to do anything about that here. None of the messages that the
> > > server normally produces show up when you run initdb, so I guess they
> > > are getting redirected to /dev/null or something.
> >
> > I enabled the below log message in XLogFlush and ran initdb, it is
> > printing these logs onto the stdout, looks like the logs have not been
> > redirected to /dev/null in initdb/bootstrap mode.
> >
> > #ifdef WAL_DEBUG
> > if (XLOG_DEBUG)
> > elog(LOG, "xlog flush request %X/%X; write %X/%X; flush %X/%X",
> > LSN_FORMAT_ARGS(record),
> > LSN_FORMAT_ARGS(LogwrtResult.Write),
> > LSN_FORMAT_ARGS(LogwrtResult.Flush));
> > #endif
> >
> > So, even in bootstrap mode, can we use the above #ifdef WAL_DEBUG and
> > XLOG_DEBUG to print those logs? It looks like the problem with these
> > macros is that they are not settable by normal users in the production
> > environment, but only by the developers. I'm not sure how much it is
> > helpful to print the startup process progress logs in bootstrap mode.
> > Maybe we can use the IsBootstrapProcessingMode macro to disable these
> > logs in the bootstrap mode.
>
> I don't think we should drag XLOG_DEBUG into this. That's a debugging
> facility that isn't really relevant to the topic at hand. The point is
> the server normally prints a bunch of messages that we don't see in
> bootstrap mode. For example:
>
> [rhaas pgsql]$ postgres
> 2021-07-28 11:32:33.824 EDT [36801] LOG:  starting PostgreSQL 15devel
> on x86_64-apple-darwin19.6.0, compiled by clang version 5.0.2
> (tags/RELEASE_502/final), 64-bit
> 2021-07-28 11:32:33.825 EDT [36801] LOG:  listening on IPv6 address
> "::1", port 5432
> 2021-07-28 11:32:33.825 EDT [36801] LOG:  listening on IPv4 address
> "127.0.0.1", port 5432
> 2021-07-28 11:32:33.826 EDT [36801] LOG:  listening on Unix socket
> "/tmp/.s.PGSQL.5432"
> 2021-07-28 11:32:33.846 EDT [36802] LOG:  database system was shut
> down at 2021-07-28 11:32:32 EDT
> 2021-07-28 11:32:33.857 EDT [36801] LOG:  database system is ready to
> accept connections
>
> None of that shows up when you run initdb. Taking a fast look at the
> code, I don't think the reasons are the same for all of those
> messages. Some of the code isn't reached, whereas e.g. "database
> system was shut down at 2021-07-28 11:32:32 EDT" is special-cased. I'm
> not sure right off the top of my head what this code should do, but
> ideally it looks something like one of the cases we've already got.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com
From 55c73bb0920e3487b72d2588c43c1c5e8798e6d4 Mon Sep 17 00:00:00 2001
From: Nitin 
Date: Thu, 29 Jul 2021 14:11:24 +0530
Subject: [PATCH] Shows the progress of the operations performed during startup
 process. The interval between each progress update is configurable and it
 should be mentioned in seconds

---
 src/backend/access/transam/xlog.c | 163 ++
 src/backend/postmaster/startup.c  |   1 +
 src/backend/storage/file/fd.c |  13 ++
 src/backend/storage/file/reinit.c |  17 +++
 src/backend/utils/init/postinit.c |   1 +
 src/backend/utils/misc/guc.c  |  12 ++
 src/backend/utils/misc/postgresql.conf.sample |   6 +
 src/include/access/xlog.h |  18 +++
 src/include/utils/timeout.h   |   1 +
 9 files changed, 232 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 3479402..5a39da7 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -79,6 +79,7 @@
 #include "utils/relmapper.h"
 #include "utils/pg_rusage.h"
 #include "utils/snapmgr.h"
+#include "utils/timeout.h"
 #include "utils/timestamp.h"
 
 extern uint32 bootstrap_data_checksum_version;
@@ -397,6 +398,23 @@ static bool doRequestWalReceiverReply;
  */
 

Re: when the startup process doesn't (logging startup delays)

2021-07-28 Thread Robert Haas
On Wed, Jul 28, 2021 at 11:25 AM Bharath Rupireddy
 wrote:
> > Perhaps during initdb we don't want messages, but I'm not sure that we
> > need to do anything about that here. None of the messages that the
> > server normally produces show up when you run initdb, so I guess they
> > are getting redirected to /dev/null or something.
>
> I enabled the below log message in XLogFlush and ran initdb, it is
> printing these logs onto the stdout, looks like the logs have not been
> redirected to /dev/null in initdb/bootstrap mode.
>
> #ifdef WAL_DEBUG
> if (XLOG_DEBUG)
> elog(LOG, "xlog flush request %X/%X; write %X/%X; flush %X/%X",
> LSN_FORMAT_ARGS(record),
> LSN_FORMAT_ARGS(LogwrtResult.Write),
> LSN_FORMAT_ARGS(LogwrtResult.Flush));
> #endif
>
> So, even in bootstrap mode, can we use the above #ifdef WAL_DEBUG and
> XLOG_DEBUG to print those logs? It looks like the problem with these
> macros is that they are not settable by normal users in the production
> environment, but only by the developers. I'm not sure how much it is
> helpful to print the startup process progress logs in bootstrap mode.
> Maybe we can use the IsBootstrapProcessingMode macro to disable these
> logs in the bootstrap mode.

I don't think we should drag XLOG_DEBUG into this. That's a debugging
facility that isn't really relevant to the topic at hand. The point is
the server normally prints a bunch of messages that we don't see in
bootstrap mode. For example:

[rhaas pgsql]$ postgres
2021-07-28 11:32:33.824 EDT [36801] LOG:  starting PostgreSQL 15devel
on x86_64-apple-darwin19.6.0, compiled by clang version 5.0.2
(tags/RELEASE_502/final), 64-bit
2021-07-28 11:32:33.825 EDT [36801] LOG:  listening on IPv6 address
"::1", port 5432
2021-07-28 11:32:33.825 EDT [36801] LOG:  listening on IPv4 address
"127.0.0.1", port 5432
2021-07-28 11:32:33.826 EDT [36801] LOG:  listening on Unix socket
"/tmp/.s.PGSQL.5432"
2021-07-28 11:32:33.846 EDT [36802] LOG:  database system was shut
down at 2021-07-28 11:32:32 EDT
2021-07-28 11:32:33.857 EDT [36801] LOG:  database system is ready to
accept connections

None of that shows up when you run initdb. Taking a fast look at the
code, I don't think the reasons are the same for all of those
messages. Some of the code isn't reached, whereas e.g. "database
system was shut down at 2021-07-28 11:32:32 EDT" is special-cased. I'm
not sure right off the top of my head what this code should do, but
ideally it looks something like one of the cases we've already got.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-07-28 Thread Bharath Rupireddy
On Wed, Jul 28, 2021 at 7:02 PM Robert Haas  wrote:
>
> On Wed, Jul 28, 2021 at 5:24 AM Nitin Jadhav
>  wrote:
> > I also agree that this is the better place to do it. Hence modified
> > the patch accordingly. The condition "!AmStartupProcess()" is added to
> > differentiate whether the call is done from a startup process or some
> > other process. Actually StartupXLOG() gets called in 2 places. one in
> > StartupProcessMain() and the other in InitPostgres(). As the logging
> > of the startup progress is required only during the startup process
> > and not in the other cases,
>
> The InitPostgres() case occurs when the server is started in bootstrap
> mode (during initdb) or in single-user mode (postgres --single). I do
> not see any reason why we shouldn't produce progress messages in at
> least the latter case. I suspect that someone who is in the rather
> desperate scenario of having to use single-user mode would really like
> to know how long the server is going to take to start up.

+1 to emit the same log messages in single-user mode and basically
whoever calls StartupXLOG. Do we need to adjust the GUC parameter
log_startup_progress_interval(a reasonable value) so that the logs are
generated by default?

> Perhaps during initdb we don't want messages, but I'm not sure that we
> need to do anything about that here. None of the messages that the
> server normally produces show up when you run initdb, so I guess they
> are getting redirected to /dev/null or something.

I enabled the below log message in XLogFlush and ran initdb, it is
printing these logs onto the stdout, looks like the logs have not been
redirected to /dev/null in initdb/bootstrap mode.

#ifdef WAL_DEBUG
if (XLOG_DEBUG)
elog(LOG, "xlog flush request %X/%X; write %X/%X; flush %X/%X",
LSN_FORMAT_ARGS(record),
LSN_FORMAT_ARGS(LogwrtResult.Write),
LSN_FORMAT_ARGS(LogwrtResult.Flush));
#endif

So, even in bootstrap mode, can we use the above #ifdef WAL_DEBUG and
XLOG_DEBUG to print those logs? It looks like the problem with these
macros is that they are not settable by normal users in the production
environment, but only by the developers. I'm not sure how much it is
helpful to print the startup process progress logs in bootstrap mode.
Maybe we can use the IsBootstrapProcessingMode macro to disable these
logs in the bootstrap mode.

> So I don't think that using AmStartupProcess() for this purpose is right.

Agree.

Regards,
Bharath Rupireddy.




Re: when the startup process doesn't (logging startup delays)

2021-07-28 Thread Robert Haas
On Wed, Jul 28, 2021 at 5:24 AM Nitin Jadhav
 wrote:
> I also agree that this is the better place to do it. Hence modified
> the patch accordingly. The condition "!AmStartupProcess()" is added to
> differentiate whether the call is done from a startup process or some
> other process. Actually StartupXLOG() gets called in 2 places. one in
> StartupProcessMain() and the other in InitPostgres(). As the logging
> of the startup progress is required only during the startup process
> and not in the other cases,

The InitPostgres() case occurs when the server is started in bootstrap
mode (during initdb) or in single-user mode (postgres --single). I do
not see any reason why we shouldn't produce progress messages in at
least the latter case. I suspect that someone who is in the rather
desperate scenario of having to use single-user mode would really like
to know how long the server is going to take to start up.

Perhaps during initdb we don't want messages, but I'm not sure that we
need to do anything about that here. None of the messages that the
server normally produces show up when you run initdb, so I guess they
are getting redirected to /dev/null or something.

So I don't think that using AmStartupProcess() for this purpose is right.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-07-28 Thread Nitin Jadhav
> > > I saw that you fixed it by calling InitStartupProgress() after the 
> > > walkdir()
> > > calls which do pre_sync_fname.  So then walkdir is calling
> > > LogStartupProgress(STARTUP_PROCESS_OP_FSYNC) even when it's not doing 
> > > fsync,
> > > and then LogStartupProgress() is returning because !AmStartupProcess().
> >
> > I don't think walkdir() has any business calling LogStartupProgress()
> > at all. It's supposed to be a generic function, not one that is only
> > available to be called from the startup process, or has different
> > behavior there. From my point of view, the right thing is to put the
> > logging calls into the particular callbacks that SyncDataDirectory()
> > uses.
>
> You're right - this is better.

I also agree that this is the better place to do it. Hence modified
the patch accordingly. The condition "!AmStartupProcess()" is added to
differentiate whether the call is done from a startup process or some
other process. Actually StartupXLOG() gets called in 2 places. one in
StartupProcessMain() and the other in InitPostgres(). As the logging
of the startup progress is required only during the startup process
and not in the other cases, so added the condition to confirm the call
is from the startup process. I did not find any other way to
differentiate. Kindly let me know if there is any other better
approach to do this.

> > > Maybe I'm missing something here, but I don't understand the purpose
> > > of this. You can always combine two functions into one, but it's only
> > > worth doing if you end up with less code, which doesn't seem to be the
> > > case here.
> >
> >  4 files changed, 39 insertions(+), 71 deletions(-)
>
> Hmm. I don't completely hate that, but I don't love it either. I don't
> think the result is any easier to understand than the original, and in
> some ways it's worse. In particular, you've flattened the separate
> comments for each of the individual functions into a single-line
> comment that is more generic than the comments it replaced. You could
> fix that and you'd still be slightly ahead on LOC, but I'm not
> convinced that it's actually a real win.

As per my understanding there are no changes required wrt this. Hence
not done any changes.

Please find the updated patch attached. Kindly share your comments if any.

On Mon, Jul 26, 2021 at 10:41 PM Robert Haas  wrote:
>
> On Mon, Jul 26, 2021 at 11:30 AM Justin Pryzby  wrote:
> > > Maybe I'm missing something here, but I don't understand the purpose
> > > of this. You can always combine two functions into one, but it's only
> > > worth doing if you end up with less code, which doesn't seem to be the
> > > case here.
> >
> >  4 files changed, 39 insertions(+), 71 deletions(-)
>
> Hmm. I don't completely hate that, but I don't love it either. I don't
> think the result is any easier to understand than the original, and in
> some ways it's worse. In particular, you've flattened the separate
> comments for each of the individual functions into a single-line
> comment that is more generic than the comments it replaced. You could
> fix that and you'd still be slightly ahead on LOC, but I'm not
> convinced that it's actually a real win.
>
> > > ... but I'm not exactly sure how to get there from here. Having only
> > > LogStartupProgress() but having it do a giant if-statement to figure
> > > out whether we're mid-phase or end-of-phase does not seem like the
> > > right approach.
> >
> > I used a bool arg and negation to handle within a single switch.  Maybe it's
> > cleaner to use a separate enum value for each DONE, and set a local done 
> > flag.
>
> If we're going to go the route of combining the functions, I agree
> that a Boolean is the way to go; I think we have some existing
> precedent for 'bool finished' rather than 'bool done'.
>
> But I kind of wonder if we should have an enum in the first place. It
> feels like we've got code in a bunch of places that just exists to
> decide which enum value to use, and then code someplace else that
> turns around and decides which message to produce. That would be
> sensible if we were using the same enum values in lots of places, but
> that's not the case here. So suppose we just moved the messages to the
> places where we're now calling LogStartupProgress() or
> LogStartupProgressComplete()? So something like this:
>
> if (should_report_startup_progress())
> ereport(LOG,
>  (errmsg("syncing data directory (syncfs), elapsed
> time: %ld.%02d s, current path: %s",
>secs, (usecs / 1), path)));
>
> Well, that doesn't quite work, because "secs" and "usecs" aren't going
> to exist in the caller. We can fix that easily enough: let's just make
> should_report_startup_progress take arguments long *secs, int *usecs,
> and bool done. Then the above becomes:
>
> if (should_report_startup_progress(, , false))
> ereport(LOG,
>  (errmsg("syncing data directory (syncfs), elapsed
> time: %ld.%02d s, 

Re: when the startup process doesn't (logging startup delays)

2021-07-26 Thread Robert Haas
On Mon, Jul 26, 2021 at 11:30 AM Justin Pryzby  wrote:
> > Maybe I'm missing something here, but I don't understand the purpose
> > of this. You can always combine two functions into one, but it's only
> > worth doing if you end up with less code, which doesn't seem to be the
> > case here.
>
>  4 files changed, 39 insertions(+), 71 deletions(-)

Hmm. I don't completely hate that, but I don't love it either. I don't
think the result is any easier to understand than the original, and in
some ways it's worse. In particular, you've flattened the separate
comments for each of the individual functions into a single-line
comment that is more generic than the comments it replaced. You could
fix that and you'd still be slightly ahead on LOC, but I'm not
convinced that it's actually a real win.

> > ... but I'm not exactly sure how to get there from here. Having only
> > LogStartupProgress() but having it do a giant if-statement to figure
> > out whether we're mid-phase or end-of-phase does not seem like the
> > right approach.
>
> I used a bool arg and negation to handle within a single switch.  Maybe it's
> cleaner to use a separate enum value for each DONE, and set a local done flag.

If we're going to go the route of combining the functions, I agree
that a Boolean is the way to go; I think we have some existing
precedent for 'bool finished' rather than 'bool done'.

But I kind of wonder if we should have an enum in the first place. It
feels like we've got code in a bunch of places that just exists to
decide which enum value to use, and then code someplace else that
turns around and decides which message to produce. That would be
sensible if we were using the same enum values in lots of places, but
that's not the case here. So suppose we just moved the messages to the
places where we're now calling LogStartupProgress() or
LogStartupProgressComplete()? So something like this:

if (should_report_startup_progress())
ereport(LOG,
 (errmsg("syncing data directory (syncfs), elapsed
time: %ld.%02d s, current path: %s",
   secs, (usecs / 1), path)));

Well, that doesn't quite work, because "secs" and "usecs" aren't going
to exist in the caller. We can fix that easily enough: let's just make
should_report_startup_progress take arguments long *secs, int *usecs,
and bool done. Then the above becomes:

if (should_report_startup_progress(, , false))
ereport(LOG,
 (errmsg("syncing data directory (syncfs), elapsed
time: %ld.%02d s, current path: %s",
   secs, (usecs / 1), path)));

And if this were the call site that corresponds to
LogStartupProgressComplete(), we'd replace false with true. Now, the
only real disadvantage of this that I can see is that it requires the
caller to declare 'secs' and 'usecs', which is not a big deal, but
mildly annoying perhaps. I think we can do better still with a little
macro magic. Suppose we define a macro report_startup_progress(force,
msg, ...) that expands to:

{
long secs;
int usecs;
if (startup_progress_timer_expired(, , force))
ereport(LOG, errmsg(msg, secs, usecs, ...));
}

Then the above just becomes this:

report_startup_progress(false, "syncing data directory (syncfs),
elapsed time: %ld.%02d s, current path: %s", path);

This would have the advantage that the strings we're using would be
present in the code that arranges to emit them, instead of being
removed to some other module, so I think it would be clearer. It would
also have the advantage of making it much easier to add further calls,
if someone feels they want to do that. You don't have to run around
and update enums and all the various things that use them, just copy
and paste the line above and adjust as required.

With this design, we avoid a lot of "action at a distance". We don't
define the message strings in a place far-removed from the code that
wants to emit them any more. When someone wants a new progress
message, they can just add another call to report_statup_progress()
wherever it needs to go and they're done. They don't have to go run
and update the enum and various switch statements. They're just done.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-07-26 Thread Justin Pryzby
On Mon, Jul 26, 2021 at 10:13:09AM -0400, Robert Haas wrote:
> I don't think walkdir() has any business calling LogStartupProgress()
> at all. It's supposed to be a generic function, not one that is only
> available to be called from the startup process, or has different
> behavior there. From my point of view, the right thing is to put the
> logging calls into the particular callbacks that SyncDataDirectory()
> uses.

You're right - this is better.

On Wed, Jul 21, 2021 at 04:47:32PM +0530, Bharath Rupireddy wrote:
> > > 1) I still don't see the need for two functions LogStartupProgress and
> > > LogStartupProgressComplete. Most of the code is duplicate. I think we
> > > can just do it with a single function something like [1]:
> >
> > I agree that one function can do this more succinctly.  I think it's best to
> > use a separate enum value for START operations and END operations.
> 
> Maybe I'm missing something here, but I don't understand the purpose
> of this. You can always combine two functions into one, but it's only
> worth doing if you end up with less code, which doesn't seem to be the
> case here.

 4 files changed, 39 insertions(+), 71 deletions(-)

> ... but I'm not exactly sure how to get there from here. Having only
> LogStartupProgress() but having it do a giant if-statement to figure
> out whether we're mid-phase or end-of-phase does not seem like the
> right approach.

I used a bool arg and negation to handle within a single switch.  Maybe it's
cleaner to use a separate enum value for each DONE, and set a local done flag.

 startup[29675] LOG:  database system was interrupted; last known up at 
2021-07-26 10:23:04 CDT
 startup[29675] LOG:  syncing data directory (fsync), elapsed time: 1.38 s, 
current path: ./pg_ident.conf
 startup[29675] LOG:  data directory sync (fsync) complete after 1.72 s
 startup[29675] LOG:  database system was not properly shut down; automatic 
recovery in progress
 startup[29675] LOG:  resetting unlogged relations (cleanup) complete after 
0.00 s
 startup[29675] LOG:  redo starts at 0/17BE500
 startup[29675] LOG:  redo in progress, elapsed time: 1.00 s, current LSN: 
0/35D7CB8
 startup[29675] LOG:  redo in progress, elapsed time: 2.00 s, current LSN: 
0/54A6918
 startup[29675] LOG:  redo in progress, elapsed time: 3.00 s, current LSN: 
0/7370570
 startup[29675] LOG:  redo in progress, elapsed time: 4.00 s, current LSN: 
0/924D8A0
 startup[29675] LOG:  redo done at 0/9B8 system usage: CPU: user: 4.28 s, 
system: 0.15 s, elapsed: 4.44 s
 startup[29675] LOG:  resetting unlogged relations (init) complete after 0.03 s
 startup[29675] LOG:  checkpoint starting: end-of-recovery immediate
 startup[29675] LOG:  checkpoint complete: wrote 9872 buffers (60.3%); 0 WAL 
file(s) added, 0 removed, 8 recycled; write=0.136 s, sync=0.801 s, total=1.260 
s; sync files=21, longest=0.774 s, average=B

>From 9443005040be52225498d58678b5faa1668bd2ad Mon Sep 17 00:00:00 2001
From: Nitin 
Date: Fri, 23 Jul 2021 15:46:56 +0530
Subject: [PATCH 1/2] Shows the progress of the operations performed during
 startup process. The interval between each progress update is configurable
 and it should be mentioned in seconds

---
 src/backend/access/transam/xlog.c | 178 ++
 src/backend/postmaster/startup.c  |   1 +
 src/backend/storage/file/fd.c |  13 ++
 src/backend/storage/file/reinit.c |  18 ++
 src/backend/utils/misc/guc.c  |  13 +-
 src/backend/utils/misc/postgresql.conf.sample |   6 +
 src/include/access/xlog.h |  18 ++
 src/include/utils/timeout.h   |   1 +
 8 files changed, 247 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 3479402272..4f052050f3 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -79,6 +79,7 @@
 #include "utils/relmapper.h"
 #include "utils/pg_rusage.h"
 #include "utils/snapmgr.h"
+#include "utils/timeout.h"
 #include "utils/timestamp.h"
 
 extern uint32 bootstrap_data_checksum_version;
@@ -397,6 +398,23 @@ static bool doRequestWalReceiverReply;
  */
 static XLogRecPtr RedoStartLSN = InvalidXLogRecPtr;
 
+/*
+ * Start timestamp of the operation that occur during startup process.
+ */
+static TimestampTz startupProcessOpStartTime;
+
+/*
+ * Indicates whether the startup progress interval mentioned by the user is
+ * elapsed or not. TRUE if timeout occurred, FALSE otherwise.
+ */
+static bool logStartupProgressTimeout;
+
+/*
+ * The interval between each progress update of the operations that occur
+ * during startup process.
+ */
+int	log_startup_progress_interval;
+
 /*--
  * Shared-memory data structures for XLOG control
  *
@@ -7351,6 +7369,8 @@ StartupXLOG(void)
 	(errmsg("redo starts at %X/%X",
 			LSN_FORMAT_ARGS(ReadRecPtr;
 
+			InitStartupProgress();
+
 			/*
 			 * main redo apply loop
 			 */
@@ -7358,6 +7378,8 @@ 

Re: when the startup process doesn't (logging startup delays)

2021-07-26 Thread Robert Haas
On Sun, Jul 25, 2021 at 1:56 PM Justin Pryzby  wrote:
> On Fri, Jul 23, 2021 at 04:09:47PM +0530, Nitin Jadhav wrote:
> > > I think walkdir() should only call LogStartupProgress(FSYNC_IN_PROGRESS, 
> > > path);
> > > when action == datadir_fsync_fname.
> >
> > I agree and fixed it.
>
> I saw that you fixed it by calling InitStartupProgress() after the walkdir()
> calls which do pre_sync_fname.  So then walkdir is calling
> LogStartupProgress(STARTUP_PROCESS_OP_FSYNC) even when it's not doing fsync,
> and then LogStartupProgress() is returning because !AmStartupProcess().
>
> That seems indirect, fragile, and confusing.  I suggest that walkdir() should
> take an argument for which operation to pass to LogStartupProgress().  You can
> pass a special enum for cases where nothing should be logged, like
> STARTUP_PROCESS_OP_NONE.

I don't think walkdir() has any business calling LogStartupProgress()
at all. It's supposed to be a generic function, not one that is only
available to be called from the startup process, or has different
behavior there. From my point of view, the right thing is to put the
logging calls into the particular callbacks that SyncDataDirectory()
uses.

> On Wed, Jul 21, 2021 at 04:47:32PM +0530, Bharath Rupireddy wrote:
> > 1) I still don't see the need for two functions LogStartupProgress and
> > LogStartupProgressComplete. Most of the code is duplicate. I think we
> > can just do it with a single function something like [1]:
>
> I agree that one function can do this more succinctly.  I think it's best to
> use a separate enum value for START operations and END operations.

Maybe I'm missing something here, but I don't understand the purpose
of this. You can always combine two functions into one, but it's only
worth doing if you end up with less code, which doesn't seem to be the
case here. The strings are all different and that's most of the
function, and the other stuff that gets done isn't the same either, so
you'd just end up with a bunch of if-statements. That doesn't seem
like an improvement.

Thinking further, I guess I understand it from the caller's
perspective. It's not necessarily clear why in some places we call
LogStartupProgress() and others LogStartupProgressComplete(). Someone
might expect a function with "complete" in the name like that to only
be called once at the very end, rather than once at the end of a
phase, and it does sort of make sense that you'd want to call one
function everywhere rather than sometimes one and sometimes the other
... but I'm not exactly sure how to get there from here. Having only
LogStartupProgress() but having it do a giant if-statement to figure
out whether we're mid-phase or end-of-phase does not seem like the
right approach.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-07-25 Thread Justin Pryzby
On Fri, Jul 23, 2021 at 04:09:47PM +0530, Nitin Jadhav wrote:
> > I think walkdir() should only call LogStartupProgress(FSYNC_IN_PROGRESS, 
> > path);
> > when action == datadir_fsync_fname.
> 
> I agree and fixed it.

I saw that you fixed it by calling InitStartupProgress() after the walkdir()
calls which do pre_sync_fname.  So then walkdir is calling
LogStartupProgress(STARTUP_PROCESS_OP_FSYNC) even when it's not doing fsync,
and then LogStartupProgress() is returning because !AmStartupProcess().

That seems indirect, fragile, and confusing.  I suggest that walkdir() should
take an argument for which operation to pass to LogStartupProgress().  You can
pass a special enum for cases where nothing should be logged, like
STARTUP_PROCESS_OP_NONE.

On Wed, Jul 21, 2021 at 04:47:32PM +0530, Bharath Rupireddy wrote:
> 1) I still don't see the need for two functions LogStartupProgress and
> LogStartupProgressComplete. Most of the code is duplicate. I think we
> can just do it with a single function something like [1]:

I agree that one function can do this more succinctly.  I think it's best to
use a separate enum value for START operations and END operations.

   switch(operation)
   {
   case STARTUP_PROCESS_OP_SYNCFS_START:
   ereport(...);
   break;

   case STARTUP_PROCESS_OP_SYNCFS_END:
   ereport(...);
   break;

   case STARTUP_PROCESS_OP_FSYNC_START:
   ereport(...);
   break;

   case STARTUP_PROCESS_OP_FSYNC_END:
   ereport(...);
   break;

   ...

-- 
Justin




Re: when the startup process doesn't (logging startup delays)

2021-07-23 Thread Nitin Jadhav
> I still don't see the need for two functions LogStartupProgress and
> LogStartupProgressComplete. Most of the code is duplicate. I think we
> can just do it with a single function something like [1]:

Initially I had written a common function for these 2. You can see
that in the earlier version of the patch. Later separated it since it
was too much for one function. If others also agree to make it common,
I can make that change.

> Why isn't there a
> LogStartupProgressComplete(STARTUP_PROCESS_OP_REDO)? Is it because of
> the below existing log message?
> ereport(LOG,
> (errmsg("redo done at %X/%X system usage: %s",
> LSN_FORMAT_ARGS(ReadRecPtr),
> pg_rusage_show(;

Yes. Adding another log message makes it redundant.

>  I think it should be, "," after occurred instead of "."
> + * elapsed or not. TRUE if timeout occurred, FALSE otherwise.
> instead of
> + * elapsed or not. TRUE if timeout occurred. FALSE otherwise.

Fixed.

> I think walkdir() should only call LogStartupProgress(FSYNC_IN_PROGRESS, 
> path);
> when action == datadir_fsync_fname.

I agree and fixed it.

> ResetUnloggedRelations() is calling
> ResetUnloggedRelationsInTablespaceDir("base", op);
> before calling InitStartupProgress().

Fixed.

> This shows StartupXLOG() calling ResetUnloggedRelations() twice.
> Should they both be shown ?  If so, maybe they should be distinguished as 
> here:
>
>elog(DEBUG1, "resetting unlogged relations: cleanup %d init %d",
> (op & UNLOGGED_RELATION_CLEANUP) != 0,
> (op & UNLOGGED_RELATION_INIT) != 0);

Fixed. Added separate codes to distinguish.

Please find the patch attached.




On Wed, Jul 21, 2021 at 6:43 PM Justin Pryzby  wrote:
>
> I think walkdir() should only call LogStartupProgress(FSYNC_IN_PROGRESS, 
> path);
> when action == datadir_fsync_fname.
>
> ResetUnloggedRelations() is calling
> ResetUnloggedRelationsInTablespaceDir("base", op);
> before calling InitStartupProgress().
>
> This shows StartupXLOG() calling ResetUnloggedRelations() twice.
> Should they both be shown ?  If so, maybe they should be distinguished as 
> here:
>
> elog(DEBUG1, "resetting unlogged relations: cleanup %d init %d",
>  (op & UNLOGGED_RELATION_CLEANUP) != 0,
>  (op & UNLOGGED_RELATION_INIT) != 0);
>
> On Wed, Jul 21, 2021 at 12:52:24PM +0530, Nitin Jadhav wrote:
> > 2021-07-20 18:47:32.046 IST [102230] LOG:  listening on IPv4 address 
> > "127.0.0.1", port 5445
> > 2021-07-20 18:47:32.048 IST [102230] LOG:  listening on Unix socket 
> > "/tmp/.s.PGSQL.5445"
> > 2021-07-20 18:47:32.051 IST [102234] LOG:  database system was interrupted; 
> > last known up at 2021-07-20 18:46:27 IST
> > 2021-07-20 18:47:32.060 IST [102234] LOG:  data directory sync (fsync) 
> > complete after 0.00 s
> > 2021-07-20 18:47:32.060 IST [102234] LOG:  database system was not properly 
> > shut down; automatic recovery in progress
> > 2021-07-20 18:47:32.063 IST [102234] LOG:  unlogged relations reset after 
> > 0.00 s
> > 2021-07-20 18:47:32.063 IST [102234] LOG:  redo starts at 0/14EF418
> > 2021-07-20 18:47:33.063 IST [102234] LOG:  redo in progress, elapsed time: 
> > 1.00 s, current LSN: 0/5C13D28
> > 2021-07-20 18:47:34.063 IST [102234] LOG:  redo in progress, elapsed time: 
> > 2.00 s, current LSN: 0/A289160
> > 2021-07-20 18:47:35.063 IST [102234] LOG:  redo in progress, elapsed time: 
> > 3.00 s, current LSN: 0/EE2DE10
> > 2021-07-20 18:47:35.563 IST [102234] LOG:  invalid record length at 
> > 0/115C63E0: wanted 24, got 0
> > 2021-07-20 18:47:35.563 IST [102234] LOG:  redo done at 0/115C63B8 system 
> > usage: CPU: user: 3.58 s, system: 0.14 s, elapsed: 3.50 s
> > 2021-07-20 18:47:35.564 IST [102234] LOG:  unlogged relations reset after 
> > 0.00 s
> > 2021-07-20 18:47:35.706 IST [102230] LOG:  database system is ready to 
> > accept connections
>
> --
> Justin
From 85c0e711cc48bcf2f76ed8ed0257af3ecf2de306 Mon Sep 17 00:00:00 2001
From: Nitin 
Date: Fri, 23 Jul 2021 15:46:56 +0530
Subject: [PATCH] Shows the progress of the operations performed during startup
 process. The interval between each progress update is configurable and it
 should be mentioned in seconds

---
 src/backend/access/transam/xlog.c | 178 ++
 src/backend/postmaster/startup.c  |   1 +
 src/backend/storage/file/fd.c |  13 ++
 src/backend/storage/file/reinit.c |  18 +++
 src/backend/utils/misc/guc.c  |  13 +-
 src/backend/utils/misc/postgresql.conf.sample |   6 +
 src/include/access/xlog.h |  18 +++
 src/include/utils/timeout.h   |   1 +
 8 files changed, 247 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 3479402..4f05205 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -79,6 +79,7 @@
 #include "utils/relmapper.h"
 #include "utils/pg_rusage.h"
 

Re: when the startup process doesn't (logging startup delays)

2021-07-21 Thread Justin Pryzby
I think walkdir() should only call LogStartupProgress(FSYNC_IN_PROGRESS, path);
when action == datadir_fsync_fname.

ResetUnloggedRelations() is calling
ResetUnloggedRelationsInTablespaceDir("base", op);
before calling InitStartupProgress().

This shows StartupXLOG() calling ResetUnloggedRelations() twice.
Should they both be shown ?  If so, maybe they should be distinguished as here:

elog(DEBUG1, "resetting unlogged relations: cleanup %d init %d",
 (op & UNLOGGED_RELATION_CLEANUP) != 0,
 (op & UNLOGGED_RELATION_INIT) != 0);

On Wed, Jul 21, 2021 at 12:52:24PM +0530, Nitin Jadhav wrote:
> 2021-07-20 18:47:32.046 IST [102230] LOG:  listening on IPv4 address 
> "127.0.0.1", port 5445
> 2021-07-20 18:47:32.048 IST [102230] LOG:  listening on Unix socket 
> "/tmp/.s.PGSQL.5445"
> 2021-07-20 18:47:32.051 IST [102234] LOG:  database system was interrupted; 
> last known up at 2021-07-20 18:46:27 IST
> 2021-07-20 18:47:32.060 IST [102234] LOG:  data directory sync (fsync) 
> complete after 0.00 s
> 2021-07-20 18:47:32.060 IST [102234] LOG:  database system was not properly 
> shut down; automatic recovery in progress
> 2021-07-20 18:47:32.063 IST [102234] LOG:  unlogged relations reset after 
> 0.00 s
> 2021-07-20 18:47:32.063 IST [102234] LOG:  redo starts at 0/14EF418
> 2021-07-20 18:47:33.063 IST [102234] LOG:  redo in progress, elapsed time: 
> 1.00 s, current LSN: 0/5C13D28
> 2021-07-20 18:47:34.063 IST [102234] LOG:  redo in progress, elapsed time: 
> 2.00 s, current LSN: 0/A289160
> 2021-07-20 18:47:35.063 IST [102234] LOG:  redo in progress, elapsed time: 
> 3.00 s, current LSN: 0/EE2DE10
> 2021-07-20 18:47:35.563 IST [102234] LOG:  invalid record length at 
> 0/115C63E0: wanted 24, got 0
> 2021-07-20 18:47:35.563 IST [102234] LOG:  redo done at 0/115C63B8 system 
> usage: CPU: user: 3.58 s, system: 0.14 s, elapsed: 3.50 s
> 2021-07-20 18:47:35.564 IST [102234] LOG:  unlogged relations reset after 
> 0.00 s
> 2021-07-20 18:47:35.706 IST [102230] LOG:  database system is ready to accept 
> connections

-- 
Justin




Re: when the startup process doesn't (logging startup delays)

2021-07-21 Thread Bharath Rupireddy
On Wed, Jul 21, 2021 at 12:52 PM Nitin Jadhav
 wrote:
> Please find the v5 patch attached. Kindly let me know your comments.

Thanks for the patch. Here are some comments on v5:
1) I still don't see the need for two functions LogStartupProgress and
LogStartupProgressComplete. Most of the code is duplicate. I think we
can just do it with a single function something like [1]:

2) Why isn't there a
LogStartupProgressComplete(STARTUP_PROCESS_OP_REDO)? Is it because of
the below existing log message?
ereport(LOG,
(errmsg("redo done at %X/%X system usage: %s",
LSN_FORMAT_ARGS(ReadRecPtr),
pg_rusage_show(;

3) I think it should be, "," after occurred instead of "."
+ * elapsed or not. TRUE if timeout occurred, FALSE otherwise.
instead of
+ * elapsed or not. TRUE if timeout occurred. FALSE otherwise.

[1]
+/*
+ * Logs the progress of the operations performed during the startup process.
+ * is_complete true/false indicates that the operation is finished/
+ * in-progress respectively.
+ */
+void
+LogStartupProgress(StartupProcessOp op, const char *path,
+  bool is_complete)
+{
+   long  secs;
+   int usecs;
+   int elapsed_ms;
+   int interval_in_ms;
+
+   /* If not called from the startup process then this feature is
of no use. */
+   if (!AmStartupProcess())
+   return;
+
+   /* If the feature is disabled, then no need to proceed further. */
+   if (log_startup_progress_interval < 0)
+   return;
+
+   /*
+* If the operation is in-progress and the timeout hasn't occurred, then
+* no need to log the details.
+*/
+   if (!is_complete && !logStartupProgressTimeout)
+   return;
+
+   /* Timeout has occurred. */
+   TimestampDifference(startupProcessOpStartTime,
+   GetCurrentTimestamp(),
+   , );
+
+   /*
+* If the operation is in-progress, enable the timer for the next log
+* message based on the time that current log message timer
was supposed to
+* fire.
+*/
+   if (!is_complete)
+   {
+   elapsed_ms = (secs * 1000) + (usecs / 1000);
+   interval_in_ms = log_startup_progress_interval * 1000;
+   enable_timeout_after(LOG_STARTUP_PROGRESS_TIMEOUT,
+
(interval_in_ms - (elapsed_ms % interval_in_ms)));
+   }
+
+   switch(op)
+   {
+   case STARTUP_PROCESS_OP_SYNCFS:
+   {
+   if (is_complete)
+   ereport(LOG,
+   (errmsg("data
directory sync (syncfs) complete after %ld.%02d s",
+
 secs, (usecs / 1;
+   else
+   ereport(LOG,
+
(errmsg("syncing data directory (syncfs), elapsed time: %ld.%02d s,
current path: %s",
+
 secs, (usecs / 1), path)));
+   }
+   break;
+   case STARTUP_PROCESS_OP_FSYNC:
+   {
+   if (is_complete)
+   ereport(LOG,
+   (errmsg("data
directory sync (fsync) complete after %ld.%02d s",
+
 secs, (usecs / 1;
+   else
+   ereport(LOG,
+
(errmsg("syncing data directory (fsync), elapsed time: %ld.%02d s,
current path: %s",
+
 secs, (usecs / 1), path)));
+   }
+   break;
+   case STARTUP_PROCESS_OP_REDO:
+   {
+   /*
+* No need to log redo completion
status here, as it will be
+* done in the caller.
+*/
+   if (!is_complete)
+   ereport(LOG,
+   (errmsg("redo
in progress, elapsed time: %ld.%02d s, current LSN: %X/%X",
+
 secs, (usecs / 1), LSN_FORMAT_ARGS(ReadRecPtr;
+   }
+   break;
+   case STARTUP_PROCESS_OP_RESET_UNLOGGED_REL:
+   {
+   if (is_complete)
+   ereport(LOG,
+
(errmsg("unlogged relations reset after %ld.%02d s",
+
 secs, (usecs / 1;
+   else
+   ereport(LOG,
+
(errmsg("resetting unlogged relations, elapsed time: %ld.%02d s,
current path: %s",
+
 secs, (usecs / 1), path)));
+   }
+   break;
+   default:
+   ereport(ERROR,
+

Re: when the startup process doesn't (logging startup delays)

2021-07-21 Thread Nitin Jadhav
> I am not sure I am getting the code flow correctly. From 
> CloseStartupProgress()
> naming it seems, it corresponds to InitStartupProgress() but what it is doing
> is similar to LogStartupProgress(). I think it should be renamed to be inlined
> with LogStartupProgress(), IMO.

Renamed CloseStartupProgress() to LogStartupProgressComplete().

> This part should be an assertion, it's the developer's responsibility to call
> correctly.

This code is not required at all due to the fix of the next comment.

> Since we do have a separate call for the in-progress operation and the
> end-operation, only a single enum would have been enough. If we do this, then 
> I
> think we should remove get_startup_process_operation_string() move messages to
> the respective function.

Fixed.

> I'd really like to see this enabled by default, say with a default
> interval of 10 seconds. If it has to be enabled explicitly, most
> people won't, but I think a lot of people would benefit from knowing
> why their system is slow to start up when that sort of thing happens.
> I don't see much downside to having it on by default either, since it
> shouldn't be expensive.  I think the GUC's units should be seconds, not
> milliseconds, though.

I agree that it is better to enable it by default. Changed the code
accordingly and changed the GUC's units to seconds.

> The messages you've added are capitalized, but the ones PostgreSQL
> has currently are not. You should conform to the existing style.

Fixed.

> The "crash recovery complete" message looks redundant with the "redo
> done" message. Also, in my mind, "redo" is one particular phase of
> crash recovery, so it looks really odd that "crash recovery" finishes
> first and then "redo" finishes. I think some thought is needed about
> the terminology here.

Yes. "redo" is one phase of the crash recovery. Even "resetting
unlogged relations" is also a part of the crash recovery. These 2 are
the major time consuming operations of the crash recovery task. There
is a separate log message to indicate the progress of "resetting the
unlogged relations". So instead of saying 'performing crash recovery",
it is better to say "redo in progress" and not add any additional
message at the end of redo, instead retain the existing message to
avoid redundancy.

> I'm not clear why I get a message about the data directory fsync but
> not about resetting unlogged relations. I wasn't really expecting to
> get a message about things that completed in less than the configured
> interval, although I find that I don't mind having it there either.
> But then it seems like it should be consistent across the various
> things we're timing, and resetting unlogged relations should certainly
> be one of those.

It is the same across all the things we'are timing. I was able to see
those messages on my machine. I feel there is not much overhead of
logging one message at the end of the operation even though it
completes within the configured interval. Following are the log
messages shown on my machine.

2021-07-20 18:47:32.046 IST [102230] LOG:  listening on IPv4 address
"127.0.0.1", port 5445
2021-07-20 18:47:32.048 IST [102230] LOG:  listening on Unix socket
"/tmp/.s.PGSQL.5445"
2021-07-20 18:47:32.051 IST [102234] LOG:  database system was
interrupted; last known up at 2021-07-20 18:46:27 IST
2021-07-20 18:47:32.060 IST [102234] LOG:  data directory sync (fsync)
complete after 0.00 s
2021-07-20 18:47:32.060 IST [102234] LOG:  database system was not
properly shut down; automatic recovery in progress
2021-07-20 18:47:32.063 IST [102234] LOG:  unlogged relations reset after 0.00 s
2021-07-20 18:47:32.063 IST [102234] LOG:  redo starts at 0/14EF418
.2021-07-20 18:47:33.063 IST [102234] LOG:  redo in progress, elapsed
time: 1.00 s, current LSN: 0/5C13D28
.2021-07-20 18:47:34.063 IST [102234] LOG:  redo in progress, elapsed
time: 2.00 s, current LSN: 0/A289160
.2021-07-20 18:47:35.063 IST [102234] LOG:  redo in progress, elapsed
time: 3.00 s, current LSN: 0/EE2DE10
2021-07-20 18:47:35.563 IST [102234] LOG:  invalid record length at
0/115C63E0: wanted 24, got 0
2021-07-20 18:47:35.563 IST [102234] LOG:  redo done at 0/115C63B8
system usage: CPU: user: 3.58 s, system: 0.14 s, elapsed: 3.50 s
2021-07-20 18:47:35.564 IST [102234] LOG:  unlogged relations reset after 0.00 s
2021-07-20 18:47:35.706 IST [102230] LOG:  database system is ready to
accept connections


> The way you've coded this has some drift. In a perfect world, I'd
> get a progress report at 1000.00 ms, 2000.00 ms, 3000.00 ms, etc.
> That's never going to be the case, because there's always going to be
> a slightly delay in responding to the timer interrupt. However, as
> you've coded it, the delay increases over time. The first "Performing
> crash recovery" message is only 373 ms late, but the last one is 4916
> ms late. To avoid this, you want to reschedule the timer interrupt
> based on the time the last one was supposed to fire, not the time it
> actually 

  1   2   >