Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-29 Thread Jeff Janes
On Thu, Jun 29, 2017 at 11:39 AM, Tom Lane  wrote:

> Jeff Janes  writes:
> > In the now-committed version of this, the 'pg_ctl start' returns
> > successfully as soon as the server reaches a consistent state. Which is
> OK,
> > except that it does the same thing when hot_standby=off.  When
> > hot_standby=off, I would expect it to wait for the end of recovery before
> > exiting with a success code.
>
> Um, won't it be waiting forever with that definition?
>
> regards, tom lane
>

No, this isn't streaming.  It hits the PITR limit (recovery_target_*), or
runs out of archived wal, and then it opens for business.


Cheers,

Jeff


Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-29 Thread Tom Lane
Jeff Janes  writes:
> In the now-committed version of this, the 'pg_ctl start' returns
> successfully as soon as the server reaches a consistent state. Which is OK,
> except that it does the same thing when hot_standby=off.  When
> hot_standby=off, I would expect it to wait for the end of recovery before
> exiting with a success code.

Um, won't it be waiting forever with that definition?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-29 Thread Andres Freund


On June 29, 2017 10:19:46 AM PDT, Jeff Janes  wrote:
>On Tue, Jun 27, 2017 at 11:59 AM, Tom Lane  wrote:
>
>> I wrote:
>> > Andres Freund  writes:
>> >> On 2017-06-26 17:38:03 -0400, Tom Lane wrote:
>> >>> Hm.  Take that a bit further, and we could drop the connection probes
>> >>> altogether --- just put the whole responsibility on the postmaster to
>> >>> show in the pidfile whether it's ready for connections or not.
>>
>> >> Yea, that seems quite appealing, both from an architectural, simplicity,
>> >> and log noise perspective. I wonder if there's some added reliability by
>> >> the connection probe though? Essentially wondering if it'd be worthwhile
>> >> to keep a single connection test at the end. I'm somewhat disinclined
>> >> though.
>>
>> > I agree --- part of the appeal of this idea is that there could be a net
>> > subtraction of code from pg_ctl.  (I think it wouldn't have to link libpq
>> > anymore at all, though maybe I forgot something.)  And you get rid of a
>> > bunch of can't-connect failure modes, eg kernel packet filter in the way,
>> > which probably outweighs any hypothetical reliability gain from confirming
>> > the postmaster state the old way.
>>
>> Here's a draft patch for that.  I quite like the results --- this seems
>> way simpler and more reliable than what pg_ctl has done up to now.
>> However, it's certainly arguable that this is too much change for an
>> optional post-beta patch.
>
>In the now-committed version of this, the 'pg_ctl start' returns
>successfully as soon as the server reaches a consistent state. Which is OK,
>except that it does the same thing when hot_standby=off.  When
>hot_standby=off, I would expect it to wait for the end of recovery before
>exiting with a success code.

I've not looked at the committed version, but earlier versions had code dealing 
with that difference; essentially doing what you suggest.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-29 Thread Jeff Janes
On Tue, Jun 27, 2017 at 11:59 AM, Tom Lane  wrote:

> I wrote:
> > Andres Freund  writes:
> >> On 2017-06-26 17:38:03 -0400, Tom Lane wrote:
> >>> Hm.  Take that a bit further, and we could drop the connection probes
> >>> altogether --- just put the whole responsibility on the postmaster to
> >>> show in the pidfile whether it's ready for connections or not.
>
> >> Yea, that seems quite appealing, both from an architectural, simplicity,
> >> and log noise perspective. I wonder if there's some added reliability by
> >> the connection probe though? Essentially wondering if it'd be worthwhile
> >> to keep a single connection test at the end. I'm somewhat disinclined
> >> though.
>
> > I agree --- part of the appeal of this idea is that there could be a net
> > subtraction of code from pg_ctl.  (I think it wouldn't have to link libpq
> > anymore at all, though maybe I forgot something.)  And you get rid of a
> > bunch of can't-connect failure modes, eg kernel packet filter in the way,
> > which probably outweighs any hypothetical reliability gain from
> confirming
> > the postmaster state the old way.
>
> Here's a draft patch for that.  I quite like the results --- this seems
> way simpler and more reliable than what pg_ctl has done up to now.
> However, it's certainly arguable that this is too much change for an
> optional post-beta patch.


In the now-committed version of this, the 'pg_ctl start' returns
successfully as soon as the server reaches a consistent state. Which is OK,
except that it does the same thing when hot_standby=off.  When
hot_standby=off, I would expect it to wait for the end of recovery before
exiting with a success code.

Cheers,

Jeff


Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-28 Thread Ants Aasma
On Wed, Jun 28, 2017 at 8:31 PM, Tom Lane  wrote:
> Andres Freund  writes:
>> On 2017-06-27 14:59:18 -0400, Tom Lane wrote:
>>> However, it's certainly arguable that this is too much change for an
>>> optional post-beta patch.
>
>> Yea, I think there's a valid case to be made for that. I'm still
>> inclined to go along with this, it seems we're otherwise just adding
>> complexity we're going to remove in a bit again.
>
> I'm not hearing anyone speaking against doing this now, so I'm going
> to go ahead with it.

+1 for going for it. Besides pg_ctl it would also help cluster
management software. In Patroni we basically reimplement pg_ctl to get
at the started postmaster pid to detect a crash during postmaster
startup earlier and to be able to reliably cancel start. Getting the
current state from the pid file sounds better than having a loop poke
the server with pg_isready.
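To illustrate what cluster managers could do instead of polling with pg_isready: a minimal sketch of pulling the status word out of postmaster.pid contents. The assumptions here (status on the 8th line, fixed-width 8-byte tokens such as "ready   ") come from this thread's draft patch and are not presented as a stable interface.

```c
#include <string.h>

/*
 * Sketch only: extract the postmaster status word from the contents of
 * postmaster.pid.  Assumes the status is the status_lineno'th line and
 * is written as a fixed-width token padded with spaces (e.g. "ready   ").
 * Returns 0 on success, -1 if the line is missing or out is too small.
 */
int
pidfile_status(const char *contents, int status_lineno,
               char *out, size_t outlen)
{
    const char *p = contents;

    /* skip to the start of the target line */
    for (int lineno = 1; lineno < status_lineno; lineno++)
    {
        p = strchr(p, '\n');
        if (p == NULL)
            return -1;          /* file has fewer lines than expected */
        p++;
    }

    const char *eol = strchr(p, '\n');
    size_t      len = eol ? (size_t) (eol - p) : strlen(p);

    /* strip the space padding used to keep the line fixed-width */
    while (len > 0 && p[len - 1] == ' ')
        len--;

    if (len + 1 > outlen)
        return -1;
    memcpy(out, p, len);
    out[len] = '\0';
    return 0;
}
```

A manager process could re-read the file on a timer and act on transitions, e.g. treat a disappearing file or a missing status line as "postmaster gone".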

To introduce feature creep, I would like to see more detailed states
than proposed here. Specifically I'm interested in knowing when
PM_WAIT_BACKENDS ends.

When we lose quorum we restart PostgreSQL as a standby. We use a
regular fast shutdown, but that can take a long time due to the
shutdown checkpoint. The leader lease may run out during this time so
we would have to escalate to immediate shutdown or have a watchdog
fence the node. If we knew that no user backends are left we can let
the shutdown checkpoint complete at leisure without risk for split
brain.

Regards,
Ants Aasma




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-28 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> So when I removed the miscadmin.h include, I found out that pg_ctl is
>> also relying on PG_BACKEND_VERSIONSTR from that file.
>> 
>> There are at least three things we could do here:
>> 
>> 1. Give this up as not worth this much trouble.
>> 
>> 2. Move PG_BACKEND_VERSIONSTR into pg_config.h to go along with the
>> other version-related macros.

> pg_config.h sounds like a decent enough solution.  It's a bit strange
> this hasn't come up before, given that that symbol is used more in
> frontend environ than backend.

Right at the moment, what I've done is to stick it into port.h beside
the declaration of find_other_exec, since the existing uses are all
as parameters of find_other_exec[_or_die].  But maybe that's a bit too
expedient.

regards, tom lane




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-28 Thread Alvaro Herrera
Tom Lane wrote:
> Andres Freund  writes:

> So when I removed the miscadmin.h include, I found out that pg_ctl is
> also relying on PG_BACKEND_VERSIONSTR from that file.
> 
> There are at least three things we could do here:
> 
> 1. Give this up as not worth this much trouble.
> 
> 2. Move PG_BACKEND_VERSIONSTR into pg_config.h to go along with the
> other version-related macros.

pg_config.h sounds like a decent enough solution.  It's a bit strange
this hasn't come up before, given that that symbol is used more in
frontend environ than backend.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-28 Thread Tom Lane
Andres Freund  writes:
> On 2017-06-28 13:31:27 -0400, Tom Lane wrote:
>> While looking this over again, I got worried about the fact that pg_ctl
>> is #including "miscadmin.h".  That's a pretty low-level backend header
>> and it wouldn't be surprising at all if somebody tried to put stuff in
>> it that wouldn't compile frontend-side.  I think we should take the
>> opportunity, as long as we're touching this stuff, to split the #defines
>> that describe the contents of postmaster.pid into a separate header file.
>> Maybe "utils/pidfile.h" ?

> Yes, that sounds like a valid concern, and solution.

So when I removed the miscadmin.h include, I found out that pg_ctl is
also relying on PG_BACKEND_VERSIONSTR from that file.

There are at least three things we could do here:

1. Give this up as not worth this much trouble.

2. Move PG_BACKEND_VERSIONSTR into pg_config.h to go along with the
other version-related macros.

3. Give PG_BACKEND_VERSIONSTR its very own new header file.

Any preferences?

regards, tom lane




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-28 Thread Andres Freund
On 2017-06-28 13:31:27 -0400, Tom Lane wrote:
> I'm not hearing anyone speaking against doing this now, so I'm going
> to go ahead with it.

Cool.


> While looking this over again, I got worried about the fact that pg_ctl
> is #including "miscadmin.h".  That's a pretty low-level backend header
> and it wouldn't be surprising at all if somebody tried to put stuff in
> it that wouldn't compile frontend-side.  I think we should take the
> opportunity, as long as we're touching this stuff, to split the #defines
> that describe the contents of postmaster.pid into a separate header file.
> Maybe "utils/pidfile.h" ?

Yes, that sounds like a valid concern, and solution.

- Andres




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-28 Thread Tom Lane
Andres Freund  writes:
> On 2017-06-27 14:59:18 -0400, Tom Lane wrote:
>> However, it's certainly arguable that this is too much change for an
>> optional post-beta patch.

> Yea, I think there's a valid case to be made for that. I'm still
> inclined to go along with this, it seems we're otherwise just adding
> complexity we're going to remove in a bit again.

I'm not hearing anyone speaking against doing this now, so I'm going
to go ahead with it.

> The 8-space thing in multiple places is a bit ugly.  How about having
> a bunch of constants declared in one place?

While looking this over again, I got worried about the fact that pg_ctl
is #including "miscadmin.h".  That's a pretty low-level backend header
and it wouldn't be surprising at all if somebody tried to put stuff in
it that wouldn't compile frontend-side.  I think we should take the
opportunity, as long as we're touching this stuff, to split the #defines
that describe the contents of postmaster.pid into a separate header file.
Maybe "utils/pidfile.h" ?

regards, tom lane




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-27 Thread Tom Lane
Andres Freund  writes:
> On 2017-06-27 14:59:18 -0400, Tom Lane wrote:
>> If we decide that it has to wait for v11,
>> I'd address Jeff's complaint by hacking the loop behavior in
>> test_postmaster_connection, which'd be ugly but not many lines of code.

> Basically increasing the wait time over time?

I was thinking of just dropping back to once-per-second after a couple
of seconds, with something along the lines of this in place of the
existing sleep at the bottom of the loop:

    if (i >= 2 * WAITS_PER_SEC)
    {
        pg_usleep(USEC_PER_SEC);
        i += WAITS_PER_SEC - 1;
    }
    else
        pg_usleep(USEC_PER_SEC / WAITS_PER_SEC);
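A self-contained rendering of the back-off sketch above, for readers following along. The WAITS_PER_SEC value is an assumption for illustration; the real constants and loop live in pg_ctl's test_postmaster_connection().

```c
/*
 * Sketch of the back-off behavior described above.  The constant values
 * here are assumptions, not taken from pg_ctl.c.
 */
#define WAITS_PER_SEC 10
#define USEC_PER_SEC  1000000

/*
 * Given the current iteration counter i, return the microseconds to sleep
 * and store in *next_i the counter the next iteration would see
 * (mirroring "i += WAITS_PER_SEC - 1" plus the loop's own i++).
 */
long
next_sleep_usec(int i, int *next_i)
{
    if (i >= 2 * WAITS_PER_SEC)
    {
        *next_i = i + WAITS_PER_SEC;
        return USEC_PER_SEC;            /* back off to one probe per second */
    }
    *next_i = i + 1;
    return USEC_PER_SEC / WAITS_PER_SEC;    /* fast probing for the first ~2s */
}
```

Net effect: roughly ten probes per second for the first two seconds, then one per second for the remainder of the wait.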


> The 8-space thing in multiple places is a bit ugly.  How about having
> a bunch of constants declared in one place? Alternatively make it
> something like: status: $c, where $c is a one character code for the
> various states?

Yeah, we could add the string values as macros in miscadmin.h, perhaps.
I don't like the single-character idea --- if we do expand the number of
things reported this way, that could get restrictive.
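For illustration, such macros could look like the following. The fixed 8-byte width matches the draft patch's convention; the identifier names are assumptions, not necessarily what was committed.

```c
#include <string.h>

/*
 * Sketch: fixed-width postmaster status strings.  All values are padded
 * to the same length (8 bytes) so that rewriting the status line in
 * postmaster.pid never shortens it -- see the comments for
 * AddToDataDirLockFile() quoted elsewhere in this thread.  Identifier
 * names here are illustrative.
 */
#define PM_STATUS_STARTING "starting"   /* postmaster starting up */
#define PM_STATUS_STOPPING "stopping"   /* postmaster shutting down */
#define PM_STATUS_READY    "ready   "   /* ready for connections */
#define PM_STATUS_STANDBY  "standby "   /* in recovery, not accepting connections */
```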

regards, tom lane




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-27 Thread Andres Freund
Hi,

On 2017-06-27 14:59:18 -0400, Tom Lane wrote:
> Here's a draft patch for that.  I quite like the results --- this seems
> way simpler and more reliable than what pg_ctl has done up to now.

Yea, I like that too.


> However, it's certainly arguable that this is too much change for an
> optional post-beta patch.

Yea, I think there's a valid case to be made for that. I'm still
inclined to go along with this, it seems we're otherwise just adding
complexity we're going to remove in a bit again.


> If we decide that it has to wait for v11,
> I'd address Jeff's complaint by hacking the loop behavior in
> test_postmaster_connection, which'd be ugly but not many lines of code.

Basically increasing the wait time over time?


> Note that I followed the USE_SYSTEMD patch's lead as to where to report
> postmaster state changes.  Arguably, in the standby-server case, we
> should not report the postmaster is ready until we've reached consistency.
> But that would require additional signaling from the startup process
> to the postmaster, so it seems like a separate change if we want it.

I suspect we're going to want to add more states to this over time, but
as you say, there's no need to hurry.


>   /*
> +  * Report postmaster status in the postmaster.pid file, to allow pg_ctl to
> +  * see what's happening.  Note that all strings written to the status line
> +  * must be the same length, per comments for AddToDataDirLockFile().  We
> +  * currently make them all 8 bytes, padding with spaces.
> +  */
> + AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "starting");

The 8-space thing in multiple places is a bit ugly.  How about having
a bunch of constants declared in one place? Alternatively make it
something like: status: $c, where $c is a one character code for the
various states?


> @@ -1149,8 +1149,9 @@ TouchSocketLockFiles(void)
>   *
>   * Note: because we don't truncate the file, if we were to rewrite a line
>   * with less data than it had before, there would be garbage after the last
> - * line.  We don't ever actually do that, so not worth adding another kernel
> - * call to cover the possibility.
> + * line.  While we could fix that by adding a truncate call, that would make
> + * the file update non-atomic, which we'd rather avoid.  Therefore, callers
> + * should endeavor never to shorten a line once it's been written.
>   */
>  void
>  AddToDataDirLockFile(int target_line, const char *str)
> @@ -1193,19 +1194,26 @@ AddToDataDirLockFile(int target_line, co
>   srcptr = srcbuffer;
>   for (lineno = 1; lineno < target_line; lineno++)
>   {
> - if ((srcptr = strchr(srcptr, '\n')) == NULL)
> - {
> - elog(LOG, "incomplete data in \"%s\": found only %d newlines while trying to add line %d",
> -  DIRECTORY_LOCK_FILE, lineno - 1, target_line);
> - close(fd);
> - return;
> - }
> - srcptr++;
> + char   *eol = strchr(srcptr, '\n');
> +
> + if (eol == NULL)
> + break;  /* not enough lines in file yet */
> + srcptr = eol + 1;
>   }
>   memcpy(destbuffer, srcbuffer, srcptr - srcbuffer);
>   destptr = destbuffer + (srcptr - srcbuffer);
>  
>   /*
> +  * Fill in any missing lines before the target line, in case lines are
> +  * added to the file out of order.
> +  */
> + for (; lineno < target_line; lineno++)
> + {
> + if (destptr < destbuffer + sizeof(destbuffer))
> + *destptr++ = '\n';
> + }
> +
> + /*
>* Write or rewrite the target line.
>*/
>   snprintf(destptr, destbuffer + sizeof(destbuffer) - destptr,
> "%s\n", str);

Not this patches fault, and shouldn't be changed now, but this is a
fairly weird way to manage and update this file.

Greetings,

Andres Freund




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-27 Thread Tom Lane
I wrote:
> Andres Freund  writes:
>> On 2017-06-26 17:38:03 -0400, Tom Lane wrote:
>>> Hm.  Take that a bit further, and we could drop the connection probes
>>> altogether --- just put the whole responsibility on the postmaster to
>>> show in the pidfile whether it's ready for connections or not.

>> Yea, that seems quite appealing, both from an architectural, simplicity,
>> and log noise perspective. I wonder if there's some added reliability by
>> the connection probe though? Essentially wondering if it'd be worthwhile
>> to keep a single connection test at the end. I'm somewhat disinclined
>> though.

> I agree --- part of the appeal of this idea is that there could be a net
> subtraction of code from pg_ctl.  (I think it wouldn't have to link libpq
> anymore at all, though maybe I forgot something.)  And you get rid of a
> bunch of can't-connect failure modes, eg kernel packet filter in the way,
> which probably outweighs any hypothetical reliability gain from confirming
> the postmaster state the old way.

Here's a draft patch for that.  I quite like the results --- this seems
way simpler and more reliable than what pg_ctl has done up to now.
However, it's certainly arguable that this is too much change for an
optional post-beta patch.  If we decide that it has to wait for v11,
I'd address Jeff's complaint by hacking the loop behavior in
test_postmaster_connection, which'd be ugly but not many lines of code.

Note that I followed the USE_SYSTEMD patch's lead as to where to report
postmaster state changes.  Arguably, in the standby-server case, we
should not report the postmaster is ready until we've reached consistency.
But that would require additional signaling from the startup process
to the postmaster, so it seems like a separate change if we want it.

Thoughts?

regards, tom lane

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 2874f63..38f534f 100644
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
*** PostmasterMain(int argc, char *argv[])
*** 1341,1346 
--- 1341,1354 
  #endif
  
  	/*
+ 	 * Report postmaster status in the postmaster.pid file, to allow pg_ctl to
+ 	 * see what's happening.  Note that all strings written to the status line
+ 	 * must be the same length, per comments for AddToDataDirLockFile().  We
+ 	 * currently make them all 8 bytes, padding with spaces.
+ 	 */
+ 	AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "starting");
+ 
+ 	/*
  	 * We're ready to rock and roll...
  	 */
  	StartupPID = StartupDataBase();
*** pmdie(SIGNAL_ARGS)
*** 2608,2613 
--- 2616,2624 
  			Shutdown = SmartShutdown;
  			ereport(LOG,
  	(errmsg("received smart shutdown request")));
+ 
+ 			/* Report status */
+ 			AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "stopping");
  #ifdef USE_SYSTEMD
  			sd_notify(0, "STOPPING=1");
  #endif
*** pmdie(SIGNAL_ARGS)
*** 2663,2668 
--- 2674,2682 
  			Shutdown = FastShutdown;
  			ereport(LOG,
  	(errmsg("received fast shutdown request")));
+ 
+ 			/* Report status */
+ 			AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "stopping");
  #ifdef USE_SYSTEMD
  			sd_notify(0, "STOPPING=1");
  #endif
*** pmdie(SIGNAL_ARGS)
*** 2727,2732 
--- 2741,2749 
  			Shutdown = ImmediateShutdown;
  			ereport(LOG,
  	(errmsg("received immediate shutdown request")));
+ 
+ 			/* Report status */
+ 			AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "stopping");
  #ifdef USE_SYSTEMD
  			sd_notify(0, "STOPPING=1");
  #endif
*** reaper(SIGNAL_ARGS)
*** 2872,2877 
--- 2889,2896 
  			ereport(LOG,
  	(errmsg("database system is ready to accept connections")));
  
+ 			/* Report status */
+ 			AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "ready   ");
  #ifdef USE_SYSTEMD
  			sd_notify(0, "READY=1");
  #endif
*** sigusr1_handler(SIGNAL_ARGS)
*** 5005,5014 
  		if (XLogArchivingAlways())
  			PgArchPID = pgarch_start();
  
! #ifdef USE_SYSTEMD
  		if (!EnableHotStandby)
  			sd_notify(0, "READY=1");
  #endif
  
  		pmState = PM_RECOVERY;
  	}
--- 5024,5041 
  		if (XLogArchivingAlways())
  			PgArchPID = pgarch_start();
  
! 		/*
! 		 * If we aren't planning to enter hot standby mode later, treat
! 		 * RECOVERY_STARTED as meaning we're out of startup, and report status
! 		 * accordingly.
! 		 */
  		if (!EnableHotStandby)
+ 		{
+ 			AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "standby ");
+ #ifdef USE_SYSTEMD
  			sd_notify(0, "READY=1");
  #endif
+ 		}
  
  		pmState = PM_RECOVERY;
  	}
*** sigusr1_handler(SIGNAL_ARGS)
*** 5024,5029 
--- 5051,5058 
  		ereport(LOG,
  (errmsg("database system is ready to accept read only connections")));
  
+ 		/* Report status */
+ 		AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "ready   ");
  #ifdef USE_SYSTEMD
  		sd_notify(0, "READY=1");
  #endif
diff --git 

Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-26 Thread Tom Lane
Andres Freund  writes:
> On 2017-06-26 17:38:03 -0400, Tom Lane wrote:
>> Hm.  Take that a bit further, and we could drop the connection probes
>> altogether --- just put the whole responsibility on the postmaster to
>> show in the pidfile whether it's ready for connections or not.

> Yea, that seems quite appealing, both from an architectural, simplicity,
> and log noise perspective. I wonder if there's some added reliability by
> the connection probe though? Essentially wondering if it'd be worthwhile
> to keep a single connection test at the end. I'm somewhat disinclined
> though.

I agree --- part of the appeal of this idea is that there could be a net
subtraction of code from pg_ctl.  (I think it wouldn't have to link libpq
anymore at all, though maybe I forgot something.)  And you get rid of a
bunch of can't-connect failure modes, eg kernel packet filter in the way,
which probably outweighs any hypothetical reliability gain from confirming
the postmaster state the old way.

This would mean that v10 pg_ctl could not be used to start/stop older
postmasters, which is flexibility we tried to preserve in the past.
But I see that we already gave that up in quite a few code paths because
of their attempts to read pg_control, so I think it's a concern we can
ignore.

I'll draft something up later.

regards, tom lane




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-26 Thread Andres Freund
On 2017-06-26 17:38:03 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2017-06-26 17:30:30 -0400, Tom Lane wrote:
> >> No, I don't like that at all.  Has race conditions against updates
> >> coming from the startup process.
> 
> > You'd obviously have to take the appropriate locks.  I think the issue
> > here is less race conditions, and more that architecturally we'd
> > interact with shmem too much.
> 
> Uh, we are *not* taking any locks in the postmaster.

I'm not sure why you're 'Uh'ing, when my my point pretty much is that we
do not want to do so?


> Hm.  Take that a bit further, and we could drop the connection probes
> altogether --- just put the whole responsibility on the postmaster to
> show in the pidfile whether it's ready for connections or not.

Yea, that seems quite appealing, both from an architectural, simplicity,
and log noise perspective. I wonder if there's some added reliability by
the connection probe though? Essentially wondering if it'd be worthwhile
to keep a single connection test at the end. I'm somewhat disinclined
though.

- Andres




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-26 Thread Tom Lane
Andres Freund  writes:
> On 2017-06-26 17:30:30 -0400, Tom Lane wrote:
>> No, I don't like that at all.  Has race conditions against updates
>> coming from the startup process.

> You'd obviously have to take the appropriate locks.  I think the issue
> here is less race conditions, and more that architecturally we'd
> interact with shmem too much.

Uh, we are *not* taking any locks in the postmaster.

>> Yeah, that would be a different way to go at it.  The postmaster would
>> probably just write the state of the hot_standby GUC to the file, and
>> pg_ctl would have to infer things from there.

> I'd actually say we should just mirror the existing
> #ifdef USE_SYSTEMD
>   if (!EnableHotStandby)
>   sd_notify(0, "READY=1");
> #endif
> with corresponding pidfile updates - doesn't really seem necessary for
> pg_ctl to do more?

Hm.  Take that a bit further, and we could drop the connection probes
altogether --- just put the whole responsibility on the postmaster to
show in the pidfile whether it's ready for connections or not.

regards, tom lane




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-26 Thread Andres Freund
On 2017-06-26 17:30:30 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > It'd be quite possible to address the race-condition by moving the
> > updating of the control file to postmaster, to the
> > CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) block. That'd require
> > updating the control file from postmaster, which'd be somewhat ugly.
> 
> No, I don't like that at all.  Has race conditions against updates
> coming from the startup process.

You'd obviously have to take the appropriate locks.  I think the issue
here is less race conditions, and more that architecturally we'd
interact with shmem too much.

> > Perhaps that indicates that field shouldn't be in pg_control, but in the
> > pid file?
> 
> Yeah, that would be a different way to go at it.  The postmaster would
> probably just write the state of the hot_standby GUC to the file, and
> pg_ctl would have to infer things from there.

I'd actually say we should just mirror the existing
#ifdef USE_SYSTEMD
	if (!EnableHotStandby)
		sd_notify(0, "READY=1");
#endif
with corresponding pidfile updates - doesn't really seem necessary for
pg_ctl to do more?

Greetings,

Andres Freund




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-26 Thread Tom Lane
Andres Freund  writes:
> It'd be quite possible to address the race-condition by moving the
> updating of the control file to postmaster, to the
> CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) block. That'd require
> updating the control file from postmaster, which'd be somewhat ugly.

No, I don't like that at all.  Has race conditions against updates
coming from the startup process.

> Perhaps that indicates that field shouldn't be in pg_control, but in the
> pid file?

Yeah, that would be a different way to go at it.  The postmaster would
probably just write the state of the hot_standby GUC to the file, and
pg_ctl would have to infer things from there.

regards, tom lane




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-26 Thread Andres Freund
Hi,

On 2017-06-26 16:49:07 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > Arguably we could and should improve the logic when the server has
> > started, right now it's pretty messy because we never treat a standby as
> > up if hot_standby is disabled...
> 
> True.  If you could tell the difference between "HS disabled" and "HS not
> enabled yet" from pg_control, that would make pg_ctl's behavior with
> cold-standby servers much cleaner.  Maybe it *is* worth messing with the
> contents of pg_control at this late hour.

I'm +0.5.


> My inclination for the least invasive fix is to leave the DBState
> enum alone and add a separate hot-standby state field with three
> values (disabled/not-yet-enabled/enabled).

Yea, that seems sane.


> Then pg_ctl would start
> probing the postmaster when it saw either DB_IN_PRODUCTION DBstate
> or hot-standby-enabled.  (It'd almost not have to probe the postmaster
> at all, except there's a race condition that the startup process
> will probably change the field a little before the postmaster gets
> the word to open the gates.)  On the other hand, if it saw
> DB_IN_ARCHIVE_RECOVERY with hot standby disabled, it'd stop waiting.

It'd be quite possible to address the race-condition by moving the
updating of the control file to postmaster, to the
CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) block. That'd require
updating the control file from postmaster, which'd be somewhat ugly.
Perhaps that indicates that field shouldn't be in pg_control, but in the
pid file?

Greetings,

Andres Freund




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-26 Thread Tom Lane
Andres Freund  writes:
> Arguably we could and should improve the logic when the server has
> started, right now it's pretty messy because we never treat a standby as
> up if hot_standby is disabled...

True.  If you could tell the difference between "HS disabled" and "HS not
enabled yet" from pg_control, that would make pg_ctl's behavior with
cold-standby servers much cleaner.  Maybe it *is* worth messing with the
contents of pg_control at this late hour.

My inclination for the least invasive fix is to leave the DBState
enum alone and add a separate hot-standby state field with three
values (disabled/not-yet-enabled/enabled).  Then pg_ctl would start
probing the postmaster when it saw either DB_IN_PRODUCTION DBstate
or hot-standby-enabled.  (It'd almost not have to probe the postmaster
at all, except there's a race condition that the startup process
will probably change the field a little before the postmaster gets
the word to open the gates.)  On the other hand, if it saw
DB_IN_ARCHIVE_RECOVERY with hot standby disabled, it'd stop waiting.
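The proposed decision logic can be sketched as follows. The enum names and the reduced DBState are illustrative only, under the assumption of a three-valued hot-standby field alongside the existing DBState; the real definitions live in pg_control.h.

```c
/*
 * Sketch of the three-valued hot-standby field proposed above.
 * Names are illustrative, not from pg_control.h.
 */
typedef enum HotStandbyState
{
    HS_DISABLED,            /* hot_standby = off */
    HS_NOT_YET_ENABLED,     /* hot_standby = on, consistency not yet reached */
    HS_ENABLED              /* accepting read-only connections */
} HotStandbyState;

typedef enum DBState
{
    DB_IN_ARCHIVE_RECOVERY, /* subset of the real enum, for illustration */
    DB_IN_PRODUCTION
} DBState;

/* Should pg_ctl start probing the postmaster for connections? */
int
should_probe(DBState db, HotStandbyState hs)
{
    return db == DB_IN_PRODUCTION || hs == HS_ENABLED;
}

/* Should pg_ctl stop waiting, because connections will never be accepted? */
int
should_stop_waiting(DBState db, HotStandbyState hs)
{
    return db == DB_IN_ARCHIVE_RECOVERY && hs == HS_DISABLED;
}
```

The race Tom mentions means should_probe() going true slightly precedes the postmaster actually opening the gates, so pg_ctl would still loop briefly on connection attempts after the flag flips.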

Any objections to that design sketch?  Do we need to distinguish
between master and slave servers in the when-to-stop-waiting logic?

regards, tom lane




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-26 Thread Tom Lane
I wrote:
> Andres Freund  writes:
> It'd not be unreasonable to check pg_control first, and check via the
> protocol only after that indicates readiness.

> Hm, that's a thought.  The problem here isn't the frequency of checks,
> but the log spam.

Actually, that wouldn't help much as things stand, because you can't
tell from pg_control whether hot standby is active.  Assuming that
we want "pg_ctl start" to come back as soon as connections are allowed,
it'd have to start probing when it sees DB_IN_ARCHIVE_RECOVERY, which
means Jeff still has a problem with long recovery sessions.

We could maybe address that by changing the set of states in pg_control
(or perhaps simpler, adding a "hot standby active" flag there).  That
might have wider consequences than we really want to deal with post-beta1
though.

regards, tom lane




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-26 Thread Andres Freund
On 2017-06-26 16:26:00 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2017-06-26 16:19:16 -0400, Tom Lane wrote:
> >> Sure, what do you think an appropriate behavior would be?
> 
> > It'd not be unreasonable to check pg_control first, and check via the
> > protocol only after that indicates readiness.
> 
> Hm, that's a thought.  The problem here isn't the frequency of checks,
> but the log spam.

Right.  I think to deal with hot-standby we'd probably have to add new
state to the control file however. We don't just want to treat the
server as ready once DB_IN_PRODUCTION is reached.

Arguably we could and should improve the logic when the server has
started; right now it's pretty messy because we never treat a standby as
up if hot_standby is disabled...

- Andres




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-26 Thread Tom Lane
Andres Freund  writes:
> On 2017-06-26 16:19:16 -0400, Tom Lane wrote:
>> Sure, what do you think an appropriate behavior would be?

> It'd not be unreasonable to check pg_control first, and check via the
> protocol only after that indicates readiness.

Hm, that's a thought.  The problem here isn't the frequency of checks,
but the log spam.

> Doesn't quite seem like something backpatchable tho.

I didn't back-patch the pg_ctl change anyway, so that's no issue.

regards, tom lane




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-26 Thread Andres Freund
On 2017-06-26 16:19:16 -0400, Tom Lane wrote:
> Jeff Janes  writes:
> > The 10-fold increase in log spam during long PITR recoveries is a bit
> > unfortunate.
> 
> > 9153  2017-06-26 12:55:40.243 PDT FATAL:  the database system is starting up
> > 9154  2017-06-26 12:55:40.345 PDT FATAL:  the database system is starting up
> > 9156  2017-06-26 12:55:40.447 PDT FATAL:  the database system is starting up
> > 9157  2017-06-26 12:55:40.550 PDT FATAL:  the database system is starting up
> > ...
> 
> > I can live with it, but could we use an escalating wait time so it slows
> > back down to once a second after a while?
> 
> Sure, what do you think an appropriate behavior would be?

It'd not be unreasonable to check pg_control first, and check via the
protocol only after that indicates readiness. Doesn't quite seem like
something backpatchable tho.

- Andres




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-26 Thread Tom Lane
Jeff Janes  writes:
> The 10-fold increase in log spam during long PITR recoveries is a bit
> unfortunate.

> 9153  2017-06-26 12:55:40.243 PDT FATAL:  the database system is starting up
> 9154  2017-06-26 12:55:40.345 PDT FATAL:  the database system is starting up
> 9156  2017-06-26 12:55:40.447 PDT FATAL:  the database system is starting up
> 9157  2017-06-26 12:55:40.550 PDT FATAL:  the database system is starting up
> ...

> I can live with it, but could we use an escalating wait time so it slows
> back down to once a second after a while?

Sure, what do you think an appropriate behavior would be?

regards, tom lane




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-26 Thread Jeff Janes
On Mon, Jun 26, 2017 at 12:15 PM, Tom Lane  wrote:

> Michael Paquier  writes:
> > On Mon, Jun 26, 2017 at 7:13 AM, Tom Lane  wrote:
> >> The attached proposed patch adjusts pg_ctl to check every 100msec,
> >> instead of every second, for the postmaster to be done starting or
> >> stopping.
>
> >> +#define WAITS_PER_SEC  10  /* should divide 1000000 evenly */
>
> > As a matter of style, you could define 1000000 as well in a variable
> > and refer to the variable for the division.
>
> Good idea, done that way.  (My initial thought was to use USECS_PER_SEC
> from timestamp.h, but that's declared as int64 which would have
> complicated matters, so I just made a new symbol.)
>
> > This also makes failures with 001_stream_rep.pl pop up more easily
> > without a patch applied from the other recent thread, so this patch had better
> > not get in before anything from
> > https://www.postgresql.org/message-id/8962.1498425...@sss.pgh.pa.us.
>
> Check.  I pushed your fix for that first.
>
> Thanks for the review!
>
> regards, tom lane
>



The 10-fold increase in log spam during long PITR recoveries is a bit
unfortunate.

9153  2017-06-26 12:55:40.243 PDT FATAL:  the database system is starting up
9154  2017-06-26 12:55:40.345 PDT FATAL:  the database system is starting up
9156  2017-06-26 12:55:40.447 PDT FATAL:  the database system is starting up
9157  2017-06-26 12:55:40.550 PDT FATAL:  the database system is starting up
...

I can live with it, but could we use an escalating wait time so it slows
back down to once a second after a while?

Cheers,

Jeff


Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-26 Thread Tom Lane
Michael Paquier  writes:
> On Mon, Jun 26, 2017 at 7:13 AM, Tom Lane  wrote:
>> The attached proposed patch adjusts pg_ctl to check every 100msec,
>> instead of every second, for the postmaster to be done starting or
>> stopping.

>> +#define WAITS_PER_SEC  10  /* should divide 1000000 evenly */

> As a matter of style, you could define 1000000 as well in a variable
> and refer to the variable for the division.

Good idea, done that way.  (My initial thought was to use USECS_PER_SEC
from timestamp.h, but that's declared as int64 which would have
complicated matters, so I just made a new symbol.)

> This also makes failures with 001_stream_rep.pl pop up more easily
> without a patch applied from the other recent thread, so this patch had better
> not get in before anything from
> https://www.postgresql.org/message-id/8962.1498425...@sss.pgh.pa.us.

Check.  I pushed your fix for that first.

Thanks for the review!

regards, tom lane




Re: [HACKERS] Reducing pg_ctl's reaction time

2017-06-26 Thread Michael Paquier
On Mon, Jun 26, 2017 at 7:13 AM, Tom Lane  wrote:
> The attached proposed patch adjusts pg_ctl to check every 100msec,
> instead of every second, for the postmaster to be done starting or
> stopping.  This cuts the runtime of the recovery TAP tests from around
> 4m30s to around 3m10s on my machine, a good 30% savings.  I experimented
> with different check frequencies but there doesn't seem to be anything
> to be gained by cutting the delay further (and presumably, it becomes
> counterproductive at some point due to pg_ctl chewing too many cycles).

I see similar improvements, within 2~3% noise, on my laptop with the
non-parallel tests. That's nice.

+#define WAITS_PER_SEC  10  /* should divide 1000000 evenly */
As a matter of style, you could define 1000000 as well in a variable
and refer to the variable for the division.

> Barring objections I'd like to push this soon.

This also makes failures with 001_stream_rep.pl pop up more easily
without a patch applied from the other recent thread, so this patch had better
not get in before anything from
https://www.postgresql.org/message-id/8962.1498425...@sss.pgh.pa.us.
-- 
Michael




[HACKERS] Reducing pg_ctl's reaction time

2017-06-25 Thread Tom Lane
I still have a bee in my bonnet about how slow the recovery TAP tests
are, and especially about how low the CPU usage is while they run,
suggesting that a lot of the wall clock time is being expended on
useless sleeps.  Some analysis I did today found some low-hanging fruit
there: a significant part of the time is being spent in pg_ctl waiting
for postmaster start/stop operations.  It waits in quanta of 1 second,
but the postmaster usually starts or stops in much less than that.
(In these tests, most of the shutdown checkpoints have little to do,
so that in many cases the postmaster stops in just a couple of msec.
Startup isn't very many msec either.)

The attached proposed patch adjusts pg_ctl to check every 100msec,
instead of every second, for the postmaster to be done starting or
stopping.  This cuts the runtime of the recovery TAP tests from around
4m30s to around 3m10s on my machine, a good 30% savings.  I experimented
with different check frequencies but there doesn't seem to be anything
to be gained by cutting the delay further (and presumably, it becomes
counterproductive at some point due to pg_ctl chewing too many cycles).

This change probably doesn't offer much real-world benefit, since few
people start/stop their postmasters often, and shutdown checkpoints are
seldom so cheap on production installations.  Still, it doesn't seem
like it could hurt.  The original choice to use once-per-second checks
was made for hardware a lot slower than what most of us use now; I do
not think it's been revisited since the first implementation of pg_ctl
in 1999 (cf 5b912b089).

Barring objections I'd like to push this soon.

regards, tom lane

diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 82ac62d..1e6cb69 100644
*** a/src/bin/pg_ctl/pg_ctl.c
--- b/src/bin/pg_ctl/pg_ctl.c
*** typedef enum
*** 68,73 
--- 68,75 
  
  #define DEFAULT_WAIT	60
  
+ #define WAITS_PER_SEC	10		/* should divide 1000000 evenly */
+ 
  static bool do_wait = true;
  static int	wait_seconds = DEFAULT_WAIT;
  static bool wait_seconds_arg = false;
*** test_postmaster_connection(pgpid_t pm_pi
*** 531,537 
  
  	connstr[0] = '\0';
  
! 	for (i = 0; i < wait_seconds; i++)
  	{
  		/* Do we need a connection string? */
  		if (connstr[0] == '\0')
--- 533,539 
  
  	connstr[0] = '\0';
  
! 	for (i = 0; i < wait_seconds * WAITS_PER_SEC; i++)
  	{
  		/* Do we need a connection string? */
  		if (connstr[0] == '\0')
*** test_postmaster_connection(pgpid_t pm_pi
*** 701,724 
  #endif
  
  		/* No response, or startup still in process; wait */
! #ifdef WIN32
! 		if (do_checkpoint)
  		{
! 			/*
! 			 * Increment the wait hint by 6 secs (connection timeout + sleep)
! 			 * We must do this to indicate to the SCM that our startup time is
! 			 * changing, otherwise it'll usually send a stop signal after 20
! 			 * seconds, despite incrementing the checkpoint counter.
! 			 */
! 			status.dwWaitHint += 6000;
! 			status.dwCheckPoint++;
! 			SetServiceStatus(hStatus, (LPSERVICE_STATUS) &status);
! 		}
! 		else
  #endif
! 			print_msg(".");
  
! 		pg_usleep(1000000);		/* 1 sec */
  	}
  
  	/* return result of last call to PQping */
--- 703,730 
  #endif
  
  		/* No response, or startup still in process; wait */
! 		if (i % WAITS_PER_SEC == 0)
  		{
! #ifdef WIN32
! 			if (do_checkpoint)
! 			{
! /*
!  * Increment the wait hint by 6 secs (connection timeout +
!  * sleep). We must do this to indicate to the SCM that our
!  * startup time is changing, otherwise it'll usually send a
!  * stop signal after 20 seconds, despite incrementing the
!  * checkpoint counter.
!  */
! status.dwWaitHint += 6000;
! status.dwCheckPoint++;
! SetServiceStatus(hStatus, (LPSERVICE_STATUS) &status);
! 			}
! 			else
  #endif
! print_msg(".");
! 		}
  
! 		pg_usleep(1000000 / WAITS_PER_SEC); /* 1/WAITS_PER_SEC sec */
  	}
  
  	/* return result of last call to PQping */
*** do_stop(void)
*** 998,1009 
  
  		print_msg(_("waiting for server to shut down..."));
  
! 		for (cnt = 0; cnt < wait_seconds; cnt++)
  		{
  			if ((pid = get_pgpid(false)) != 0)
  			{
! print_msg(".");
! pg_usleep(1000000); /* 1 sec */
  			}
  			else
  break;
--- 1004,1016 
  
  		print_msg(_("waiting for server to shut down..."));
  
! 		for (cnt = 0; cnt < wait_seconds * WAITS_PER_SEC; cnt++)
  		{
  			if ((pid = get_pgpid(false)) != 0)
  			{
! if (cnt % WAITS_PER_SEC == 0)
! 	print_msg(".");
! pg_usleep(1000000 / WAITS_PER_SEC); /* 1/WAITS_PER_SEC sec */
  			}
  			else
  break;
*** do_restart(void)
*** 1088,1099 
  
  		/* always wait for restart */
  
! 		for (cnt = 0; cnt < wait_seconds; cnt++)
  		{
  			if ((pid = get_pgpid(false)) != 0)
  			{
! print_msg(".");
! pg_usleep(1000000); /* 1 sec */
  			}
  			else
  break;
--- 1095,1107 
  
  		/* always wait for restart */
  
!