Re: [HACKERS] Reducing pg_ctl's reaction time
On Thu, Jun 29, 2017 at 11:39 AM, Tom Lane wrote:
> Jeff Janes writes:
> > In the now-committed version of this, the 'pg_ctl start' returns
> > successfully as soon as the server reaches a consistent state. Which is OK,
> > except that it does the same thing when hot_standby=off. When
> > hot_standby=off, I would expect it to wait for the end of recovery before
> > exiting with a success code.
>
> Um, won't it be waiting forever with that definition?
>
> regards, tom lane

No, this isn't streaming. It hits the PITR limit (recovery_target_*), or runs out of archived WAL, and then it opens for business.

Cheers, Jeff
Re: [HACKERS] Reducing pg_ctl's reaction time
Jeff Janes writes:
> In the now-committed version of this, the 'pg_ctl start' returns
> successfully as soon as the server reaches a consistent state. Which is OK,
> except that it does the same thing when hot_standby=off. When
> hot_standby=off, I would expect it to wait for the end of recovery before
> exiting with a success code.

Um, won't it be waiting forever with that definition?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reducing pg_ctl's reaction time
On June 29, 2017 10:19:46 AM PDT, Jeff Janes wrote:
>On Tue, Jun 27, 2017 at 11:59 AM, Tom Lane wrote:
>> I wrote:
>> > Andres Freund writes:
>> >> On 2017-06-26 17:38:03 -0400, Tom Lane wrote:
>> >>> Hm. Take that a bit further, and we could drop the connection probes
>> >>> altogether --- just put the whole responsibility on the postmaster to
>> >>> show in the pidfile whether it's ready for connections or not.
>>
>> >> Yea, that seems quite appealing, both from an architectural, simplicity,
>> >> and log noise perspective. I wonder if there's some added reliability by
>> >> the connection probe though? Essentially wondering if it'd be worthwhile
>> >> to keep a single connection test at the end. I'm somewhat disinclined
>> >> though.
>>
>> > I agree --- part of the appeal of this idea is that there could be a net
>> > subtraction of code from pg_ctl. (I think it wouldn't have to link libpq
>> > anymore at all, though maybe I forgot something.) And you get rid of a
>> > bunch of can't-connect failure modes, eg kernel packet filter in the way,
>> > which probably outweighs any hypothetical reliability gain from confirming
>> > the postmaster state the old way.
>>
>> Here's a draft patch for that. I quite like the results --- this seems
>> way simpler and more reliable than what pg_ctl has done up to now.
>> However, it's certainly arguable that this is too much change for an
>> optional post-beta patch.
>
>In the now-committed version of this, the 'pg_ctl start' returns
>successfully as soon as the server reaches a consistent state. Which is OK,
>except that it does the same thing when hot_standby=off. When
>hot_standby=off, I would expect it to wait for the end of recovery before
>exiting with a success code.

I've not looked at the committed version, but earlier versions had code dealing with that difference; essentially doing what you suggest.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [HACKERS] Reducing pg_ctl's reaction time
On Tue, Jun 27, 2017 at 11:59 AM, Tom Lane wrote:
> I wrote:
> > Andres Freund writes:
> >> On 2017-06-26 17:38:03 -0400, Tom Lane wrote:
> >>> Hm. Take that a bit further, and we could drop the connection probes
> >>> altogether --- just put the whole responsibility on the postmaster to
> >>> show in the pidfile whether it's ready for connections or not.
>
> >> Yea, that seems quite appealing, both from an architectural, simplicity,
> >> and log noise perspective. I wonder if there's some added reliability by
> >> the connection probe though? Essentially wondering if it'd be worthwhile
> >> to keep a single connection test at the end. I'm somewhat disinclined
> >> though.
>
> > I agree --- part of the appeal of this idea is that there could be a net
> > subtraction of code from pg_ctl. (I think it wouldn't have to link libpq
> > anymore at all, though maybe I forgot something.) And you get rid of a
> > bunch of can't-connect failure modes, eg kernel packet filter in the way,
> > which probably outweighs any hypothetical reliability gain from confirming
> > the postmaster state the old way.
>
> Here's a draft patch for that. I quite like the results --- this seems
> way simpler and more reliable than what pg_ctl has done up to now.
> However, it's certainly arguable that this is too much change for an
> optional post-beta patch.

In the now-committed version of this, the 'pg_ctl start' returns successfully as soon as the server reaches a consistent state. Which is OK, except that it does the same thing when hot_standby=off. When hot_standby=off, I would expect it to wait for the end of recovery before exiting with a success code.

Cheers, Jeff
Re: [HACKERS] Reducing pg_ctl's reaction time
On Wed, Jun 28, 2017 at 8:31 PM, Tom Lane wrote:
> Andres Freund writes:
>> On 2017-06-27 14:59:18 -0400, Tom Lane wrote:
>>> However, it's certainly arguable that this is too much change for an
>>> optional post-beta patch.
>
>> Yea, I think there's a valid case to be made for that. I'm still
>> inclined to go along with this, it seems we're otherwise just adding
>> complexity we're going to remove in a bit again.
>
> I'm not hearing anyone speaking against doing this now, so I'm going
> to go ahead with it.

+1 for going for it. Besides pg_ctl it would also help cluster management software. In Patroni we basically reimplement pg_ctl to get at the started postmaster pid, to detect a crash during postmaster startup earlier, and to be able to reliably cancel a start. Getting the current state from the pid file sounds better than having a loop poke the server with pg_isready.

To introduce feature creep, I would like to see more detailed states than proposed here. Specifically I'm interested in knowing when PM_WAIT_BACKENDS ends. When we lose quorum we restart PostgreSQL as a standby. We use a regular fast shutdown, but that can take a long time due to the shutdown checkpoint. The leader lease may run out during this time, so we would have to escalate to immediate shutdown or have a watchdog fence the node. If we knew that no user backends are left, we could let the shutdown checkpoint complete at leisure without risk of split brain.

Regards, Ants Aasma
Re: [HACKERS] Reducing pg_ctl's reaction time
Alvaro Herrera writes:
> Tom Lane wrote:
>> So when I removed the miscadmin.h include, I found out that pg_ctl is
>> also relying on PG_BACKEND_VERSIONSTR from that file.
>>
>> There are at least three things we could do here:
>>
>> 1. Give this up as not worth this much trouble.
>>
>> 2. Move PG_BACKEND_VERSIONSTR into pg_config.h to go along with the
>> other version-related macros.
>
> pg_config.h sounds like a decent enough solution. It's a bit strange
> this hasn't come up before, given that that symbol is used more in
> frontend environs than backend.

Right at the moment, what I've done is to stick it into port.h beside the declaration of find_other_exec, since the existing uses are all as parameters of find_other_exec[_or_die]. But maybe that's a bit too expedient.

regards, tom lane
Re: [HACKERS] Reducing pg_ctl's reaction time
Tom Lane wrote:
> Andres Freund writes:
>
> So when I removed the miscadmin.h include, I found out that pg_ctl is
> also relying on PG_BACKEND_VERSIONSTR from that file.
>
> There are at least three things we could do here:
>
> 1. Give this up as not worth this much trouble.
>
> 2. Move PG_BACKEND_VERSIONSTR into pg_config.h to go along with the
> other version-related macros.

pg_config.h sounds like a decent enough solution. It's a bit strange this hasn't come up before, given that that symbol is used more in frontend environs than backend.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Reducing pg_ctl's reaction time
Andres Freund writes:
> On 2017-06-28 13:31:27 -0400, Tom Lane wrote:
>> While looking this over again, I got worried about the fact that pg_ctl
>> is #including "miscadmin.h". That's a pretty low-level backend header
>> and it wouldn't be surprising at all if somebody tried to put stuff in
>> it that wouldn't compile frontend-side. I think we should take the
>> opportunity, as long as we're touching this stuff, to split the #defines
>> that describe the contents of postmaster.pid into a separate header file.
>> Maybe "utils/pidfile.h" ?
>
> Yes, that sounds like a valid concern, and solution.

So when I removed the miscadmin.h include, I found out that pg_ctl is also relying on PG_BACKEND_VERSIONSTR from that file.

There are at least three things we could do here:

1. Give this up as not worth this much trouble.

2. Move PG_BACKEND_VERSIONSTR into pg_config.h to go along with the other version-related macros.

3. Give PG_BACKEND_VERSIONSTR its very own new header file.

Any preferences?

regards, tom lane
Re: [HACKERS] Reducing pg_ctl's reaction time
On 2017-06-28 13:31:27 -0400, Tom Lane wrote:
> I'm not hearing anyone speaking against doing this now, so I'm going
> to go ahead with it.

Cool.

> While looking this over again, I got worried about the fact that pg_ctl
> is #including "miscadmin.h". That's a pretty low-level backend header
> and it wouldn't be surprising at all if somebody tried to put stuff in
> it that wouldn't compile frontend-side. I think we should take the
> opportunity, as long as we're touching this stuff, to split the #defines
> that describe the contents of postmaster.pid into a separate header file.
> Maybe "utils/pidfile.h" ?

Yes, that sounds like a valid concern, and solution.

- Andres
Re: [HACKERS] Reducing pg_ctl's reaction time
Andres Freund writes:
> On 2017-06-27 14:59:18 -0400, Tom Lane wrote:
>> However, it's certainly arguable that this is too much change for an
>> optional post-beta patch.
>
> Yea, I think there's a valid case to be made for that. I'm still
> inclined to go along with this, it seems we're otherwise just adding
> complexity we're going to remove in a bit again.

I'm not hearing anyone speaking against doing this now, so I'm going to go ahead with it.

> The 8-space thing in multiple places is a bit ugly. How about having a
> bunch of constants declared in one place?

While looking this over again, I got worried about the fact that pg_ctl is #including "miscadmin.h". That's a pretty low-level backend header and it wouldn't be surprising at all if somebody tried to put stuff in it that wouldn't compile frontend-side. I think we should take the opportunity, as long as we're touching this stuff, to split the #defines that describe the contents of postmaster.pid into a separate header file. Maybe "utils/pidfile.h" ?

regards, tom lane
Re: [HACKERS] Reducing pg_ctl's reaction time
Andres Freund writes:
> On 2017-06-27 14:59:18 -0400, Tom Lane wrote:
>> If we decide that it has to wait for v11,
>> I'd address Jeff's complaint by hacking the loop behavior in
>> test_postmaster_connection, which'd be ugly but not many lines of code.
>
> Basically increasing the wait time over time?

I was thinking of just dropping back to once-per-second after a couple of seconds, with something along the lines of this in place of the existing sleep at the bottom of the loop:

    if (i >= 2 * WAITS_PER_SEC)
    {
        pg_usleep(USEC_PER_SEC);
        i += WAITS_PER_SEC - 1;
    }
    else
        pg_usleep(USEC_PER_SEC / WAITS_PER_SEC);

> The 8-space thing in multiple places is a bit ugly. How about having a
> bunch of constants declared in one place? Alternatively make it
> something like: status: $c, where $c is a one character code for the
> various states?

Yeah, we could add the string values as macros in miscadmin.h, perhaps. I don't like the single-character idea --- if we do expand the number of things reported this way, that could get restrictive.

regards, tom lane
Re: [HACKERS] Reducing pg_ctl's reaction time
Hi,

On 2017-06-27 14:59:18 -0400, Tom Lane wrote:
> Here's a draft patch for that. I quite like the results --- this seems
> way simpler and more reliable than what pg_ctl has done up to now.

Yea, I like that too.

> However, it's certainly arguable that this is too much change for an
> optional post-beta patch.

Yea, I think there's a valid case to be made for that. I'm still inclined to go along with this, it seems we're otherwise just adding complexity we're going to remove in a bit again.

> If we decide that it has to wait for v11,
> I'd address Jeff's complaint by hacking the loop behavior in
> test_postmaster_connection, which'd be ugly but not many lines of code.

Basically increasing the wait time over time?

> Note that I followed the USE_SYSTEMD patch's lead as to where to report
> postmaster state changes. Arguably, in the standby-server case, we
> should not report the postmaster is ready until we've reached consistency.
> But that would require additional signaling from the startup process
> to the postmaster, so it seems like a separate change if we want it.

I suspect we're going to want to add more states to this over time, but as you say, there's no need to hurry.

> +	/*
> +	 * Report postmaster status in the postmaster.pid file, to allow pg_ctl to
> +	 * see what's happening. Note that all strings written to the status line
> +	 * must be the same length, per comments for AddToDataDirLockFile(). We
> +	 * currently make them all 8 bytes, padding with spaces.
> +	 */
> +	AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "starting");

The 8-space thing in multiple places is a bit ugly. How about having a bunch of constants declared in one place? Alternatively make it something like "status: $c", where $c is a one-character code for the various states?
> @@ -1149,8 +1149,9 @@ TouchSocketLockFiles(void)
>   *
>   * Note: because we don't truncate the file, if we were to rewrite a line
>   * with less data than it had before, there would be garbage after the last
> - * line.  We don't ever actually do that, so not worth adding another kernel
> - * call to cover the possibility.
> + * line.  While we could fix that by adding a truncate call, that would make
> + * the file update non-atomic, which we'd rather avoid.  Therefore, callers
> + * should endeavor never to shorten a line once it's been written.
>   */
>  void
>  AddToDataDirLockFile(int target_line, const char *str)
> @@ -1193,19 +1194,26 @@ AddToDataDirLockFile(int target_line, co
>  	srcptr = srcbuffer;
>  	for (lineno = 1; lineno < target_line; lineno++)
>  	{
> -		if ((srcptr = strchr(srcptr, '\n')) == NULL)
> -		{
> -			elog(LOG, "incomplete data in \"%s\": found only %d newlines while trying to add line %d",
> -				 DIRECTORY_LOCK_FILE, lineno - 1, target_line);
> -			close(fd);
> -			return;
> -		}
> -		srcptr++;
> +		char	   *eol = strchr(srcptr, '\n');
> +
> +		if (eol == NULL)
> +			break;				/* not enough lines in file yet */
> +		srcptr = eol + 1;
>  	}
>  	memcpy(destbuffer, srcbuffer, srcptr - srcbuffer);
>  	destptr = destbuffer + (srcptr - srcbuffer);
>
>  	/*
> +	 * Fill in any missing lines before the target line, in case lines are
> +	 * added to the file out of order.
> +	 */
> +	for (; lineno < target_line; lineno++)
> +	{
> +		if (destptr < destbuffer + sizeof(destbuffer))
> +			*destptr++ = '\n';
> +	}
> +
> +	/*
>  	 * Write or rewrite the target line.
>  	 */
>  	snprintf(destptr, destbuffer + sizeof(destbuffer) - destptr, "%s\n", str);

Not this patch's fault, and shouldn't be changed now, but this is a fairly weird way to manage and update this file.

Greetings,

Andres Freund
Re: [HACKERS] Reducing pg_ctl's reaction time
I wrote:
> Andres Freund writes:
>> On 2017-06-26 17:38:03 -0400, Tom Lane wrote:
>>> Hm. Take that a bit further, and we could drop the connection probes
>>> altogether --- just put the whole responsibility on the postmaster to
>>> show in the pidfile whether it's ready for connections or not.
>
>> Yea, that seems quite appealing, both from an architectural, simplicity,
>> and log noise perspective. I wonder if there's some added reliability by
>> the connection probe though? Essentially wondering if it'd be worthwhile
>> to keep a single connection test at the end. I'm somewhat disinclined
>> though.
>
> I agree --- part of the appeal of this idea is that there could be a net
> subtraction of code from pg_ctl. (I think it wouldn't have to link libpq
> anymore at all, though maybe I forgot something.) And you get rid of a
> bunch of can't-connect failure modes, eg kernel packet filter in the way,
> which probably outweighs any hypothetical reliability gain from confirming
> the postmaster state the old way.

Here's a draft patch for that. I quite like the results --- this seems way simpler and more reliable than what pg_ctl has done up to now. However, it's certainly arguable that this is too much change for an optional post-beta patch. If we decide that it has to wait for v11, I'd address Jeff's complaint by hacking the loop behavior in test_postmaster_connection, which'd be ugly but not many lines of code.

Note that I followed the USE_SYSTEMD patch's lead as to where to report postmaster state changes. Arguably, in the standby-server case, we should not report the postmaster is ready until we've reached consistency. But that would require additional signaling from the startup process to the postmaster, so it seems like a separate change if we want it.

Thoughts?
regards, tom lane

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 2874f63..38f534f 100644
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
*** PostmasterMain(int argc, char *argv[])
*** 1341,1346 ****
--- 1341,1354 ----
  #endif

+ 	/*
+ 	 * Report postmaster status in the postmaster.pid file, to allow pg_ctl to
+ 	 * see what's happening. Note that all strings written to the status line
+ 	 * must be the same length, per comments for AddToDataDirLockFile(). We
+ 	 * currently make them all 8 bytes, padding with spaces.
+ 	 */
+ 	AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "starting");
+
  	/*
  	 * We're ready to rock and roll...
  	 */
  	StartupPID = StartupDataBase();
*** pmdie(SIGNAL_ARGS)
*** 2608,2613 ****
--- 2616,2624 ----
  			Shutdown = SmartShutdown;
  			ereport(LOG,
  					(errmsg("received smart shutdown request")));
+
+ 			/* Report status */
+ 			AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "stopping");
  #ifdef USE_SYSTEMD
  			sd_notify(0, "STOPPING=1");
  #endif
*** pmdie(SIGNAL_ARGS)
*** 2663,2668 ****
--- 2674,2682 ----
  			Shutdown = FastShutdown;
  			ereport(LOG,
  					(errmsg("received fast shutdown request")));
+
+ 			/* Report status */
+ 			AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "stopping");
  #ifdef USE_SYSTEMD
  			sd_notify(0, "STOPPING=1");
  #endif
*** pmdie(SIGNAL_ARGS)
*** 2727,2732 ****
--- 2741,2749 ----
  			Shutdown = ImmediateShutdown;
  			ereport(LOG,
  					(errmsg("received immediate shutdown request")));
+
+ 			/* Report status */
+ 			AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "stopping");
  #ifdef USE_SYSTEMD
  			sd_notify(0, "STOPPING=1");
  #endif
*** reaper(SIGNAL_ARGS)
*** 2872,2877 ****
--- 2889,2896 ----
  		ereport(LOG,
  				(errmsg("database system is ready to accept connections")));

+ 		/* Report status */
+ 		AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "ready   ");
  #ifdef USE_SYSTEMD
  		sd_notify(0, "READY=1");
  #endif
*** sigusr1_handler(SIGNAL_ARGS)
*** 5005,5014 ****
  		if (XLogArchivingAlways())
  			PgArchPID = pgarch_start();

! #ifdef USE_SYSTEMD
  		if (!EnableHotStandby)
  			sd_notify(0, "READY=1");
  #endif

  		pmState = PM_RECOVERY;
  	}
--- 5024,5041 ----
  		if (XLogArchivingAlways())
  			PgArchPID = pgarch_start();

! 		/*
! 		 * If we aren't planning to enter hot standby mode later, treat
! 		 * RECOVERY_STARTED as meaning we're out of startup, and report status
! 		 * accordingly.
! 		 */
  		if (!EnableHotStandby)
+ 		{
+ 			AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "standby ");
  #ifdef USE_SYSTEMD
  			sd_notify(0, "READY=1");
  #endif
+ 		}

  		pmState = PM_RECOVERY;
  	}
*** sigusr1_handler(SIGNAL_ARGS)
*** 5024,5029 ****
--- 5051,5058 ----
  		ereport(LOG,
  				(errmsg("database system is ready to accept read only connections")));

+ 		/* Report status */
+ 		AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "ready   ");
  #ifdef USE_SYSTEMD
  		sd_notify(0, "READY=1");
  #endif
diff --git
Re: [HACKERS] Reducing pg_ctl's reaction time
Andres Freund writes:
> On 2017-06-26 17:38:03 -0400, Tom Lane wrote:
>> Hm. Take that a bit further, and we could drop the connection probes
>> altogether --- just put the whole responsibility on the postmaster to
>> show in the pidfile whether it's ready for connections or not.
>
> Yea, that seems quite appealing, both from an architectural, simplicity,
> and log noise perspective. I wonder if there's some added reliability by
> the connection probe though? Essentially wondering if it'd be worthwhile
> to keep a single connection test at the end. I'm somewhat disinclined
> though.

I agree --- part of the appeal of this idea is that there could be a net subtraction of code from pg_ctl. (I think it wouldn't have to link libpq anymore at all, though maybe I forgot something.) And you get rid of a bunch of can't-connect failure modes, eg kernel packet filter in the way, which probably outweighs any hypothetical reliability gain from confirming the postmaster state the old way.

This would mean that v10 pg_ctl could not be used to start/stop older postmasters, which is flexibility we tried to preserve in the past. But I see that we already gave that up in quite a few code paths because of their attempts to read pg_control, so I think it's a concern we can ignore.

I'll draft something up later.

regards, tom lane
Re: [HACKERS] Reducing pg_ctl's reaction time
On 2017-06-26 17:38:03 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2017-06-26 17:30:30 -0400, Tom Lane wrote:
> >> No, I don't like that at all. Has race conditions against updates
> >> coming from the startup process.
>
> > You'd obviously have to take the appropriate locks. I think the issue
> > here is less race conditions, and more that architecturally we'd
> > interact with shmem too much.
>
> Uh, we are *not* taking any locks in the postmaster.

I'm not sure why you're 'Uh'ing, when my point pretty much is that we do not want to do so?

> Hm. Take that a bit further, and we could drop the connection probes
> altogether --- just put the whole responsibility on the postmaster to
> show in the pidfile whether it's ready for connections or not.

Yea, that seems quite appealing, both from an architectural, simplicity, and log noise perspective. I wonder if there's some added reliability by the connection probe though? Essentially wondering if it'd be worthwhile to keep a single connection test at the end. I'm somewhat disinclined though.

- Andres
Re: [HACKERS] Reducing pg_ctl's reaction time
Andres Freund writes:
> On 2017-06-26 17:30:30 -0400, Tom Lane wrote:
>> No, I don't like that at all. Has race conditions against updates
>> coming from the startup process.
>
> You'd obviously have to take the appropriate locks. I think the issue
> here is less race conditions, and more that architecturally we'd
> interact with shmem too much.

Uh, we are *not* taking any locks in the postmaster.

>> Yeah, that would be a different way to go at it. The postmaster would
>> probably just write the state of the hot_standby GUC to the file, and
>> pg_ctl would have to infer things from there.
>
> I'd actually say we should just mirror the existing
>
> #ifdef USE_SYSTEMD
> 	if (!EnableHotStandby)
> 		sd_notify(0, "READY=1");
> #endif
>
> with corresponding pidfile updates - doesn't really seem necessary for
> pg_ctl to do more?

Hm. Take that a bit further, and we could drop the connection probes altogether --- just put the whole responsibility on the postmaster to show in the pidfile whether it's ready for connections or not.

regards, tom lane
Re: [HACKERS] Reducing pg_ctl's reaction time
On 2017-06-26 17:30:30 -0400, Tom Lane wrote:
> Andres Freund writes:
> > It'd be quite possible to address the race-condition by moving the
> > updating of the control file to postmaster, to the
> > CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) block. That'd require
> > updating the control file from postmaster, which'd be somewhat ugly.
>
> No, I don't like that at all. Has race conditions against updates
> coming from the startup process.

You'd obviously have to take the appropriate locks. I think the issue here is less race conditions, and more that architecturally we'd interact with shmem too much.

> > Perhaps that indicates that field shouldn't be in pg_control, but in the
> > pid file?
>
> Yeah, that would be a different way to go at it. The postmaster would
> probably just write the state of the hot_standby GUC to the file, and
> pg_ctl would have to infer things from there.

I'd actually say we should just mirror the existing

#ifdef USE_SYSTEMD
	if (!EnableHotStandby)
		sd_notify(0, "READY=1");
#endif

with corresponding pidfile updates - doesn't really seem necessary for pg_ctl to do more?

Greetings,

Andres Freund
Re: [HACKERS] Reducing pg_ctl's reaction time
Andres Freund writes:
> It'd be quite possible to address the race-condition by moving the
> updating of the control file to postmaster, to the
> CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) block. That'd require
> updating the control file from postmaster, which'd be somewhat ugly.

No, I don't like that at all. Has race conditions against updates coming from the startup process.

> Perhaps that indicates that field shouldn't be in pg_control, but in the
> pid file?

Yeah, that would be a different way to go at it. The postmaster would probably just write the state of the hot_standby GUC to the file, and pg_ctl would have to infer things from there.

regards, tom lane
Re: [HACKERS] Reducing pg_ctl's reaction time
Hi,

On 2017-06-26 16:49:07 -0400, Tom Lane wrote:
> Andres Freund writes:
> > Arguably we could and should improve the logic when the server has
> > started, right now it's pretty messy because we never treat a standby as
> > up if hot_standby is disabled...
>
> True. If you could tell the difference between "HS disabled" and "HS not
> enabled yet" from pg_control, that would make pg_ctl's behavior with
> cold-standby servers much cleaner. Maybe it *is* worth messing with the
> contents of pg_control at this late hour.

I'm +0.5.

> My inclination for the least invasive fix is to leave the DBState
> enum alone and add a separate hot-standby state field with three
> values (disabled/not-yet-enabled/enabled).

Yea, that seems sane.

> Then pg_ctl would start
> probing the postmaster when it saw either DB_IN_PRODUCTION DBState
> or hot-standby-enabled. (It'd almost not have to probe the postmaster
> at all, except there's a race condition that the startup process
> will probably change the field a little before the postmaster gets
> the word to open the gates.) On the other hand, if it saw
> DB_IN_ARCHIVE_RECOVERY with hot standby disabled, it'd stop waiting.

It'd be quite possible to address the race-condition by moving the updating of the control file to postmaster, to the CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) block. That'd require updating the control file from postmaster, which'd be somewhat ugly. Perhaps that indicates that field shouldn't be in pg_control, but in the pid file?

Greetings,

Andres Freund
Re: [HACKERS] Reducing pg_ctl's reaction time
Andres Freund writes:
> Arguably we could and should improve the logic when the server has
> started, right now it's pretty messy because we never treat a standby as
> up if hot_standby is disabled...

True. If you could tell the difference between "HS disabled" and "HS not enabled yet" from pg_control, that would make pg_ctl's behavior with cold-standby servers much cleaner. Maybe it *is* worth messing with the contents of pg_control at this late hour.

My inclination for the least invasive fix is to leave the DBState enum alone and add a separate hot-standby state field with three values (disabled/not-yet-enabled/enabled). Then pg_ctl would start probing the postmaster when it saw either DB_IN_PRODUCTION DBState or hot-standby-enabled. (It'd almost not have to probe the postmaster at all, except there's a race condition that the startup process will probably change the field a little before the postmaster gets the word to open the gates.) On the other hand, if it saw DB_IN_ARCHIVE_RECOVERY with hot standby disabled, it'd stop waiting.

Any objections to that design sketch? Do we need to distinguish between master and slave servers in the when-to-stop-waiting logic?

regards, tom lane
Re: [HACKERS] Reducing pg_ctl's reaction time
I wrote:
> Andres Freund writes:
>> It'd not be unreasonable to check pg_control first, and only after that
>> indicates readiness check via the protocol.
>
> Hm, that's a thought. The problem here isn't the frequency of checks,
> but the log spam.

Actually, that wouldn't help much as things stand, because you can't tell from pg_control whether hot standby is active. Assuming that we want "pg_ctl start" to come back as soon as connections are allowed, it'd have to start probing when it sees DB_IN_ARCHIVE_RECOVERY, which means Jeff still has a problem with long recovery sessions.

We could maybe address that by changing the set of states in pg_control (or, perhaps simpler, adding a "hot standby active" flag there). That might have wider consequences than we really want to deal with post-beta1, though.

regards, tom lane
Re: [HACKERS] Reducing pg_ctl's reaction time
On 2017-06-26 16:26:00 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2017-06-26 16:19:16 -0400, Tom Lane wrote:
> >> Sure, what do you think an appropriate behavior would be?

> > It'd not be unreasonable to check pg_control first, and only after that
> > indicates readiness check via the protocol.

> Hm, that's a thought.  The problem here isn't the frequency of checks,
> but the log spam.

Right.  I think to deal with hot standby we'd probably have to add new
state to the control file, however.  We don't just want to treat the
server as ready once DB_IN_PRODUCTION is reached.

Arguably we could and should improve the logic when the server has
started; right now it's pretty messy because we never treat a standby as
up if hot_standby is disabled...

- Andres
Re: [HACKERS] Reducing pg_ctl's reaction time
Andres Freund writes:
> On 2017-06-26 16:19:16 -0400, Tom Lane wrote:
>> Sure, what do you think an appropriate behavior would be?

> It'd not be unreasonable to check pg_control first, and only after that
> indicates readiness check via the protocol.

Hm, that's a thought.  The problem here isn't the frequency of checks,
but the log spam.

> Doesn't quite seem like something backpatchable tho.

I didn't back-patch the pg_ctl change anyway, so that's no issue.

			regards, tom lane
Re: [HACKERS] Reducing pg_ctl's reaction time
On 2017-06-26 16:19:16 -0400, Tom Lane wrote:
> Jeff Janes writes:
> > The 10-fold increase in log spam during long PITR recoveries is a bit
> > unfortunate.

> > 9153 2017-06-26 12:55:40.243 PDT FATAL:  the database system is starting up
> > 9154 2017-06-26 12:55:40.345 PDT FATAL:  the database system is starting up
> > 9156 2017-06-26 12:55:40.447 PDT FATAL:  the database system is starting up
> > 9157 2017-06-26 12:55:40.550 PDT FATAL:  the database system is starting up
> > ...

> > I can live with it, but could we use an escalating wait time so it slows
> > back down to once a second after a while?

> Sure, what do you think an appropriate behavior would be?

It'd not be unreasonable to check pg_control first, and only after that
indicates readiness check via the protocol.

Doesn't quite seem like something backpatchable tho.

- Andres
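Andres's two-phase idea can be sketched as follows. This is an illustrative reduction, not pg_ctl code: `read_state` and `probe` are invented stand-in callbacks for the cheap pg_control read and the expensive PQping-style connection attempt that generates the "database system is starting up" log spam.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the control file's DBState */
typedef enum DBState
{
	DB_STARTUP,
	DB_IN_ARCHIVE_RECOVERY,
	DB_IN_PRODUCTION
} DBState;

/* As the thread notes, plain DBState is not enough for hot standby;
 * in this sketch only DB_IN_PRODUCTION counts as worth probing. */
static bool
worth_probing(DBState state)
{
	return state == DB_IN_PRODUCTION;
}

/*
 * Wait-loop skeleton: consult the spam-free control file on every
 * iteration, and only attempt a protocol-level probe once it reports
 * a promising state.  Returns true on success, false on timeout.
 * (The real loop would sleep between iterations; omitted here.)
 */
static bool
wait_for_ready(DBState (*read_state)(void), bool (*probe)(void), int max_tries)
{
	for (int i = 0; i < max_tries; i++)
	{
		if (worth_probing(read_state()) && probe())
			return true;
	}
	return false;
}

/* Tiny simulation: the server reaches production on the third check. */
static int	n_reads = 0;

static DBState
fake_read_state(void)
{
	return ++n_reads >= 3 ? DB_IN_PRODUCTION : DB_STARTUP;
}

static bool
fake_probe(void)
{
	return true;
}
```

The point of the split is that the control-file read leaves no trace in the server log, so the 100 msec cadence costs nothing in spam until the server is actually close to ready.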
Re: [HACKERS] Reducing pg_ctl's reaction time
Jeff Janes writes:
> The 10-fold increase in log spam during long PITR recoveries is a bit
> unfortunate.

> 9153 2017-06-26 12:55:40.243 PDT FATAL:  the database system is starting up
> 9154 2017-06-26 12:55:40.345 PDT FATAL:  the database system is starting up
> 9156 2017-06-26 12:55:40.447 PDT FATAL:  the database system is starting up
> 9157 2017-06-26 12:55:40.550 PDT FATAL:  the database system is starting up
> ...

> I can live with it, but could we use an escalating wait time so it slows
> back down to once a second after a while?

Sure, what do you think an appropriate behavior would be?

			regards, tom lane
Re: [HACKERS] Reducing pg_ctl's reaction time
On Mon, Jun 26, 2017 at 12:15 PM, Tom Lane wrote:
> Michael Paquier writes:
> > On Mon, Jun 26, 2017 at 7:13 AM, Tom Lane wrote:
> >> The attached proposed patch adjusts pg_ctl to check every 100msec,
> >> instead of every second, for the postmaster to be done starting or
> >> stopping.
>
> >> +#define WAITS_PER_SEC 10	/* should divide 1000000 evenly */
>
> > As a matter of style, you could define 1000000 as well in a variable
> > and refer to the variable for the division.
>
> Good idea, done that way.  (My initial thought was to use USECS_PER_SEC
> from timestamp.h, but that's declared as int64 which would have
> complicated matters, so I just made a new symbol.)
>
> > This also makes failures with 001_stream_rep.pl pop up more easily
> > without a patch applied from the other recent thread, so this patch had
> > better not get in before anything from
> > https://www.postgresql.org/message-id/8962.1498425...@sss.pgh.pa.us.
>
> Check.  I pushed your fix for that first.
>
> Thanks for the review!
>
>			regards, tom lane

The 10-fold increase in log spam during long PITR recoveries is a bit
unfortunate.

9153 2017-06-26 12:55:40.243 PDT FATAL:  the database system is starting up
9154 2017-06-26 12:55:40.345 PDT FATAL:  the database system is starting up
9156 2017-06-26 12:55:40.447 PDT FATAL:  the database system is starting up
9157 2017-06-26 12:55:40.550 PDT FATAL:  the database system is starting up
...

I can live with it, but could we use an escalating wait time so it slows
back down to once a second after a while?

Cheers,

Jeff
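One way to read Jeff's "escalating wait time" is geometric backoff: keep the fast 100 msec cadence for the first few retries, then stretch back toward the old one-second pace. A minimal sketch; the doubling policy and the names here are illustrative, not from any committed patch.

```c
#include <assert.h>

#define START_DELAY_USEC	100000L		/* 100 msec, the patch's new cadence */
#define MAX_DELAY_USEC		1000000L	/* cap at the old 1-second cadence */

/*
 * Delay before retry number 'iteration' (0-based): double each time,
 * capped, so a long PITR recovery soon falls back to one probe per
 * second and the log spam returns to its pre-patch rate.
 */
static long
retry_delay_usec(int iteration)
{
	long		delay = START_DELAY_USEC;

	while (iteration-- > 0)
	{
		delay *= 2;
		if (delay >= MAX_DELAY_USEC)
			return MAX_DELAY_USEC;	/* early return also avoids overflow */
	}
	return delay;
}
```

With these constants the loop is back to once-a-second probing after four retries, roughly 1.5 seconds in, so fast startups still benefit while long recoveries see almost no extra log traffic.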
Re: [HACKERS] Reducing pg_ctl's reaction time
Michael Paquier writes:
> On Mon, Jun 26, 2017 at 7:13 AM, Tom Lane wrote:
>> The attached proposed patch adjusts pg_ctl to check every 100msec,
>> instead of every second, for the postmaster to be done starting or
>> stopping.

>> +#define WAITS_PER_SEC 10	/* should divide 1000000 evenly */

> As a matter of style, you could define 1000000 as well in a variable
> and refer to the variable for the division.

Good idea, done that way.  (My initial thought was to use USECS_PER_SEC
from timestamp.h, but that's declared as int64 which would have
complicated matters, so I just made a new symbol.)

> This also makes failures with 001_stream_rep.pl pop up more easily
> without a patch applied from the other recent thread, so this patch had
> better not get in before anything from
> https://www.postgresql.org/message-id/8962.1498425...@sss.pgh.pa.us.

Check.  I pushed your fix for that first.

Thanks for the review!

			regards, tom lane
Re: [HACKERS] Reducing pg_ctl's reaction time
On Mon, Jun 26, 2017 at 7:13 AM, Tom Lane wrote:
> The attached proposed patch adjusts pg_ctl to check every 100msec,
> instead of every second, for the postmaster to be done starting or
> stopping.  This cuts the runtime of the recovery TAP tests from around
> 4m30s to around 3m10s on my machine, a good 30% savings.  I experimented
> with different check frequencies but there doesn't seem to be anything
> to be gained by cutting the delay further (and presumably, it becomes
> counterproductive at some point due to pg_ctl chewing too many cycles).

I see similar improvements, within 2~3% of noise, on my laptop with
non-parallel tests.  That's nice.

+#define WAITS_PER_SEC 10	/* should divide 1000000 evenly */

As a matter of style, you could define 1000000 as well in a variable
and refer to the variable for the division.

> Barring objections I'd like to push this soon.

This also makes failures with 001_stream_rep.pl pop up more easily
without a patch applied from the other recent thread, so this patch had
better not get in before anything from
https://www.postgresql.org/message-id/8962.1498425...@sss.pgh.pa.us.
--
Michael
[HACKERS] Reducing pg_ctl's reaction time
I still have a bee in my bonnet about how slow the recovery TAP tests are,
and especially about how low the CPU usage is while they run, suggesting
that a lot of the wall clock time is being expended on useless sleeps.

Some analysis I did today found some low-hanging fruit: a significant part
of the time is being spent in pg_ctl waiting for postmaster start/stop
operations.  It waits in quanta of 1 second, but the postmaster usually
starts or stops in much less than that.  (In these tests, most of the
shutdown checkpoints have little to do, so that in many cases the
postmaster stops in just a couple of msec.  Startup isn't very many msec
either.)

The attached proposed patch adjusts pg_ctl to check every 100msec, instead
of every second, for the postmaster to be done starting or stopping.  This
cuts the runtime of the recovery TAP tests from around 4m30s to around
3m10s on my machine, a good 30% savings.  I experimented with different
check frequencies but there doesn't seem to be anything to be gained by
cutting the delay further (and presumably, it becomes counterproductive at
some point due to pg_ctl chewing too many cycles).

This change probably doesn't offer much real-world benefit, since few
people start/stop their postmasters often, and shutdown checkpoints are
seldom so cheap on production installations.  Still, it doesn't seem like
it could hurt.  The original choice to use once-per-second checks was made
for hardware a lot slower than what most of us use now; I do not think
it's been revisited since the first implementation of pg_ctl in 1999
(cf 5b912b089).

Barring objections I'd like to push this soon.
			regards, tom lane

diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 82ac62d..1e6cb69 100644
*** a/src/bin/pg_ctl/pg_ctl.c
--- b/src/bin/pg_ctl/pg_ctl.c
*************** typedef enum
*** 68,73 ****
--- 68,75 ----

  #define DEFAULT_WAIT	60

+ #define WAITS_PER_SEC	10		/* should divide 1000000 evenly */
+
  static bool do_wait = true;
  static int	wait_seconds = DEFAULT_WAIT;
  static bool wait_seconds_arg = false;
*************** test_postmaster_connection(pgpid_t pm_pi
*** 531,537 ****

  	connstr[0] = '\0';

! 	for (i = 0; i < wait_seconds; i++)
  	{
  		/* Do we need a connection string? */
  		if (connstr[0] == '\0')
--- 533,539 ----

  	connstr[0] = '\0';

! 	for (i = 0; i < wait_seconds * WAITS_PER_SEC; i++)
  	{
  		/* Do we need a connection string? */
  		if (connstr[0] == '\0')
*************** test_postmaster_connection(pgpid_t pm_pi
*** 701,724 ****
  #endif

  		/* No response, or startup still in process; wait */
! #ifdef WIN32
! 		if (do_checkpoint)
  		{
! 			/*
! 			 * Increment the wait hint by 6 secs (connection timeout + sleep)
! 			 * We must do this to indicate to the SCM that our startup time is
! 			 * changing, otherwise it'll usually send a stop signal after 20
! 			 * seconds, despite incrementing the checkpoint counter.
! 			 */
! 			status.dwWaitHint += 6000;
! 			status.dwCheckPoint++;
! 			SetServiceStatus(hStatus, (LPSERVICE_STATUS) &status);
! 		}
! 		else
  #endif
! 			print_msg(".");
! 		pg_usleep(1000000);		/* 1 sec */
  	}

  	/* return result of last call to PQping */
--- 703,730 ----
  #endif

  		/* No response, or startup still in process; wait */
! 		if (i % WAITS_PER_SEC == 0)
  		{
! #ifdef WIN32
! 			if (do_checkpoint)
! 			{
! 				/*
! 				 * Increment the wait hint by 6 secs (connection timeout +
! 				 * sleep).  We must do this to indicate to the SCM that our
! 				 * startup time is changing, otherwise it'll usually send a
! 				 * stop signal after 20 seconds, despite incrementing the
! 				 * checkpoint counter.
! 				 */
! 				status.dwWaitHint += 6000;
! 				status.dwCheckPoint++;
! 				SetServiceStatus(hStatus, (LPSERVICE_STATUS) &status);
! 			}
! 			else
  #endif
! 				print_msg(".");
! 		}
!
  		pg_usleep(1000000 / WAITS_PER_SEC); /* 1/WAITS_PER_SEC sec */
  	}

  	/* return result of last call to PQping */
*************** do_stop(void)
*** 998,1009 ****

  	print_msg(_("waiting for server to shut down..."));

! 	for (cnt = 0; cnt < wait_seconds; cnt++)
  	{
  		if ((pid = get_pgpid(false)) != 0)
  		{
! 			print_msg(".");
! 			pg_usleep(1000000); /* 1 sec */
  		}
  		else
  			break;
--- 1004,1016 ----

  	print_msg(_("waiting for server to shut down..."));

! 	for (cnt = 0; cnt < wait_seconds * WAITS_PER_SEC; cnt++)
  	{
  		if ((pid = get_pgpid(false)) != 0)
  		{
! 			if (cnt % WAITS_PER_SEC == 0)
! 				print_msg(".");
! 			pg_usleep(1000000 / WAITS_PER_SEC); /* 1/WAITS_PER_SEC sec */
  		}
  		else
  			break;
*************** do_restart(void)
*** 1088,1099 ****

  	/* always wait for restart */
! 	for (cnt = 0; cnt < wait_seconds; cnt++)
  	{
  		if ((pid = get_pgpid(false)) != 0)
  		{
! 			print_msg(".");
! 			pg_usleep(1000000); /* 1 sec */
  		}
  		else
  			break;
--- 1095,1107 ----
!
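Distilled from the patch above, the cadence change is: iterate WAITS_PER_SEC times as often, sleep 1/WAITS_PER_SEC second per pass, and print a progress dot only once per whole second so the visible output is unchanged. A runnable reduction of that loop (the sleeps and actual checks are elided as comments):

```c
#include <assert.h>

#define WAITS_PER_SEC 10		/* should divide 1000000 evenly */

/*
 * Count the dots a wait_seconds-long timeout would print: one per
 * second, exactly as before the patch, despite 10x more iterations.
 */
static int
count_progress_dots(int wait_seconds)
{
	int			dots = 0;

	for (int i = 0; i < wait_seconds * WAITS_PER_SEC; i++)
	{
		if (i % WAITS_PER_SEC == 0)
			dots++;				/* print_msg(".") in pg_ctl */
		/* pg_usleep(1000000 / WAITS_PER_SEC) here in pg_ctl */
	}
	return dots;
}
```

The `i % WAITS_PER_SEC == 0` guard is also what keeps the Windows service-checkpoint bookkeeping on its old once-per-second schedule, since the SCM wait hint math assumes that cadence.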