Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Saturday, 1 April 2017, at 17:29 +0200, Magnus Hagander wrote:
> I've applied a backpatch to 9.4. Prior to that pretty much the entire
> patch is a conflict, so it would need a full rewrite.

Thanks!

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Fri, Mar 31, 2017 at 8:59 AM, Magnus Hagander wrote:
> On Wed, Mar 29, 2017 at 1:05 PM, Michael Banck wrote:
>> Hi,
>>
>> On Monday, 27 February 2017, at 16:20 +0100, Magnus Hagander wrote:
>> > On Sun, Feb 26, 2017 at 9:59 PM, Tom Lane wrote:
>> > > Is there an argument for back-patching this?
>> >
>> > Seems you were typing that at the same time as we did.
>> >
>> > I'm considering it, but not swayed in either direction. Should I take
>> > your comment as a vote that we should back-patch it?
>>
>> I've checked back into this thread, and there seems to be a +1 from Tom
>> and a +(0.5-1) from Simon for backpatching, and no obvious -1s. Did you
>> decide against it in the end, or is this still an open item?
>
> No, I plan to work on it, so it's still an open item. I've been backlogged
> with other things, but I will try to get to it soon.
>
> (This also includes considering Jeff's note)

I've applied a backpatch to 9.4. Prior to that pretty much the entire patch
is a conflict, so it would need a full rewrite.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Mon, Feb 27, 2017 at 7:46 PM, Jeff Janes wrote:
> On Sun, Feb 26, 2017 at 12:32 PM, Magnus Hagander wrote:
>> On Sun, Feb 26, 2017 at 8:27 PM, Michael Banck wrote:
>>> Hi,
>>>
>>> On Tuesday, 14 February 2017, at 18:18 -0500, Robert Haas wrote:
>>> > On Tue, Feb 14, 2017 at 4:06 PM, Alvaro Herrera wrote:
>>> > > I'd rather have a --quiet mode instead. If you're running it by
>>> > > hand, you're likely to omit the switch, whereas when writing the
>>> > > cron job you're going to notice lack of switch even before you let
>>> > > the job run once.
>>> >
>>> > Well, that might've been a better way to design it, but changing it
>>> > now would break backward compatibility and I'm not really sure that's
>>> > a good idea. Even if it is, it's a separate concern from whether or
>>> > not in the less-quiet mode we should point out that we're waiting for
>>> > a checkpoint on the server side.
>>>
>>> ISTM the consensus is that there should be no output in regular mode,
>>> but a message should be displayed in verbose and progress mode.
>>>
>>> So I went forth and also added a message in progress mode (unless
>>> verbose messages are requested anyway).
>>>
>>> Regarding the documentation, I tried to clarify the difference between
>>> the checkpoint types a bit more, but I think any further action is
>>> probably a larger rewrite of the documentation on this topic.
>>>
>>> So attached are two patches, I've split it up into the documentation
>>> and the code output part. I'll add it as one commitfest entry in the
>>> "Clients" section though, as it's not really a big patch, unless
>>> somebody thinks it should have a second entry in "Documentation"?
>>
>> Agreed, and applied as one patch. Except I noticed you also fixed a
>> couple of entries which were missing the progname in the messages -- I
>> broke those out to a separate patch instead.
>>
>> Made a small change to "using as much I/O as available" rather than "as
>> possible", which I think is a better wording, along with the change of the
>> idle wording I suggested before. (but feel free to point it out to me if
>> that's wrong).
>
> Should the below fprintf end in a \r rather than a \n, so that the
> progress message gets over-written once the checkpoint is done and we have
> moved on?
>
>     if (showprogress && !verbose)
>         fprintf(stderr, "waiting for checkpoint\n");
>
> That would seem more in keeping with how the other progress messages
> operate.

Agreed, that makes more sense. I've pushed a patch that does this.

--
Magnus Hagander
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Wed, Mar 29, 2017 at 1:05 PM, Michael Banck wrote:
> Hi,
>
> On Monday, 27 February 2017, at 16:20 +0100, Magnus Hagander wrote:
> > On Sun, Feb 26, 2017 at 9:59 PM, Tom Lane wrote:
> > > Is there an argument for back-patching this?
> >
> > Seems you were typing that at the same time as we did.
> >
> > I'm considering it, but not swayed in either direction. Should I take
> > your comment as a vote that we should back-patch it?
>
> I've checked back into this thread, and there seems to be a +1 from Tom
> and a +(0.5-1) from Simon for backpatching, and no obvious -1s. Did you
> decide against it in the end, or is this still an open item?

No, I plan to work on it, so it's still an open item. I've been backlogged
with other things, but I will try to get to it soon.

(This also includes considering Jeff's note)

--
Magnus Hagander
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
Hi,

On Monday, 27 February 2017, at 16:20 +0100, Magnus Hagander wrote:
> On Sun, Feb 26, 2017 at 9:59 PM, Tom Lane wrote:
> > Is there an argument for back-patching this?
>
> Seems you were typing that at the same time as we did.
>
> I'm considering it, but not swayed in either direction. Should I take
> your comment as a vote that we should back-patch it?

I've checked back into this thread, and there seems to be a +1 from Tom
and a +(0.5-1) from Simon for backpatching, and no obvious -1s. Did you
decide against it in the end, or is this still an open item?

Michael
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Sun, Feb 26, 2017 at 12:32 PM, Magnus Hagander wrote:
> On Sun, Feb 26, 2017 at 8:27 PM, Michael Banck wrote:
>> Hi,
>>
>> On Tuesday, 14 February 2017, at 18:18 -0500, Robert Haas wrote:
>> > On Tue, Feb 14, 2017 at 4:06 PM, Alvaro Herrera wrote:
>> > > I'd rather have a --quiet mode instead. If you're running it by hand,
>> > > you're likely to omit the switch, whereas when writing the cron job
>> > > you're going to notice lack of switch even before you let the job run
>> > > once.
>> >
>> > Well, that might've been a better way to design it, but changing it
>> > now would break backward compatibility and I'm not really sure that's
>> > a good idea. Even if it is, it's a separate concern from whether or
>> > not in the less-quiet mode we should point out that we're waiting for
>> > a checkpoint on the server side.
>>
>> ISTM the consensus is that there should be no output in regular mode,
>> but a message should be displayed in verbose and progress mode.
>>
>> So I went forth and also added a message in progress mode (unless
>> verbose messages are requested anyway).
>>
>> Regarding the documentation, I tried to clarify the difference between
>> the checkpoint types a bit more, but I think any further action is
>> probably a larger rewrite of the documentation on this topic.
>>
>> So attached are two patches, I've split it up into the documentation and
>> the code output part. I'll add it as one commitfest entry in the
>> "Clients" section though, as it's not really a big patch, unless
>> somebody thinks it should have a second entry in "Documentation"?
>
> Agreed, and applied as one patch. Except I noticed you also fixed a couple
> of entries which were missing the progname in the messages -- I broke those
> out to a separate patch instead.
>
> Made a small change to "using as much I/O as available" rather than "as
> possible", which I think is a better wording, along with the change of the
> idle wording I suggested before. (but feel free to point it out to me if
> that's wrong).

Should the below fprintf end in a \r rather than a \n, so that the
progress message gets over-written once the checkpoint is done and we have
moved on?

    if (showprogress && !verbose)
        fprintf(stderr, "waiting for checkpoint\n");

That would seem more in keeping with how the other progress messages
operate.

Cheers,

Jeff
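[Editorial illustration of the \r versus \n question above.] The following is a minimal sketch and not the code that was pushed. The helper name format_status_line and its parameters are hypothetical. It shows only the mechanism under discussion: ending a status line with a carriage return moves the cursor back to column 0, so the next write to the terminal replaces the message, while a newline leaves it on screen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from pg_basebackup.c: format a status
 * line into buf. With overwrite = true the line ends in '\r' so that a
 * later write starts at column 0 and replaces it on a terminal; with
 * overwrite = false it ends in '\n' and stays visible.
 * Returns the value of snprintf (the length of the formatted line).
 */
static int
format_status_line(char *buf, size_t buflen, const char *msg, bool overwrite)
{
    assert(buf != NULL && buflen > 0 && msg != NULL);
    return snprintf(buf, buflen, "%s%c", msg, overwrite ? '\r' : '\n');
}
```

A caller would pass the result to fputs(buf, stderr). Note that a carriage return does not erase anything, so the line written next has to be at least as long as the message (or be padded) to cover it completely.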
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On 26 February 2017 at 20:55, Magnus Hagander wrote:
> What do others think?

Changing the output behaviour of a command isn't something we usually
do as a backpatch. This change doesn't affect the default behaviour so
probably wouldn't make a difference to the outcome of the situation
that generated this thread.

Having said that, if it helps others to avoid mistakes in the future
then it's worth doing, so +1 to backpatch.

I've looked into changing the actual underlying behaviour and I don't
think it's feasible, so making this change will at least allow some
responsiveness from us.

Thanks Michael, Magnus.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
Magnus Hagander writes:
> On Sun, Feb 26, 2017 at 9:59 PM, Tom Lane wrote:
>> Is there an argument for back-patching this?

> I'm considering it, but not swayed in either direction. Should I take your
> comment as a vote that we should back-patch it?

Yeah, I'd vote for it.

regards, tom lane
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Sun, Feb 26, 2017 at 9:59 PM, Tom Lane wrote:
> Magnus Hagander writes:
> > On Sun, Feb 26, 2017 at 8:27 PM, Michael Banck
> > <michael.ba...@credativ.de> wrote:
> >> ISTM the consensus is that there should be no output in regular mode,
> >> but a message should be displayed in verbose and progress mode.
>
> > Agreed, and applied as one patch.
>
> Is there an argument for back-patching this?

Seems you were typing that at the same time as we did.

I'm considering it, but not swayed in either direction. Should I take your
comment as a vote that we should back-patch it?

--
Magnus Hagander
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
Magnus Hagander writes:
> On Sun, Feb 26, 2017 at 8:27 PM, Michael Banck wrote:
>> ISTM the consensus is that there should be no output in regular mode,
>> but a message should be displayed in verbose and progress mode.

> Agreed, and applied as one patch.

Is there an argument for back-patching this?

regards, tom lane
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Sun, Feb 26, 2017 at 9:53 PM, Michael Banck wrote:
> Hi,
>
> On Sunday, 26 February 2017, at 21:32 +0100, Magnus Hagander wrote:
> > On Sun, Feb 26, 2017 at 8:27 PM, Michael Banck wrote:
> > Agreed, and applied as one patch. Except I noticed you also fixed a
> > couple of entries which were missing the progname in the messages -- I
> > broke those out to a separate patch instead.
>
> Thanks!
>
> > Made a small change to "using as much I/O as available" rather than
> > "as possible", which I think is a better wording, along with the
> > change of the idle wording I suggested before. (but feel free to point
> > it out to me if that's wrong).
>
> LGTM, I apparently missed your suggestion when I re-read the thread.
>
> I am just wondering whether this could/should be back-patched, maybe? It
> is not a bug fix, of course, but OTOH is rather small and probably
> helpful to some users on current releases.

Good point. We should definitely back-patch the documentation updates.

Not 100% sure about the others, as it's a small behaviour change. But since
it's only in verbose mode, I doubt it is very likely to break anybody's
scripts relying on certain output or so.

What do others think?

--
Magnus Hagander
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
Hi,

On Sunday, 26 February 2017, at 21:32 +0100, Magnus Hagander wrote:
> On Sun, Feb 26, 2017 at 8:27 PM, Michael Banck wrote:
> Agreed, and applied as one patch. Except I noticed you also fixed a
> couple of entries which were missing the progname in the messages -- I
> broke those out to a separate patch instead.

Thanks!

> Made a small change to "using as much I/O as available" rather than
> "as possible", which I think is a better wording, along with the
> change of the idle wording I suggested before. (but feel free to point
> it out to me if that's wrong).

LGTM, I apparently missed your suggestion when I re-read the thread.

I am just wondering whether this could/should be back-patched, maybe? It
is not a bug fix, of course, but OTOH is rather small and probably
helpful to some users on current releases.

Michael
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Sun, Feb 26, 2017 at 8:27 PM, Michael Banck wrote:
> Hi,
>
> On Tuesday, 14 February 2017, at 18:18 -0500, Robert Haas wrote:
> > On Tue, Feb 14, 2017 at 4:06 PM, Alvaro Herrera wrote:
> > > I'd rather have a --quiet mode instead. If you're running it by hand,
> > > you're likely to omit the switch, whereas when writing the cron job
> > > you're going to notice lack of switch even before you let the job run
> > > once.
> >
> > Well, that might've been a better way to design it, but changing it
> > now would break backward compatibility and I'm not really sure that's
> > a good idea. Even if it is, it's a separate concern from whether or
> > not in the less-quiet mode we should point out that we're waiting for
> > a checkpoint on the server side.
>
> ISTM the consensus is that there should be no output in regular mode,
> but a message should be displayed in verbose and progress mode.
>
> So I went forth and also added a message in progress mode (unless
> verbose messages are requested anyway).
>
> Regarding the documentation, I tried to clarify the difference between
> the checkpoint types a bit more, but I think any further action is
> probably a larger rewrite of the documentation on this topic.
>
> So attached are two patches, I've split it up into the documentation and
> the code output part. I'll add it as one commitfest entry in the
> "Clients" section though, as it's not really a big patch, unless
> somebody thinks it should have a second entry in "Documentation"?

Agreed, and applied as one patch. Except I noticed you also fixed a couple
of entries which were missing the progname in the messages -- I broke those
out to a separate patch instead.

Made a small change to "using as much I/O as available" rather than "as
possible", which I think is a better wording, along with the change of the
idle wording I suggested before. (but feel free to point it out to me if
that's wrong).

--
Magnus Hagander
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
Hi,

On Tuesday, 14 February 2017, at 18:18 -0500, Robert Haas wrote:
> On Tue, Feb 14, 2017 at 4:06 PM, Alvaro Herrera wrote:
> > I'd rather have a --quiet mode instead. If you're running it by hand,
> > you're likely to omit the switch, whereas when writing the cron job
> > you're going to notice lack of switch even before you let the job run
> > once.
>
> Well, that might've been a better way to design it, but changing it
> now would break backward compatibility and I'm not really sure that's
> a good idea. Even if it is, it's a separate concern from whether or
> not in the less-quiet mode we should point out that we're waiting for
> a checkpoint on the server side.

ISTM the consensus is that there should be no output in regular mode,
but a message should be displayed in verbose and progress mode.

So I went forth and also added a message in progress mode (unless
verbose messages are requested anyway).

Regarding the documentation, I tried to clarify the difference between
the checkpoint types a bit more, but I think any further action is
probably a larger rewrite of the documentation on this topic.

So attached are two patches, I've split it up into the documentation and
the code output part. I'll add it as one commitfest entry in the
"Clients" section though, as it's not really a big patch, unless
somebody thinks it should have a second entry in "Documentation"?

Michael

From bcbe19855f9f94eadf9e47a7f3b9a920a7f2a616 Mon Sep 17 00:00:00 2001
From: Michael Banck
Date: Sun, 26 Feb 2017 18:06:40 +0100
Subject: [PATCH 1/2] Documentation updates regarding checkpoints for basebackups.
Mention that fast and immediate checkpoints are the same, and add a
paragraph to the pg_basebackup documentation about the checkpoint taken
on the remote server.
---
 doc/src/sgml/backup.sgml            | 3 ++-
 doc/src/sgml/ref/pg_basebackup.sgml | 10 +-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 5f009ee..9485d87 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -862,7 +862,8 @@ SELECT pg_start_backup('label', false, false);
 ). This is usually what you want, because it minimizes the impact on
 query processing. If you want to start the backup as soon as
- possible, change the second parameter to true.
+ possible, change the second parameter to true, which will
+ issue an immediate checkpoint using as much I/O as possible.
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index c9dd62c..c197630 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -419,7 +419,7 @@ PostgreSQL documentation
 --checkpoint=fast|spread
-Sets checkpoint mode to fast or spread (default) (see ).
+Sets checkpoint mode to fast (immediate) or spread (default) (see ).
@@ -660,6 +660,14 @@ PostgreSQL documentation
 Notes
+ At the beginning of the backup, a checkpoint needs to be written on the
+ server the backup is taken from. Especially if the option
+ --checkpoint=fast is not used, this can take some time
+ during which pg_basebackup will be idle on the
+ server it is running on.
+
+
 The backup will include all files in the data directory and tablespaces,
 including the configuration files and any additional files placed in the
 directory by third parties, except certain temporary files managed by
--
2.1.4

From 1e4051dff9710382b6b4f63373a304c6ce70c4ac Mon Sep 17 00:00:00 2001
From: Michael Banck
Date: Sun, 26 Feb 2017 20:23:21 +0100
Subject: [PATCH 2/2] Mention initial checkpoint in pg_basebackup for verbose/progess output.

Before the actual data directory contents are streamed, a checkpoint is
taken on the remote server. Especially if no fast checkpoint is
requested, this can take quite a while during which the pg_basebackup
command apparently sits idle doing nothing.

To alert the user that work is being done on the remote server, mention
the checkpoint if verbose or progress output has been requested. As
pg_basebackup does not output anything during regular operation, no
additional output is printed in this case.

Also harmonize some other verbose messages in passing.
---
 src/bin/pg_basebackup/pg_basebackup.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index bc997dc..4b75e76 100644
---
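[Editorial illustration of the second commit message.] The code part of the patch is cut off in this archive, so the following is only a sketch of the rule the commit message describes: no output in regular mode, a notice in verbose mode, and a shorter notice in progress mode unless verbose output was requested anyway. The enum, the function names and the verbose wording are assumptions made for illustration. Only the string "waiting for checkpoint" appears in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical classification of the checkpoint notice; not from the patch. */
typedef enum
{
    NOTICE_NONE,        /* regular mode: stay silent */
    NOTICE_VERBOSE,     /* --verbose: full message */
    NOTICE_PROGRESS     /* --progress without --verbose: short message */
} CheckpointNotice;

/* Verbose output takes precedence over progress output. */
static CheckpointNotice
checkpoint_notice_kind(bool verbose, bool showprogress)
{
    if (verbose)
        return NOTICE_VERBOSE;
    if (showprogress)
        return NOTICE_PROGRESS;
    return NOTICE_NONE;
}

static const char *
checkpoint_notice_text(CheckpointNotice kind)
{
    switch (kind)
    {
        case NOTICE_VERBOSE:
            /* illustrative wording; the thread does not quote this message */
            return "initiating base backup, waiting for checkpoint to complete";
        case NOTICE_PROGRESS:
            /* wording quoted in the thread */
            return "waiting for checkpoint";
        case NOTICE_NONE:
            return "";
    }
    assert(0 && "unknown notice kind");
    return "";
}
```

Keeping the decision separate from the printing makes the "progress unless verbose" precedence easy to check on its own.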
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Sat, Feb 18, 2017 at 4:52 AM, Tomas Vondra wrote:
> I have my doubts about this actually addressing gitlab-like mistakes,
> though, because it's a helluva jump from "It's waiting and not doing
> anything," to "We need to remove the datadir." (One of the reasons being
> that non-empty directory is a local issue, and there's no reason why the
> tool should wait instead of just reporting an error.)

It's pretty clear that the gitlab postmortem involves multiple people
making multiple serious errors, including failing to test that the
ostensible backups could actually be restored. I was taught that rule
#1 as far as backups are concerned is to test that you can restore
them, so that seems like a big miss.

However, I don't think the fact they made other mistakes is a reason
not to improve the things we can improve and, certainly, having some
way for pg_basebackup to tell you that it's waiting for the master to
checkpoint will help the next person who is confused by that
particular thing. That person may go on to be confused by something
else, but then again maybe not. Improving the reporting in this case
stands on its own merits.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Fri, Feb 17, 2017 at 4:22 PM, Tomas Vondra wrote:
> What about adding a paragraph into pg_basebackup docs, explaining that
> with 'fast' it does immediate checkpoint, while with 'spread' it'll wait
> for a spread checkpoint.

I agree that a better, and self-contained, explanation of the behaviors
that fast and spread invoke on the server should be included directly in
the pg_basebackup docs.

Additionally, a primary benefit of pg_basebackup is hiding the low-level
details from the user and in that spirit the cross-reference link to
Section 25.3.3 "Making a Base Backup Using the Low Level API" should be
removed. If there is specific information there that a user of
pg_basebackup needs it should be presented properly in the application
documentation.

The top of pg_basebackup points to the entire 25.3 chapter but the flow
from there is solid - coverage of pg_basebackup occurs and points out the
low level API for those whose needs are not fully served by the bundled
application. If one uses pg_basebackup they should be able to stop at that
point, go back to the app page, and continue reading and skip all of
25.3.3.

The term "spread checkpoint" isn't actually a defined term in our
docs...and aside from the word spread itself describing how a checkpoint
works, it isn't used outside of pg_basebackup docs. So "it will wait for a
spread checkpoint" doesn't really work - "it will start and then wait for a
normal checkpoint to complete" does.

More holistically (i.e., feel free to skip)

This paragraph from 25.3.3:

"""
This is because it performs a checkpoint, and the I/O required for the
checkpoint will be spread out over a significant period of time, by default
half your inter-checkpoint interval (see the configuration parameter
checkpoint_completion_target). This is usually what you want, because it
minimizes the impact on query processing. If you want to start the backup
as soon as possible, change the second parameter to true.
"""

is good but buried and seems like it would be more visible in Chapter 30.
Reliability and the Write-Ahead Log. Both the internals and the base backup
pages could point the reader there. There isn't a chapter dedicated to
checkpoints - nor does there need to be - but a section in 30 seems
warranted as being the official reference. Right now you have to skim the
configuration variables and "WAL Configuration" and "CHECKPOINT" and "base
backup API and pg_basebackup" to cover everything.

A checkpoint chapter with that paragraph as a focus would allow the other
items to simply say "immediate or normal checkpoint" as needed and redirect
the reader for additional context as to the trade-offs of each - whether
done manually or during some form of backup script.

David J.
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On 02/17/2017 08:17 PM, Jim Nasby wrote:
> On 2/14/17 5:18 PM, Robert Haas wrote:
>> On Tue, Feb 14, 2017 at 4:06 PM, Alvaro Herrera wrote:
>>> I'd rather have a --quiet mode instead. If you're running it by hand,
>>> you're likely to omit the switch, whereas when writing the cron job
>>> you're going to notice lack of switch even before you let the job run
>>> once.
>>
>> Well, that might've been a better way to design it, but changing it
>> now would break backward compatibility and I'm not really sure that's
>
> Meh... it's really only going to affect cronjobs or scripts, which are
> easy enough to fix, and you're not going to have that many of them (or
> if you do you certainly have an automated way to push the update).

I think you're underestimating the breakage and overestimating how easy
it's going to be to fix it. It's true we'd only change this in a major
version, so people should assume possible breakage and test.

>> a good idea. Even if it is, it's a separate concern from whether or
>> not in the less-quiet mode we should point out that we're waiting for
>> a checkpoint on the server side.
>
> Well, --quiet was suggested because of confusion from pg_basebackup
> twiddling its thumbs...

I'm in favor of the '--verbose' route. People are used to that when
investigating issues, and it does not break existing cron jobs. I can
live with --quiet though, as long as we don't resort to some craziness
along the lines "if there's tty be verbose, otherwise be quiet".

I have my doubts about this actually addressing gitlab-like mistakes,
though, because it's a helluva jump from "It's waiting and not doing
anything," to "We need to remove the datadir." (One of the reasons being
that non-empty directory is a local issue, and there's no reason why the
tool should wait instead of just reporting an error.)

FWIW before messing with the pg_basebackup code, perhaps we should
improve the documentation and explain clearly the meaning of 'fast' and
'spread' checkpoint modes. Right now, pg_basebackup docs only say this:

    Sets checkpoint mode to fast or spread (default) (see Section 24.3.3).

which is pretty damn useless, when you're investigating an issue. And
the referenced section (Making a Base Backup Using the Low Level API)
does not clearly explain how this maps to pg_start_backup(_,?).

What about adding a paragraph into pg_basebackup docs, explaining that
with 'fast' it does immediate checkpoint, while with 'spread' it'll wait
for a spread checkpoint.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
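[Editorial illustration of the mapping asked about above.] A rough sketch, assuming the three-argument pg_start_backup call shown in the documentation excerpt quoted elsewhere in this thread, where the second parameter set to true requests an immediate checkpoint. The helper names are hypothetical. pg_basebackup itself requests the backup over the replication protocol rather than by issuing this SQL, so the snippet only shows how the two --checkpoint values correspond to the low-level API call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: translate a --checkpoint argument into the
 * boolean used as the second pg_start_backup() parameter.
 * Returns false if the argument is neither "fast" nor "spread".
 */
static bool
parse_checkpoint_mode(const char *arg, bool *fast)
{
    assert(arg != NULL && fast != NULL);
    if (strcmp(arg, "fast") == 0)
    {
        *fast = true;           /* immediate checkpoint */
        return true;
    }
    if (strcmp(arg, "spread") == 0)
    {
        *fast = false;          /* checkpoint I/O spread over time (default) */
        return true;
    }
    return false;
}

/*
 * Hypothetical helper: format the equivalent low-level API call.
 * The label is not escaped here; the third argument is left as in the
 * documentation excerpt.
 */
static int
format_start_backup_sql(char *buf, size_t buflen, const char *label, bool fast)
{
    assert(buf != NULL && buflen > 0 && label != NULL);
    return snprintf(buf, buflen, "SELECT pg_start_backup('%s', %s, false);",
                    label, fast ? "true" : "false");
}
```

With "spread" the formatted statement matches the default call from the documentation excerpt, and with "fast" only the second argument changes.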
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On 2/14/17 5:18 PM, Robert Haas wrote:
> On Tue, Feb 14, 2017 at 4:06 PM, Alvaro Herrera wrote:
>> I'd rather have a --quiet mode instead. If you're running it by hand,
>> you're likely to omit the switch, whereas when writing the cron job
>> you're going to notice lack of switch even before you let the job run
>> once.
>
> Well, that might've been a better way to design it, but changing it
> now would break backward compatibility and I'm not really sure that's

Meh... it's really only going to affect cronjobs or scripts, which are
easy enough to fix, and you're not going to have that many of them (or
if you do you certainly have an automated way to push the update).

> a good idea. Even if it is, it's a separate concern from whether or
> not in the less-quiet mode we should point out that we're waiting for
> a checkpoint on the server side.

Well, --quiet was suggested because of confusion from pg_basebackup
twiddling its thumbs...

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Tue, Feb 14, 2017 at 9:06 AM, Magnus Hagander wrote:
> Yeah, that's my view as well. I'm all for including it in verbose mode.
>
> *Iff* we can get a progress indicator through the checkpoint we could
> include that in --progress mode. But that's a different patch, of course,
> but it shouldn't be included in the default output even if we find it.

I think it should show up in --progress mode. It would be great if we
could show fine-grained progress reports on the checkpoint, but if we can't
do that we should still report as fine as we are able to, which is that a
checkpoint is in progress. Otherwise we are setting the perfect as the
enemy of the good.

Cheers,

Jeff
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Tue, Feb 14, 2017 at 4:06 PM, Alvaro Herrera wrote:
> I'd rather have a --quiet mode instead. If you're running it by hand,
> you're likely to omit the switch, whereas when writing the cron job
> you're going to notice lack of switch even before you let the job run
> once.

Well, that might've been a better way to design it, but changing it
now would break backward compatibility and I'm not really sure that's
a good idea. Even if it is, it's a separate concern from whether or
not in the less-quiet mode we should point out that we're waiting for
a checkpoint on the server side.

--
Robert Haas
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
Robert Haas wrote:
> On Tue, Feb 14, 2017 at 12:06 PM, Magnus Hagander wrote:
> > However, outputting this info by default will make it show up in things like
> > everybody's cronjobs by default. Right now a successful pg_basebackup run
> > will come out with no output at all, which is how most Unix commands work,
> > and brings its own advantages. If we change that people will have to send
> > all the output to /dev/null, resulting in missing the things that are
> > actually important in any regard.
>
> I agree with that. I think having this show up in verbose mode is a
> really good idea - when something just hangs, users don't know what's
> going on, and that's bad. But showing it all the time seems like a
> bridge too far. As the postmortem linked above shows, people will
> think of things like "hey, let's try --verbose mode" when the obvious
> thing doesn't work. What is really irritating to them is when
> --verbose mode fails to be, uh, verbose.

I'd rather have a --quiet mode instead. If you're running it by hand,
you're likely to omit the switch, whereas when writing the cron job
you're going to notice lack of switch even before you let the job run
once.

I think progress reporting ought to go to stderr anyway.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Tue, Feb 14, 2017 at 12:06 PM, Magnus Hagander wrote:
> However, outputting this info by default will make it show up in things like
> everybody's cron jobs by default. Right now a successful pg_basebackup run
> will come out with no output at all, which is how most Unix commands work,
> and brings its own advantages. If we change that, people will have to send
> all the output to /dev/null, resulting in missing the things that are
> actually important in any regard.

I agree with that. I think having this show up in verbose mode is a really good idea - when something just hangs, users don't know what's going on, and that's bad. But showing it all the time seems like a bridge too far. As the postmortem linked above shows, people will think of things like "hey, let's try --verbose mode" when the obvious thing doesn't work. What is really irritating to them is when --verbose mode fails to be, uh, verbose.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
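To make the behaviour under discussion concrete, here is a minimal sketch of a status helper that stays silent by default and only reports when verbose mode is on. The name status_message and its signature are assumptions made for this illustration; it is not code from pg_basebackup.c, which calls fprintf(stderr, ...) directly behind an "if (verbose)" check.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of pg_basebackup): writes
 * "<progname>: <msg>\n" to the given stream, but only in verbose mode.
 * Returns the number of characters written, or 0 when nothing was printed.
 */
static int
status_message(FILE *stream, int verbose, const char *progname, const char *msg)
{
    assert(stream != NULL && progname != NULL && msg != NULL);

    if (!verbose)
        return 0;               /* default mode: no output, cron stays quiet */

    return fprintf(stream, "%s: %s\n", progname, msg);
}
```

Called as status_message(stderr, verbose, "pg_basebackup", "checkpoint completed"), a successful non-verbose run still prints nothing, so a cron job mails only real errors, while an interactive user who adds --verbose sees what the program is waiting for.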
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Mon, Feb 13, 2017 at 10:33 AM, Michael Banck wrote:
> Hi,
>
> On Monday, 13.02.2017, at 09:31 +0100, Magnus Hagander wrote:
> > On Mon, Feb 13, 2017 at 3:29 AM, Jim Nasby wrote:
> > > On 2/11/17 4:36 AM, Michael Banck wrote:
> > > > I guess you're right, I've moved it further down. There is in fact a
> > > > message about the xlog location (unless you switch off wal entirely),
> > > > but having another one right before that mentioning the completed
> > > > checkpoint sounds ok to me.
> > >
> > > 1) I don't think this should be verbose output. Having a program sit
> > > there "doing nothing" for no apparent reason is just horrible UI design.
> >
> > That would include much of Unix then.. For example if I run "cp" on a
> > large file it sits around "doing nothing". Same if I do "tar". No?
>
> The expectation for all three commands is that, even if there is no
> output on stdout, they will write data to the local machine. So you can
> easily monitor the progress of cp and tar by running du or something in
> a different terminal.
>
> With pg_basebackup, nothing is happening on the local machine until the
> checkpoint on the remote is finished; while this is obvious to somebody
> familiar with how basebackups work internally, it appears to be not
> clear at all to some users.

True. However, outputting this info by default will make it show up in things like everybody's cron jobs by default. Right now a successful pg_basebackup run will come out with no output at all, which is how most Unix commands work, and brings its own advantages. If we change that, people will have to send all the output to /dev/null, resulting in missing the things that are actually important in any regard.

> So I think notifying the user that something is happening remotely while
> the local process waits would be useful, but on the other hand,
> pg_basebackup does not print anything unless (i) --verbose is set or
> (ii) there is an error, so I think having it mention the checkpoint in
> --verbose mode only is acceptable.

Yeah, that's my view as well. I'm all for including it in verbose mode.

*Iff* we can get a progress indicator through the checkpoint we could include that in --progress mode. That's a different patch, of course, and even if we find a way it shouldn't be included in the default output.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
Hi,

On Monday, 13.02.2017, at 09:31 +0100, Magnus Hagander wrote:
> On Mon, Feb 13, 2017 at 3:29 AM, Jim Nasby wrote:
> > On 2/11/17 4:36 AM, Michael Banck wrote:
> > > I guess you're right, I've moved it further down. There is in fact a
> > > message about the xlog location (unless you switch off wal entirely),
> > > but having another one right before that mentioning the completed
> > > checkpoint sounds ok to me.
> >
> > 1) I don't think this should be verbose output. Having a program sit
> > there "doing nothing" for no apparent reason is just horrible UI design.
>
> That would include much of Unix then.. For example if I run "cp" on a
> large file it sits around "doing nothing". Same if I do "tar". No?

The expectation for all three commands is that, even if there is no output on stdout, they will write data to the local machine. So you can easily monitor the progress of cp and tar by running du or something in a different terminal.

With pg_basebackup, nothing is happening on the local machine until the checkpoint on the remote is finished; while this is obvious to somebody familiar with how basebackups work internally, it appears to be not clear at all to some users.

So I think notifying the user that something is happening remotely while the local process waits would be useful, but on the other hand, pg_basebackup does not print anything unless (i) --verbose is set or (ii) there is an error, so I think having it mention the checkpoint in --verbose mode only is acceptable.

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
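The "watch it from another terminal" approach mentioned above can be sketched in C with a plain stat() call. This is only an illustration: file_size_bytes is a name made up here, and it reports the apparent file size (st_size), which is close to but not the same as the block usage that du prints.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns the apparent size of 'path' in bytes,
 * or -1 if the file cannot be examined (for example, it does not exist yet).
 */
static long long
file_size_bytes(const char *path)
{
    struct stat st;

    assert(path != NULL);

    if (stat(path, &st) != 0)
        return -1;

    return (long long) st.st_size;
}
```

Polling such a function on the target of a cp or tar shows the number growing. For a pg_basebackup target it would show nothing changing until the remote checkpoint has finished, which is exactly the situation that confused people.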
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Mon, Feb 13, 2017 at 3:29 AM, Jim Nasby wrote:
> On 2/11/17 4:36 AM, Michael Banck wrote:
> > I guess you're right, I've moved it further down. There is in fact a
> > message about the xlog location (unless you switch off wal entirely),
> > but having another one right before that mentioning the completed
> > checkpoint sounds ok to me.
>
> 1) I don't think this should be verbose output. Having a program sit there
> "doing nothing" for no apparent reason is just horrible UI design.

That would include much of Unix then.. For example if I run "cp" on a large file it sits around "doing nothing". Same if I do "tar". No?

> 2) I think it'd be useful to have a way to get the status of a running
> checkpoint. The checkpointer already has that info, and I think it might
> even be in shared memory already. If there was a function that reported
> checkpoint status pg_basebackup could poll that to provide users with live
> status. That should be a separate patch though.

I agree that this would definitely be useful. But it might be something that's better exposed as a server-side view?

(and if pg_basebackup could poll it it would probably still not be included by default -- only if -P was given).

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On 2/11/17 4:36 AM, Michael Banck wrote:
> I guess you're right, I've moved it further down. There is in fact a
> message about the xlog location (unless you switch off wal entirely),
> but having another one right before that mentioning the completed
> checkpoint sounds ok to me.

1) I don't think this should be verbose output. Having a program sit there "doing nothing" for no apparent reason is just horrible UI design.

2) I think it'd be useful to have a way to get the status of a running checkpoint. The checkpointer already has that info, and I think it might even be in shared memory already. If there was a function that reported checkpoint status pg_basebackup could poll that to provide users with live status. That should be a separate patch though.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
Hi,

On Saturday, 11.02.2017, at 11:25 +0100, Michael Banck wrote:
> On Saturday, 11.02.2017, at 11:07 +0100, Magnus Hagander wrote:
> > As for the code, while I haven't tested it, isn't the "checkpoint
> > completed" message in the wrong place? Doesn't PQsendQuery() complete
> > immediately, and the check needs to be put *after* the PQgetResult()
> > call?
>
> I guess you're right, I've moved it further down. There is in fact a
> message about the xlog location (unless you switch off wal entirely),
> but having another one right before that mentioning the completed
> checkpoint sounds ok to me.
>
> There are also some inconsistencies around which messages are prepended
> with "pg_basebackup: " and which are translatable; I guess all messages
> printed on --verbose should be translatable? Also, as almost all
> messages have a "pg_basebackup: " prefix, I've added it to the rest.

Sorry, there were two typos in the last patch; I've attached a fixed one.

Michael

diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index c9dd62c..a298e5c 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -660,6 +660,14 @@ PostgreSQL documentation
 Notes

+ At the beginning of the backup, a checkpoint needs to be written on the
+ server the backup is taken from. Especially if the option
+ --checkpoint=fast is not used, this can take some time
+ during which pg_basebackup will be idle on the
+ server it is running on.
+
+
+
 The backup will include all files in the data directory and tablespaces,
 including the configuration files and any additional files placed in the
 directory by third parties, except certain temporary files managed by
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index b6463fa..60200a9 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1754,6 +1754,11 @@ BaseBackup(void)
     if (maxrate > 0)
         maxrate_clause = psprintf("MAX_RATE %u", maxrate);

+    if (verbose)
+        fprintf(stderr,
+                _("%s: initiating base backup, waiting for checkpoint to complete\n"),
+                progname);
+
     basebkp = psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s",
                  escaped_label,
@@ -1791,6 +1796,9 @@ BaseBackup(void)

     strlcpy(xlogstart, PQgetvalue(res, 0, 0), sizeof(xlogstart));

+    if (verbose)
+        fprintf(stderr, _("%s: checkpoint completed\n"), progname);
+
     /*
      * 9.3 and later sends the TLI of the starting point. With older servers,
      * assume it's the same as the latest timeline reported by
@@ -1804,8 +1812,8 @@ BaseBackup(void)
     MemSet(xlogend, 0, sizeof(xlogend));

     if (verbose && includewal != NO_WAL)
-        fprintf(stderr, _("transaction log start point: %s on timeline %u\n"),
-                xlogstart, starttli);
+        fprintf(stderr, _("%s: transaction log start point: %s on timeline %u\n"),
+                progname, xlogstart, starttli);

     /*
      * Get the header
@@ -1907,7 +1915,7 @@ BaseBackup(void)
     }
     strlcpy(xlogend, PQgetvalue(res, 0, 0), sizeof(xlogend));
     if (verbose && includewal != NO_WAL)
-        fprintf(stderr, "transaction log end point: %s\n", xlogend);
+        fprintf(stderr, _("%s: transaction log end point: %s\n"), progname, xlogend);
     PQclear(res);

     res = PQgetResult(conn);
@@ -2048,7 +2056,7 @@ BaseBackup(void)
     }

     if (verbose)
-        fprintf(stderr, "%s: base backup completed\n", progname);
+        fprintf(stderr, _("%s: base backup completed\n"), progname);
 }
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
Hi,

On Saturday, 11.02.2017, at 11:07 +0100, Magnus Hagander wrote:
> As for the code, while I haven't tested it, isn't the "checkpoint
> completed" message in the wrong place? Doesn't PQsendQuery() complete
> immediately, and the check needs to be put *after* the PQgetResult()
> call?

I guess you're right, I've moved it further down. There is in fact a message about the xlog location (unless you switch off wal entirely), but having another one right before that mentioning the completed checkpoint sounds ok to me.

There are also some inconsistencies around which messages are prepended with "pg_basebackup: " and which are translatable; I guess all messages printed on --verbose should be translatable? Also, as almost all messages have a "pg_basebackup: " prefix, I've added it to the rest.

Michael

diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index c9dd62c..a298e5c 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -660,6 +660,14 @@ PostgreSQL documentation
 Notes

+ At the beginning of the backup, a checkpoint needs to be written on the
+ server the backup is taken from. Especially if the option
+ --checkpoint=fast is not used, this can take some time
+ during which pg_basebackup will be idle on the
+ server it is running on.
+
+
+
 The backup will include all files in the data directory and tablespaces,
 including the configuration files and any additional files placed in the
 directory by third parties, except certain temporary files managed by
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index b6463fa..874b6d6 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1754,6 +1754,11 @@ BaseBackup(void)
     if (maxrate > 0)
         maxrate_clause = psprintf("MAX_RATE %u", maxrate);

+    if (verbose)
+        fprintf(stderr,
+                _("%s: initiating base backup, waiting for checkpoint to complete\n"),
+                progname);
+
     basebkp = psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s",
                  escaped_label,
@@ -1791,6 +1796,9 @@ BaseBackup(void)

     strlcpy(xlogstart, PQgetvalue(res, 0, 0), sizeof(xlogstart));

+    if (verbose)
+        fprintf(stderr, _("%s: checkpoint completed\n"), progname);
+
     /*
      * 9.3 and later sends the TLI of the starting point. With older servers,
      * assume it's the same as the latest timeline reported by
@@ -1804,8 +1812,8 @@ BaseBackup(void)
     MemSet(xlogend, 0, sizeof(xlogend));

     if (verbose && includewal != NO_WAL)
-        fprintf(stderr, _("transaction log start point: %s on timeline %u\n"),
-                xlogstart, starttli);
+        fprintf(stderr, _("%s: transaction log start point: %s on timeline %u\n"),
+                progname, xlogstart, starttli);

     /*
      * Get the header
@@ -1907,7 +1915,7 @@ BaseBackup(void)
     }
     strlcpy(xlogend, PQgetvalue(res, 0, 0), sizeof(xlogend));
     if (verbose && includewal != NO_WAL)
-        fprintf(stderr, "transaction log end point: %s\n", xlogend);
+        fprintf(stderr, _("%s: transaction log end point: %s\n", progname, xlogend);
     PQclear(res);

     res = PQgetResult(conn);
@@ -2048,7 +2056,7 @@ BaseBackup(void)
     }

     if (verbose)
-        fprintf(stderr, "%s: base backup completed\n", progname);
+        fprintf(stderr, _("%s: base backup completed\n)", progname);
 }
Re: [HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
On Sat, Feb 11, 2017 at 10:38 AM, Michael Banck wrote:
> Hi,
>
> one take-away from the Gitlab Post-Mortem[1] appears to be that after
> their secondary lost replication, they were confused about what
> pg_basebackup was doing when they tried to rebuild it. It just sat there
> and did nothing (even with --verbose), so they assumed something was
> wrong with either the primary or the connection, and restarted it
> several times.
>
> AFAICT, it turns out the checkpoint was written on the master (they
> probably did not use -c fast), but this wasn't obvious to them:

Yeah, I've seen this happen to a number of people. I think that sounds like what's happened here as well. I've considered things along the lines of the patch you posted, but never got around to actually doing anything about it.

> ISTM that even with WAL streaming, nothing would be written on the
> client server until the checkpoint is complete, as do_pg_start_backup()
> runs the checkpoint and only returns the starting WAL location
> afterwards.
>
> The attached (untested) patch is to kick off a discussion on how to
> improve the situation. It is supposed to mention the checkpoint when
> --verbose is used and adds a paragraph about the checkpoint being run to
> the Notes section of the documentation.

Docs look good to me, other than claiming that pg_basebackup runs on a server (it can run anywhere). I would just say "during which pg_basebackup will appear idle". How does that sound to you?

As for the code, while I haven't tested it, isn't the "checkpoint completed" message in the wrong place? Doesn't PQsendQuery() complete immediately, and the check needs to be put *after* the PQgetResult() call?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
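The ordering concern raised above can be shown with a small self-contained sketch. Everything in it is a stand-in invented for illustration: FakeConn, fake_send_query and fake_get_result are not libpq calls; they only mimic the pattern where sending the command returns at once and fetching the result is what waits for the server. Status lines are collected in a buffer instead of being printed, so the order is easy to inspect.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical connection state; NOT a libpq structure. */
typedef struct
{
    int query_sent;         /* set as soon as the command is queued */
    int checkpoint_done;    /* set only once the result has been fetched */
} FakeConn;

/* Mimics an asynchronous send: queues the command and returns immediately. */
static int
fake_send_query(FakeConn *conn)
{
    conn->query_sent = 1;
    return 1;
}

/* Mimics a blocking result fetch: returns after the server-side checkpoint. */
static int
fake_get_result(FakeConn *conn)
{
    if (!conn->query_sent)
        return 0;
    conn->checkpoint_done = 1;
    return 1;
}

/* Appends one status line to 'log', truncating if the buffer is full. */
static void
log_line(char *log, size_t size, const char *msg)
{
    size_t used;

    assert(log != NULL && msg != NULL);
    used = strlen(log);
    if (used < size)
        snprintf(log + used, size - used, "%s\n", msg);
}

/* Starts the (fake) backup and records the verbose messages in order. */
static int
start_backup(FakeConn *conn, int verbose, char *log, size_t size)
{
    if (verbose)
        log_line(log, size, "waiting for checkpoint");

    if (!fake_send_query(conn))
        return 0;

    /* Too early to report completion: checkpoint_done is still 0 here. */

    if (!fake_get_result(conn))
        return 0;

    /* Only now has the server finished, so the message belongs here. */
    if (verbose)
        log_line(log, size, "checkpoint completed");

    return 1;
}
```

In this sketch the completion message is emitted only after the result fetch has returned; placing it directly after the send would announce a checkpoint that has not yet happened.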
[HACKERS] gitlab post-mortem: pg_basebackup waiting for checkpoint
Hi,

one take-away from the Gitlab Post-Mortem[1] appears to be that after their secondary lost replication, they were confused about what pg_basebackup was doing when they tried to rebuild it. It just sat there and did nothing (even with --verbose), so they assumed something was wrong with either the primary or the connection, and restarted it several times.

AFAICT, it turns out the checkpoint was written on the master (they probably did not use -c fast), but this wasn't obvious to them:

"One of the engineers went to the secondary and wiped the data directory, then ran pg_basebackup. Unfortunately pg_basebackup would hang, producing no meaningful output, despite the --verbose option being set."

[...]

"Unfortunately this did not resolve the problem of pg_basebackup not starting replication immediately. One of the engineers decided to run it with strace to see what it was blocking on. strace showed that pg_basebackup was hanging in a poll call, but that did not provide any other meaningful information that might have explained why."

[...]

"It would later be revealed by another engineer (who wasn't around at the time) that this is normal behavior: pg_basebackup will wait for the primary to start sending over replication data and it will sit and wait silently until that time. Unfortunately this was not clearly documented in our engineering runbooks nor in the official pg_basebackup document."

ISTM that even with WAL streaming, nothing would be written on the client server until the checkpoint is complete, as do_pg_start_backup() runs the checkpoint and only returns the starting WAL location afterwards.

The attached (untested) patch is to kick off a discussion on how to improve the situation. It is supposed to mention the checkpoint when --verbose is used and adds a paragraph about the checkpoint being run to the Notes section of the documentation.

Michael

[1] https://about.gitlab.com/2017/02/10/postmortem-of-database-outage-of-january-31/

diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index c9dd62c..a298e5c 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -660,6 +660,14 @@ PostgreSQL documentation
 Notes

+ At the beginning of the backup, a checkpoint needs to be written on the
+ server the backup is taken from. Especially if the option
+ --checkpoint=fast is not used, this can take some time
+ during which pg_basebackup will be idle on the
+ server it is running on.
+
+
+
 The backup will include all files in the data directory and tablespaces,
 including the configuration files and any additional files placed in the
 directory by third parties, except certain temporary files managed by
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index b6463fa..ae18c16 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1754,6 +1754,9 @@ BaseBackup(void)
     if (maxrate > 0)
         maxrate_clause = psprintf("MAX_RATE %u", maxrate);

+    if (verbose)
+        fprintf(stderr, "%s: initiating base backup, waiting for checkpoint to complete\n", progname);
+
     basebkp = psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s",
                  escaped_label,
@@ -1771,6 +1774,9 @@ BaseBackup(void)
         disconnect_and_exit(1);
     }

+    if (verbose)
+        fprintf(stderr, "%s: checkpoint completed\n", progname);
+
     /*
      * Get the starting xlog position
      */