Sorry for the absense. I've reached here. At Thu, 21 Jul 2016 12:20:30 +0900, Michael Paquier <michael.paqu...@gmail.com> wrote in <cab7npqr+krkk9fqwqusjna2ob-rcq6mi8ybz3wl0kcob5ss...@mail.gmail.com> > On Thu, Jul 21, 2016 at 11:56 AM, Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Thu, Jul 21, 2016 at 7:28 AM, Michael Paquier > > <michael.paqu...@gmail.com> wrote: > >> On Wed, Jul 20, 2016 at 8:56 PM, Michael Paquier > >> <michael.paqu...@gmail.com> wrote: > >>>> > >>>> Yeah, I think that is totally different angle to fix this issue, so > >>>> don't you think it is better to start a separate thread to discuss > >>>> about it for 10.0 and mark this patch as ready for committer. > >>> > >>> I'd like to tackle this problem in 10.0, but that will strongly depend > >>> on how my patches move on in CF1 and CF2. > >> > >> By the way, thank you for taking the time to provide input. I think > >> we're in good shape here now. > >> > > > > So, if I understand correctly, then we can mark the version posted by > > you upthread [1] which includes a test along with Kyotaro's fix can be > > marked as Ready for committer. If so, then please change the status > > of patch accordingly. > > Oops. I thought you did it already. So done.
Thank you very much for the intensive discussion on this. After some additional consideration, I attached two patches about comment and documentation. 0001- is a patch related to the previous patch that adds description on why we use replay end point instead of minRecoveryPoint. And 0002- is a fix and an additional mention on what would happen when pg_control is not copied last during backup. ==== About the first patch. Related to the previous patch, I found the following comment just above where it applies. xlog.c:10412 - * We return the current minimum recovery point as the backup end - * location. Note that it can be greater than the exact backup end - * location if the minimum recovery point is updated after the backup of - * pg_control. This is harmless for current uses. I haven't gave a notice on this but this seems to be necessary to edit so as to mention this fix and the gap like the following. + * minRecoveryPoint can go behind the last checkpoint's redo location when + * the checkpoint writes out no buffer. This does no harm to performing a + * recovery but such inversion seems inconsistent from the view of the + * callers and prevents them from knowing WAL segments needed by sane + * calcuation. For the reason we return the last replayed point as the + * backup end location. Note that it can be greater than the exact backup + * end location or even the minimum recovery point of pg_control at the + * time. This is harmless for current uses. ==== About the second patch. By the way, related to the following discussion in the upthread, At Tue, 19 Jul 2016 14:13:36 +0900, Michael Paquier <michael.paqu...@gmail.com> wrote in <cab7npqrxmkr2rq7_wa1azzqbfvruwu_5qtbr_keqqwgueja...@mail.gmail.com> > The thing that is really annoying btw is that there will be always a > gap between minRecoveryPoint and the actual moment where a backup > finishes because there is no way to rely on the XLOG_BACKUP_END > record. On top of that we can not be sure if pg_control has been > backed up last or not. Which is why it would be cool to document that > gap. Even with out my previous patch, the gap between minRecoveryPoint *when pg_control is backed up* and that (or the replay end) at the end of backup is crucial and should be back-patched. Addition to that, while I examined the documentation I found the following description about pg_stop_backup in continuous-archiving.html, which contradicts to its definition. > In the same connection as before, issue the command: > > SELECT * FROM pg_stop_backup(false); ... > The pg_stop_backup will return one row with three values. AFAICS pg_stop_backup returns a single value of LSN. I don't know where this comes from, but the attached patch fixes this and adds a mention on the "gap". The original description mentioned backup_label and tablespace_map but it seems not necessary. The following is the new content rewritten by the patch. | The pg_stop_backup will return the LSN when it is called. This | LSN informs upto where the backup needs WAL segment files to | complete. For a backup taken from a master, this function leaves | a backup history file to inform the segment files needed and a | server starts from the backup performes a recovery up to where a | backup-end WAL record to reach a consistent state. But things are | different for a backup taken from a standby. For the case, backup | history file won't be created and recvoery is perfomed up to the | Minimum-recovery-ending-location field in the pg_control file | included in the backup instead. The location is properly stored | in the backup if you copy the the pg_control file after all the | other files as pg_basebackup does. Note that a hot standby | started from a backup not taken in that manner will consider to | have reached a consistent state mistakenly earlier. | | Once the WAL segment files active during the backup are archived, | you are done. The file contains the LSN returned by | pg_stop_backup is the last segment that is required to form a | complete set of backup files. The "LSN" in the first line is intentionally ambiguated. It may seems too much to write here. > Another crazy idea would be to return pg_control as an extra > return field of pg_stop_backup() and encourage users to write that > back in the backup itself. This would allow closing any hole in the > current logic for backups taken from live standbys: minRecoveryPoint > would be updated directly to the last replayed LSN/TLI in the control > file. It seems to be doable at do_pg_*_backup level. non-exclusive backup code does the same thing to backup_label but no usage seen. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
>From 74fc54085d20955cef4d5b7ce4b49d10351b1fbd Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horiguchi.kyot...@lab.ntt.co.jp> Date: Thu, 21 Jul 2016 15:45:04 +0900 Subject: [PATCH 1/2] Add the reason why we use replayEndRecPtr as recovery-end point. --- src/backend/access/transam/xlog.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index aecede1..13ef75f 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -10409,10 +10409,14 @@ do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p) * required files are available, a user should wait for them to be * archived, or include them into the backup. * - * We return the current minimum recovery point as the backup end - * location. Note that it can be greater than the exact backup end - * location if the minimum recovery point is updated after the backup of - * pg_control. This is harmless for current uses. + * minRecoveryPoint can go behind the last checkpoint's redo location when + * the checkpoint writes out no buffer. This does no harm to performing a + * recovery but such inversion seems inconsistent from the view of the + * callers and prevents them from knowing WAL segments needed by sane + * calcuation. For the reason we return the last replayed point as the + * backup end location. Note that it can be greater than the exact backup + * end location or even the minimum recovery point of pg_control at the + * time. This is harmless for current uses. * * XXX currently a backup history file is for informational and debug * purposes only. It's not essential for an online backup. Furthermore, -- 1.8.3.1
>From 756daf723fb57c565e39c02dda235514e3bb7577 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horiguchi.kyot...@lab.ntt.co.jp> Date: Thu, 21 Jul 2016 15:43:15 +0900 Subject: [PATCH 2/2] Fix documentation about pg_stop_backup. In the description about continuous archiving, pg_stop_backup is wrongly described. This fixes that. Addition to that, adding a mention on what happens if pg_control is not backup last. --- doc/src/sgml/backup.sgml | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml index 0f09d82..c4c182d 100644 --- a/doc/src/sgml/backup.sgml +++ b/doc/src/sgml/backup.sgml @@ -894,20 +894,27 @@ SELECT * FROM pg_stop_backup(false); ready to archive. </para> <para> - The <function>pg_stop_backup</> will return one row with three - values. The second of these fields should be written to a file named - <filename>backup_label</> in the root directory of the backup. The - third field should be written to a file named - <filename>tablespace_map</> unless the field is empty. These files are - vital to the backup working, and must be written without modification. + The <function>pg_stop_backup</> will return the LSN when it is called. + This LSN informs upto where the backup needs WAL segment files to + complete. For a backup taken from a master, this function leaves a backup + history file to inform the segment files needed and a server starts from + the backup performes a recovery up to where a backup-end WAL record to + reach a consistent state. But things are different for a backup taken + from a standby. For the case, backup history file won't be created and + recvoery is perfomed up to the Minimum-recovery-ending-location field in + the pg_control file included in the backup instead. The location is + properly stored in the backup if you copy the the pg_control file after + all the other files as pg_basebackup does. Note that a hot standby + started from a backup not taken in that manner will consider to have + reached a consistent state mistakenly earlier. </para> </listitem> <listitem> <para> Once the WAL segment files active during the backup are archived, you are - done. The file identified by <function>pg_stop_backup</>'s first return - value is the last segment that is required to form a complete set of - backup files. If <varname>archive_mode</> is enabled, + done. The file contains the LSN returned by <function>pg_stop_backup</> + is the last segment that is required to form a complete set of backup + files. If <varname>archive_mode</> is enabled, <function>pg_stop_backup</> does not return until the last segment has been archived. Archiving of these files happens automatically since you have -- 1.8.3.1
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers