On Mon, Jul 10, 2017 at 11:56 AM, Michael Paquier <michael.paqu...@gmail.com> wrote: > On Sat, Jul 8, 2017 at 12:50 AM, Masahiko Sawada <sawada.m...@gmail.com> > wrote: >> On Fri, Jul 7, 2017 at 3:48 PM, Michael Paquier >> <michael.paqu...@gmail.com> wrote: >>> So I would suggest the following things to address this issue: >>> 1) Generate a backup history file for backups taken at recovery as well. >>> 2) Archive it if archive_mode = always. >>> 3) Wait for both the segment of the stop point and the backup history >>> files to be archived before returning back to the caller of >>> pg_stop_backup. >>> >>> It would be nice to get all that addressed in 10. Thoughts? >> >> Yeah, I agree. Attached patch makes it works and deals with the >> history file issue. > > I had a look at the proposed patch. Here are some comments.
Thank you for reviewing the patch! > > @@ -11002,10 +11000,10 @@ do_pg_stop_backup(char *labelfile, bool > waitforarchive, TimeLineID *stoptli_p) > if (waitforarchive && XLogArchivingActive()) > { > XLByteToPrevSeg(stoppoint, _logSegNo); > - XLogFileName(lastxlogfilename, ThisTimeLineID, _logSegNo); > + XLogFileName(lastxlogfilename, stoptli, _logSegNo); > > On a standby the wait phase should not happen if archive_mode = on, > but only if archive_mode = always. So I would suggest to change this > upper condition a bit, and shuffle a bit the code to make the wait > phase happen last: > 1) stoptli_p first. > 2) Check for XLogArchivingActive or XLogArchivingAlways, then with the > NOTICE message. > 3) Do the actual wait. Thank you for the suggestion. Fixed. > This way the code doing the wait does not need to be in its long > lengthy if() branch. I think that we should replace the pg_usleep() > call with a latch to make this more responsive. That should be a > future patch. Yeah, I agree. > In backup history files generated in standbys, the STOP TIME is not > set and this results in garbage in the file. Fixed. > + If second parameter is true and on standby, <function>pg_stop_backup</> > + waits for WAL to be archived without forcibly switching WAL on standby. > + So enforcing manually a WAL switch on primary needs to happen. > Here is a reformulation: > If the second parameter wait_for_archive is true and the backup is > taken on a standby, pg_stop_backup waits for WAL to be archived when > archive_mode = always. Enforcing manually a WAL segment switch to > happen with for example pg_switch_wal() may be necessary if the > primary has low activity to allow the backup to complete. Using > statement_timeout to limit the amount of time to wait or switching > wait_for_archive to false will control the wait time, though all the > WAL segments necessary to recover into a consistent state from the > backup taken may not be archived at the time pg_stop_backup returns > its status to the caller. Thanks, fixed. > The errhint() for the wait phase should be reworked for a standby. I > would suggest for errmsg() the same thing, aka: > pg_stop_backup still waiting for all required WAL segments to be > archived (%d seconds elapsed) > But the following for a backup started in recovery. That's long but > things need to be really clear to the user: > Backup has been taken from a standby, check if the WAL segments needed > for this backup have been completed, in which case a forced segment > switch may can be needed on the primary. If a segment switch has > already happened check that your archive_command is executing > properly. pg_stop_backup can be canceled safely, but the database > backup will not be usable without all the WAL segments. Fixed. Attached updated version patch. Please review it. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
pg_stop_backup_on_standby_v3.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers