On Mon, Jul 10, 2017 at 11:56 AM, Michael Paquier <[email protected]> wrote: > On Sat, Jul 8, 2017 at 12:50 AM, Masahiko Sawada <[email protected]> > wrote: >> On Fri, Jul 7, 2017 at 3:48 PM, Michael Paquier >> <[email protected]> wrote: >>> So I would suggest the following things to address this issue: >>> 1) Generate a backup history file for backups taken at recovery as well. >>> 2) Archive it if archive_mode = always. >>> 3) Wait for both the segment of the stop point and the backup history >>> files to be archived before returning back to the caller of >>> pg_stop_backup. >>> >>> It would be nice to get all that addressed in 10. Thoughts? >> >> Yeah, I agree. Attached patch makes it works and deals with the >> history file issue. > > I had a look at the proposed patch. Here are some comments.
Thank you for reviewing the patch!
>
> @@ -11002,10 +11000,10 @@ do_pg_stop_backup(char *labelfile, bool
> waitforarchive, TimeLineID *stoptli_p)
> if (waitforarchive && XLogArchivingActive())
> {
> XLByteToPrevSeg(stoppoint, _logSegNo);
> - XLogFileName(lastxlogfilename, ThisTimeLineID, _logSegNo);
> + XLogFileName(lastxlogfilename, stoptli, _logSegNo);
>
> On a standby the wait phase should not happen if archive_mode = on,
> but only if archive_mode = always. So I would suggest to change this
> upper condition a bit, and shuffle a bit the code to make the wait
> phase happen last:
> 1) stoptli_p first.
> 2) Check for XLogArchivingActive or XLogArchivingAlways, then with the
> NOTICE message.
> 3) Do the actual wait.
Thank you for the suggestion. Fixed.
> This way the code doing the wait does not need to be in its long
> lengthy if() branch. I think that we should replace the pg_usleep()
> call with a latch to make this more responsive. That should be a
> future patch.
Yeah, I agree.
> In backup history files generated in standbys, the STOP TIME is not
> set and this results in garbage in the file.
Fixed.
> + If second parameter is true and on standby, <function>pg_stop_backup</>
> + waits for WAL to be archived without forcibly switching WAL on standby.
> + So enforcing manually a WAL switch on primary needs to happen.
> Here is a reformulation:
> If the second parameter wait_for_archive is true and the backup is
> taken on a standby, pg_stop_backup waits for WAL to be archived when
> archive_mode = always. Enforcing manually a WAL segment switch to
> happen with for example pg_switch_wal() may be necessary if the
> primary has low activity to allow the backup to complete. Using
> statement_timeout to limit the amount of time to wait or switching
> wait_for_archive to false will control the wait time, though all the
> WAL segments necessary to recover into a consistent state from the
> backup taken may not be archived at the time pg_stop_backup returns
> its status to the caller.
Thanks, fixed.
> The errhint() for the wait phase should be reworked for a standby. I
> would suggest for errmsg() the same thing, aka:
> pg_stop_backup still waiting for all required WAL segments to be
> archived (%d seconds elapsed)
> But the following for a backup started in recovery. That's long but
> things need to be really clear to the user:
> Backup has been taken from a standby, check if the WAL segments needed
> for this backup have been completed, in which case a forced segment
> switch may can be needed on the primary. If a segment switch has
> already happened check that your archive_command is executing
> properly. pg_stop_backup can be canceled safely, but the database
> backup will not be usable without all the WAL segments.
Fixed.
Attached updated version patch. Please review it.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
pg_stop_backup_on_standby_v3.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
