On Mon, Jul 10, 2017 at 11:56 AM, Michael Paquier
<michael.paqu...@gmail.com> wrote:
> On Sat, Jul 8, 2017 at 12:50 AM, Masahiko Sawada <sawada.m...@gmail.com> 
> wrote:
>> On Fri, Jul 7, 2017 at 3:48 PM, Michael Paquier
>> <michael.paqu...@gmail.com> wrote:
>>> So I would suggest the following things to address this issue:
>>> 1) Generate a backup history file for backups taken at recovery as well.
>>> 2) Archive it if archive_mode = always.
>>> 3) Wait for both the segment of the stop point and the backup history
>>> files to be archived before returning back to the caller of
>>> pg_stop_backup.
>>> It would be nice to get all that addressed in 10. Thoughts?
>> Yeah, I agree. Attached patch makes it works and deals with the
>> history file issue.
> I had a look at the proposed patch. Here are some comments.

Thank you for reviewing the patch!

> @@ -11002,10 +11000,10 @@ do_pg_stop_backup(char *labelfile, bool
> waitforarchive, TimeLineID *stoptli_p)
>     if (waitforarchive && XLogArchivingActive())
>     {
>         XLByteToPrevSeg(stoppoint, _logSegNo);
> -       XLogFileName(lastxlogfilename, ThisTimeLineID, _logSegNo);
> +       XLogFileName(lastxlogfilename, stoptli, _logSegNo);
> On a standby the wait phase should not happen if archive_mode = on,
> but only if archive_mode = always. So I would suggest to change this
> upper condition a bit, and shuffle a bit the code to make the wait
> phase happen last:
> 1) stoptli_p first.
> 2) Check for XLogArchivingActive or XLogArchivingAlways, then with the
> NOTICE message.
> 3) Do the actual wait.

Thank you for the suggestion. Fixed.

> This way the code doing the wait does not need to be in its long
> lengthy if() branch. I think that we should replace the pg_usleep()
> call with a latch to make this more responsive. That should be a
> future patch.

Yeah, I agree.

> In backup history files generated in standbys, the STOP TIME is not
> set and this results in garbage in the file.


> +    If second parameter is true and on standby, <function>pg_stop_backup</>
> +    waits for WAL to be archived without forcibly switching WAL on standby.
> +    So enforcing manually a WAL switch on primary needs to happen.
> Here is a reformulation:
> If the second parameter wait_for_archive is true and the backup is
> taken on a standby, pg_stop_backup waits for WAL to be archived when
> archive_mode = always. Enforcing manually a WAL segment switch to
> happen with for example pg_switch_wal() may be necessary if the
> primary has low activity to allow the backup to complete. Using
> statement_timeout to limit the amount of time to wait or switching
> wait_for_archive to false will control the wait time, though all the
> WAL segments necessary to recover into a consistent state from the
> backup taken may not be archived at the time pg_stop_backup returns
> its status to the caller.

Thanks, fixed.

> The errhint() for the wait phase should be reworked for a standby. I
> would suggest for errmsg() the same thing, aka:
> pg_stop_backup still waiting for all required WAL segments to be
> archived (%d seconds elapsed)
> But the following for a backup started in recovery. That's long but
> things need to be really clear to the user:
> Backup has been taken from a standby, check if the WAL segments needed
> for this backup have been completed, in which case a forced segment
> switch may can be needed on the primary. If a segment switch has
> already happened check that your archive_command is executing
> properly. pg_stop_backup can be canceled safely, but the database
> backup will not be usable without all the WAL segments.


Attached updated version patch. Please review it.


Masahiko Sawada
NTT Open Source Software Center

Attachment: pg_stop_backup_on_standby_v3.patch
Description: Binary data

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to