On 21 September 2017 at 16:56, Michael Paquier <michael.paqu...@gmail.com>
wrote:

> On Thu, Sep 21, 2017 at 4:40 PM, Masahiko Sawada <sawada.m...@gmail.com>
> wrote:
> > On Thu, Sep 21, 2017 at 2:25 PM, Michael Paquier
> > <michael.paqu...@gmail.com> wrote:
> >> On Thu, Sep 21, 2017 at 1:07 AM, Masahiko Sawada <sawada.m...@gmail.com>
> wrote:
> >>> The bug can happen in PostgreSQL 9.1 or higher that non-exclusive
> >>> backup has been introduced, so we should back-patch to the all
> >>> supported versions.
> >>
> >> There is a typo here right? Non-exclusive backups have been introduced
> >> in 9.6. Why would a back-patch further down be needed?
> >
> > I think the non-exclusive backups infrastructure has been introduced
> > in 9.1 for pg_basebackup. I've not checked yet that it can be
> > reproduced using pg_basebackup in PG9.1 but I thought it could happen
> > as far as I checked the code.
>
> Yep, but the deficiency is caused by the use before_shmem_exit() in
> the SQL-level functions present in 9.6~ which make the cleanup happen
> potentially twice. If you look at the code of basebackup.c, you would
> notice that the cleanups of the counters only happen within the
> PG_ENSURE_ERROR_CLEANUP() blocks, so a backpatch to 9.6 should be
> enough.
>
> >> +-  Assert(XLogCtl->Insert.nonExclusiveBackups >= 0);
> >> +   if (XLogCtl->Insert.nonExclusiveBackups > 0)
> >> +       XLogCtl->Insert.nonExclusiveBackups--;
> >> Hm, no, I don't agree. I think that instead you should just leave
> >> do_pg_abort_backup() immediately if sessionBackupState is set to
> >> SESSION_BACKUP_NONE. This variable is the link between the global
> >> counters and the session stopping the backup so I don't think that we
> >> should touch this assertion of this counter. I think that this method
> >> would be safe as well for backup start as pg_start_backup_callback
> >> takes care of any cleanup. Also because the counters are incremented
> >> before entering in the PG_ENSURE_ERROR_CLEANUP block, and
> >> sessionBackupState is updated just after leaving the block.
> >
> > I think that the assertion failure still can happen if the process
> > aborts after decremented the counter and before setting to
> > SESSION_BACKUP_NONE. Am I missing something?
>
> The process would stop at the next CHECK_FOR_INTERRUPTS() and trigger
> the cleanup at this moment. So this happens when waiting for the
> archives to be done, and the session flag is set to NONE at this
> point.
>

Another one to watch out for is that elog(...) and ereport(...) invoke
CHECK_FOR_INTERRUPTS. That's given me exciting surprises before when
combined with assertion checking and various exit cleanup hooks.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers



-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Reply via email to