Re: [HACKERS] WIP: Failover Slots

Craig Ringer Tue, 25 Jul 2017 17:45:12 -0700

On 26 July 2017 at 00:16, Thom Brown <[email protected]> wrote:

> On 8 April 2016 at 07:13, Craig Ringer <[email protected]> wrote:
> > On 6 April 2016 at 22:17, Andres Freund <[email protected]> wrote:
> >
> >>
> >> Quickly skimming 0001 in [4] there appear to be a number of issues:
> >> * LWLockHeldByMe() is only for debugging, not functional differences
> >> * ReplicationSlotPersistentData is now in an xlog related header
> >> * The code and behaviour around name conflicts of slots seems pretty
> >>   raw, and not discussed
> >> * Taking spinlocks dependant on InRecovery() seems like a seriously bad
> >>   idea
> >> * I doubt that the archive based switches around StartupReplicationSlots
> >>   do what they intend. Afaics that'll not work correctly for basebackups
> >>   taken with -X, without recovery.conf
> >>
> >
> > Thanks for looking at it. Most of those are my errors. I think this is
> > pretty dead at least for 9.6, so I'm mostly following up in the hopes of
> > learning about a couple of those mistakes.
> >
> > Good catch with -X without a recovery.conf. Since it wouldn't be
> recognised
> > as a promotion and wouldn't increment the timeline, copied non-failover
> > slots wouldn't get removed. I've never liked that logic at all anyway, I
> > just couldn't think of anything better...
> >
> > LWLockHeldByMe() has a comment to the effect of: "This is meant as debug
> > support only." So that's just a dumb mistake on my part, and I should've
> > added "alreadyLocked" parameters. (Ugly, but works).
> >
> > But why would it be a bad idea to conditionally take a code path that
> > acquires a spinlock based on whether RecoveryInProgress()? It's not
> testing
> > RecoveryInProgress() more than once and doing the acquire and release
> based
> > on separate tests, which would be a problem. I don't really get the
> problem
> > with:
> >
> > if (!RecoveryInProgress())
> > {
> > /* first check whether there's something to write out */
> > SpinLockAcquire(&slot->mutex);
> > was_dirty = slot->dirty;
> > slot->just_dirtied = false;
> > SpinLockRelease(&slot->mutex);
> >
> >         /* and don't do anything if there's nothing to write */
> >         if (!was_dirty)
> >             return;
> > }
> >
> > ... though I think what I really should've done there is just always
> dirty
> > the slot in the redo functions.
>
> Are there any plans to submit a new design/version for v11?



No. The whole approach seems to have been bounced from core. I don't agree
and continue to think this functionality is desirable but I don't get to
make that call.

If time permits I will attempt to update the logical decoding on standby
patchset instead, and if possible add support for fast-forward logical
decoding that does the minimum required to correctly maintain a slot's
catalog_xmin and restart_lsn when advanced. But this won't be usable
directly for failover like failover slots are, it'll require each
application to keep track of standbys and maintain slots on them too. Or
we'll need some kind of extension/helper to sync slot state.

In the mean time, 2ndQuadrant maintains an on-disk-compatible version of
failover slots that's available for support customers.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] WIP: Failover Slots

Reply via email to