On 3 May 2016 at 22:03, Craig Ringer <cr...@2ndquadrant.com> wrote:

> Hi all
> There's a bug (mine) in logical decoding timeline following where reading
> the first page from the segment containing a timeline switch fails to read
> from the most recent timeline in that segment. This is harmless if the old
> timeline's copy of the segment is present - but if it's been renamed to
> .partial, deleted or never copied over to a replica then decoding will
> complain that the required segment has already been removed. Just like
> without timeline following.
> The underlying problem is that timeline calculations used the record's
> start pointer and didn't properly consider continuations; they were
> record-based, not page-based like they should be.
> A corrected and handily much, much simpler patch is attached. The logic
> for finding the last timeline on a segment was massively more complex than
> it needed to be, and that wasn't the only thing.
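
As a rough illustration of the record-based vs. page-based distinction
described above, here is a minimal sketch. It is not the attached patch.
The names timeline_at and timeline_for_page, the (timeline, start LSN)
history layout, and the 16MB segment size are all assumptions made for
the example. The idea shown is that a page should be read from the
timeline that owns the end of its segment, not the timeline that owns
the page's own start position.

```python
from bisect import bisect_right

# Assumed for illustration: the default 16MB WAL segment size.
SEGMENT_SIZE = 16 * 1024 * 1024


def timeline_at(history, lsn):
    """Return the timeline that owns byte position lsn.

    history is a hypothetical list of (timeline_id, start_lsn) pairs,
    oldest first, where start_lsn is where that timeline begins.
    """
    starts = [start for _, start in history]
    idx = bisect_right(starts, lsn) - 1
    if idx < 0:
        raise ValueError("lsn precedes the first known timeline")
    return history[idx][0]


def timeline_for_page(history, page_lsn, segment_size=SEGMENT_SIZE):
    """Pick the timeline whose copy of the segment should be read.

    The newest timeline present in a segment also carries the older
    timeline's data from before the switch, so use the timeline that
    owns the last byte of the segment containing page_lsn.
    """
    segment_end = (page_lsn // segment_size + 1) * segment_size - 1
    return timeline_at(history, segment_end)


# Timeline 2 forks from timeline 1 shortly after the start of a segment.
history = [(1, 0x0), (2, 0x3000100)]
first_page = 0x3000000  # first page of the segment containing the switch

naive_choice = timeline_at(history, first_page)        # position-based
page_choice = timeline_for_page(history, first_page)   # segment-based
```

With this history, the position-based lookup selects timeline 1 for the
first page of the segment containing the switch, while the segment-based
lookup selects timeline 2, whose copy of the segment is the one that
remains available after the old one is renamed or removed.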

For the record, the patch that this fixes was reverted, as agreed in
http://www.postgresql.org/message-id/20160503165812.GA29604@alvherre.pgsql .

I will submit this patch to 9.7 along with the improvements to
pg_recvlogical and expanded test suite.

I then expect to follow on with work to clean up the use of globals to pass
timeline info through xlogreader to the read-page callbacks, and hopefully
the hs protocol changes etc. required for the improved slot failover
support mechanism that Petr, Andres and I discussed.

This patch as attached won't apply anymore, but it's trivial to apply it on
top of a cherry-picked copy of the reverted feature patch for testing or
further development.

 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
