Re: BUG: Cascading standby fails to reconnect after falling back to archive recovery

Xuneng Zhou Thu, 29 Jan 2026 22:01:41 -0800

Hi Fujii-san,

Thanks for looking into this.


On Fri, Jan 30, 2026 at 11:12 AM Fujii Masao <[email protected]> wrote:
>
> On Thu, Jan 29, 2026 at 9:22 PM Xuneng Zhou <[email protected]> wrote:
> > Thanks for your report. I can reliably reproduce the issue on HEAD
> > using your scripts. I’ve analyzed the problem and am proposing a patch
> > to fix it.
> >
> > --- Analysis
> > When a cascading standby streams from an archive-only upstream:
> >
> > 1. The upstream's GetStandbyFlushRecPtr() returns only replay position
> > (no received-but-not-replayed buffer since there's no walreceiver)
> > 2. When streaming ends and the cascade falls back to archive recovery,
> > it can restore WAL segments from its own archive access
> > 3. The cascade's read position (RecPtr) advances beyond what the
> > upstream has replayed
> > 4. On reconnect, the cascade requests streaming from RecPtr, which the
> > upstream rejects as "ahead of flush position"
> >
> > --- Proposed Fix
> >
> > Track the last confirmed flush position from streaming
> > (lastStreamedFlush) and clamp the streaming start request when it
> > exceeds that position:
>
> I haven't read the patch yet, but doesn't lastStreamedFlush represent
> the same LSN as tliRecPtr or replayLSN (the arguments to
> WaitForWALToBecomeAvailable())? If so, we may not need to introduce
> a new variable to track this LSN.

I think they refer to different types of LSNs. I don’t have access to my
computer at the moment, but I’ll look into it and get back to you shortly.
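
For reference while reading the thread, the clamping idea from the quoted
"Proposed Fix" can be sketched roughly as below. This is only a standalone
illustration, not the patch itself: the XLogRecPtr typedef is a stand-in for
the PostgreSQL type, and make_lsn() and clamp_stream_start() are hypothetical
names that do not exist in the source tree.

```c
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr (a 64-bit WAL position). */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Hypothetical helper: build an LSN from the "hi/lo" halves shown in logs. */
static XLogRecPtr
make_lsn(uint32_t hi, uint32_t lo)
{
    return ((XLogRecPtr) hi << 32) | lo;
}

/*
 * Hypothetical helper: choose the LSN to request streaming from.
 *
 * recptr is the position the standby wants to read next;
 * last_streamed_flush is the last flush position confirmed while streaming
 * (the role "lastStreamedFlush" plays in the proposal above). If nothing has
 * been recorded yet, the request is returned unchanged.
 */
static XLogRecPtr
clamp_stream_start(XLogRecPtr recptr, XLogRecPtr last_streamed_flush)
{
    if (last_streamed_flush != InvalidXLogRecPtr &&
        recptr > last_streamed_flush)
        return last_streamed_flush;     /* do not ask for WAL ahead of upstream */

    return recptr;
}
```

With the values from the error message quoted below, a request at 0/04000000
against a recorded flush position of 0/03FFE8D0 would be clamped to
0/03FFE8D0. When no flush position has been recorded, the sketch leaves the
request as it is.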

> The choice of which LSN is used as the replication start point has varied
> over time to handle corner cases (for example, commit 06687198018).
> That makes me wonder whether we should first better understand
> why WaitForWALToBecomeAvailable() currently uses RecPtr as
> the starting point.
>
> BTW, with v1 patch, I was able to reproduce the issue using the following
> steps:
>
> --------------------------------------------
> initdb -D data
> mkdir arch
> cat <<EOF >> data/postgresql.conf
> archive_mode = on
> archive_command = 'cp %p ../arch/%f'
> restore_command = 'cp ../arch/%f %p'
> EOF
> pg_ctl -D data start
> pg_basebackup -D sby1 -c fast
> cp -a sby1 sby2
> cat <<EOF >> sby1/postgresql.conf
> port = 5433
> EOF
> touch sby1/standby.signal
> pg_ctl -D sby1 start
> cat <<EOF >> sby2/postgresql.conf
> port = 5434
> primary_conninfo = 'port=5433'
> EOF
> touch sby2/standby.signal
> pg_ctl -D sby2 start
> pgbench -i -s2
> pg_ctl -D sby2 restart
> --------------------------------------------
>
> In this case, after restarting the standby connecting to another
> (cascading) standby, I observed the following error.
>
> FATAL:  could not receive data from WAL stream: ERROR:  requested
> starting point 0/04000000 is ahead of the WAL flush position of this
> server 0/03FFE8D0
>
> Regards,
>
> --
> Fujii Masao


Best,
Xuneng
