Hi Sergey,

Thanks for the report and patch. I think the analysis is right, and the fix is in the right place.
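
For reference, the kind of wait being discussed looks roughly like the fragment below. This is a sketch rather than the exact hunk from the patch: it assumes the test's existing $node_primary object and uses poll_query_until() from PostgreSQL::Test::Cluster, with the count(*) spelling and comment wording I suggest further down. The die message is only illustrative.

```perl
# Wait until both standbys have connected to the primary.
# Sketch only: $node_primary is the primary node object already set up
# earlier in 004_timeline_switch; the error text is illustrative.
# poll_query_until() retries the query until it returns 't' or times out.
$node_primary->poll_query_until('postgres',
	"SELECT count(*) = 2 FROM pg_stat_replication")
  or die "timed out waiting for both standbys to connect to the primary";
```

It is meant to sit after both standbys have been started and before $node_primary->stop.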

The gap traces back to commit 7185eddf, which deliberately dropped the wait_for_catchup() and switched the primary from teardown_node() to a clean stop(), on the grounds that a clean stop flushes all WAL to both standbys before exiting. That's true, but only for standbys whose walsender is *connected* at shutdown time -- and ->start() only waits for the postmaster to accept connections, not for the standby's walreceiver to have connected back to the primary. So if a standby hasn't connected yet when the primary stops, the clean-shutdown flush skips it, and we're back to the exact "standbys received different amounts of WAL -> timeline fork on reconnect" failure that 7185eddf was meant to fix.

Polling pg_stat_replication until both walsenders are present closes that hole: it re-establishes the precondition the clean-stop design silently assumed. And connection is enough here -- the walsender shutdown path sends all WAL up to the shutdown checkpoint regardless of catchup state -- so there's no need to additionally check state = 'streaming'.

One small thing: the rest of this file uses count(*), so I'd write count(*) = 2 rather than count(1) = 2 just for local consistency. And the comment reads a little better as something like "Wait until both standbys have connected to the primary", since by this point they've already started -- what we're waiting for is the connection.

Regards,
Ewan

On Tue, Jun 16, 2026 at 4:01 PM Sergey Tatarintsev <[email protected]> wrote:
>
> Hi hackers!
>
> I found that after commit 7185eddf0522b3146ed1ff6e063e8e129e77c706 we
> got little omission
> in TAP test 004_timeline_switch:
> ...
> my $node_standby_1 = PostgreSQL::Test::Cluster->new('standby_1');
> ...
> $node_primary->stop;
>
> There is no guarantee that standby_1 and standby_2 was successfully
> connected to primary and start
> streaming before primary stopped.
>
> I think we must ensure that primary knows about standby_1 and standby_2
>
> --
> With best regards,
> Sergey Tatarintsev,
> PostgresPro
