Hi, On 2022-02-16 20:22:23 -0500, Tom Lane wrote: > There's no disconnection log entry for either, which I suppose means > that somebody didn't bother logging disconnection for walsenders ...
The thing is, we actually *do* log disconnection for walsenders: 2022-02-16 17:53:55.780 PST [2459806][walsender][:0] LOG: disconnection: session time: 0:00:01.321 user=andres database= host=[local] The only way I see for the disconnection not to be logged is an immediate shutdown. It'd be one thing if the missing walsender disconnect wasn't logged because we shut down just after. But unless I misunderstand something, between the basebackup and the failing test, we actually start standby_3: # Running: pg_basebackup -D /home/bf/build/buildfarm-serinus/HEAD/pgsql.build/src/test/recovery/tmp_check/t_019_replslot_limit_primary3_data/backup/my_backup -h /tmp/hyrtnyd2RU -p 53774 --checkpoint fast --no-sync # Backup finished ... # Running: pg_ctl -w -D /home/bf/build/buildfarm-serinus/HEAD/pgsql.build/src/test/recovery/tmp_check/t_019_replslot_limit_standby_3_data/pgdata -l /home/bf/build/buildfarm-serinus/HEAD/pgsql.build/src/test/recovery/tmp_check/log/019_replslot_limit_standby_3.log -o --cluster-name=standby_3 start and also wait till replication has caught up: Waiting for replication conn standby_3's replay_lsn to pass 0/700000 on primary3 done and only then we have not ok 17 - have walsender pid 3682154 # 3682136 # Failed test 'have walsender pid 3682154 # 3682136' # at t/019_replslot_limit.pl line 335. # '3682154 # 3682136' # doesn't match '(?^:^[0-9]+$)' Starting a node in recovery and having it connect to the primary seems like a mighty long time for a process to exit, unless it's stuck behind something. Greetings, Andres Freund