If you look at the buildfarm's failures page and filter down to just subscriptionCheck failures, what you find is that all of the last 6 such failures are in 031_column_list.pl:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tamandua&dt=2024-02-02%2019%3A33%3A16 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2024-02-02%2011%3A21%3A44 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2024-02-01%2020%3A34%3A29 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2024-02-01%2016%3A57%3A14 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel&dt=2024-01-31%2022%3A18%3A24 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2024-01-30%2011%3A29%3A23 There are some further back too: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon&dt=2023-11-17%2018%3A28%3A24 but this definitely got way more common in the last few days. Digging down into the logs, these all look pretty similar. Somehow things get into a state where replication connections fail with the publisher reporting "publication does not exist": 2024-02-02 19:42:23.187 UTC [1631708][not initialized][:0] LOG: connection received: host=[local] 2024-02-02 19:42:23.189 UTC [1631708][walsender][4/287:0] LOG: connection authenticated: user="bf" method=trust (/home/bf/bf-build/tamandua/HEAD/pgsql.build/testrun/subscription/031_column_list/data/t_031_column_list_publisher_data/pgdata/pg_hba.conf:117) 2024-02-02 19:42:23.189 UTC [1631708][walsender][4/287:0] LOG: replication connection authorized: user=bf application_name=sub1 2024-02-02 19:42:23.214 UTC [1631708][walsender][4/288:0] LOG: statement: SELECT pg_catalog.set_config('search_path', '', false); 2024-02-02 19:42:23.226 UTC [1631708][walsender][4/0:0] LOG: received replication command: IDENTIFY_SYSTEM 2024-02-02 19:42:23.226 UTC [1631708][walsender][4/0:0] STATEMENT: IDENTIFY_SYSTEM 2024-02-02 19:42:23.226 UTC [1631708][walsender][4/0:0] LOG: received replication command: START_REPLICATION SLOT "sub1" LOGICAL 0/15BCDD0 (proto_version '4', origin 'any', publication_names '"pub7"') 2024-02-02 19:42:23.226 UTC [1631708][walsender][4/0:0] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0/15BCDD0 (proto_version '4', origin 'any', publication_names '"pub7"') 2024-02-02 19:42:23.226 UTC [1631708][walsender][4/0:0] LOG: acquired logical replication slot "sub1" 2024-02-02 19:42:23.226 UTC [1631708][walsender][4/0:0] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0/15BCDD0 (proto_version '4', origin 'any', publication_names '"pub7"') 2024-02-02 19:42:23.242 UTC [1631708][walsender][4/0:0] LOG: starting logical decoding for slot "sub1" 2024-02-02 19:42:23.242 UTC [1631708][walsender][4/0:0] DETAIL: Streaming transactions committing after 0/15BCDD0, reading WAL from 0/15BCDD0. 2024-02-02 19:42:23.242 UTC [1631708][walsender][4/0:0] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0/15BCDD0 (proto_version '4', origin 'any', publication_names '"pub7"') 2024-02-02 19:42:23.243 UTC [1631708][walsender][4/0:0] LOG: logical decoding found consistent point at 0/15BCDD0 2024-02-02 19:42:23.243 UTC [1631708][walsender][4/0:0] DETAIL: There are no running transactions. 2024-02-02 19:42:23.243 UTC [1631708][walsender][4/0:0] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0/15BCDD0 (proto_version '4', origin 'any', publication_names '"pub7"') 2024-02-02 19:42:23.244 UTC [1631708][walsender][4/0:0] ERROR: publication "pub7" does not exist 2024-02-02 19:42:23.244 UTC [1631708][walsender][4/0:0] CONTEXT: slot "sub1", output plugin "pgoutput", in the change callback, associated LSN 0/15C7698 2024-02-02 19:42:23.244 UTC [1631708][walsender][4/0:0] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0/15BCDD0 (proto_version '4', origin 'any', publication_names '"pub7"') 2024-02-02 19:42:23.244 UTC [1631708][walsender][4/0:0] LOG: released logical replication slot "sub1" 2024-02-02 19:42:23.834 UTC [1631708][walsender][:0] LOG: disconnection: session time: 0:00:00.647 user=bf database=postgres host=[local] and then we just repeat that until the test times out. It fails at different points in the test script (hence, different publication names), but the pattern looks about the same. I don't see anything that 031_column_list.pl is doing that is much different from other subscription tests, so why is it the only one failing? And more to the point, what's going wrong exactly? I am suspicious that this somehow represents a failure of the historical catalog decoding logic, but I don't see how that theory explains this only breaking in one test script. regards, tom lane