Michael, I left my pipeline running the TAP test until it failed, and after some time it did. I then changed the test slightly: simply adding a short sleep reproduces the same failure more reliably. Moreover, attempting to restart the standby server after the failed promotion triggers the same startup PANIC again.
Attachment: v2-recovery_tli_switch_test.pl (Perl program)
2025-12-23 20:45:54.994 +07 postmaster[446363] LOG: listening on Unix socket "/tmp/iUXKPzvUof/.s.PGSQL.20499"
2025-12-23 20:45:55.000 +07 startup[446369] LOG: database system was interrupted; last known up at 2025-12-23 20:45:52 +07
2025-12-23 20:45:56.673 +07 startup[446369] LOG: starting backup recovery with redo LSN 0/02000028, checkpoint LSN 0/02000080, on timeline ID 1
2025-12-23 20:45:56.673 +07 startup[446369] LOG: entering standby mode
2025-12-23 20:45:56.683 +07 startup[446369] LOG: redo starts at 0/02000028
2025-12-23 20:45:56.685 +07 startup[446369] LOG: completed backup recovery with redo LSN 0/02000028 and end LSN 0/02000120
2025-12-23 20:45:56.685 +07 startup[446369] LOG: consistent recovery state reached at 0/02000120
2025-12-23 20:45:56.685 +07 postmaster[446363] LOG: database system is ready to accept read-only connections
2025-12-23 20:45:56.693 +07 walreceiver[446370] LOG: fetching timeline history file for timeline 2 from primary server
2025-12-23 20:45:56.698 +07 walreceiver[446370] LOG: started streaming WAL from primary at 0/03000000 on timeline 1
2025-12-23 20:45:56.719 +07 walreceiver[446370] LOG: replication terminated by primary server
2025-12-23 20:45:56.719 +07 walreceiver[446370] DETAIL: End of WAL reached on timeline 1 at 0/030B20E8.
2025-12-23 20:45:56.749 +07 startup[446369] LOG: new target timeline is 2
2025-12-23 20:45:56.749 +07 startup[446369] LOG: invalid record length at 0/030B20E8: expected at least 24, got 0
2025-12-23 20:45:56.750 +07 walreceiver[446370] LOG: restarted WAL streaming at 0/03000000 on timeline 2
2025-12-23 20:46:01.785 +07 startup[446369] LOG: received promote request
2025-12-23 20:46:01.785 +07 walreceiver[446370] FATAL: terminating walreceiver process due to administrator command
2025-12-23 20:46:01.786 +07 startup[446369] LOG: redo done at 0/030B20C0 system usage: CPU: user: 0.02 s, system: 0.00 s, elapsed: 5.10 s
2025-12-23 20:46:01.786 +07 startup[446369] LOG: last completed transaction was at log time 2025-12-23 20:45:52.478734+07
2025-12-23 20:46:01.786 +07 startup[446369] PANIC: invalid magic number 0000 in WAL segment 000000020000000000000003, LSN 0/030B2000, offset 729088
2025-12-23 20:46:01.998 +07 postmaster[446363] LOG: startup process (PID 446369) was terminated by signal 6: Aborted
2025-12-23 20:46:01.998 +07 postmaster[446363] LOG: terminating any other active server processes
2025-12-23 20:46:01.998 +07 postmaster[446363] LOG: shutting down due to startup process failure
2025-12-23 20:46:02.000 +07 postmaster[446363] LOG: database system is shut down
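For anyone cross-checking the numbers in the PANIC line, here is a small sketch of the LSN arithmetic. It is not part of the attached test. It assumes the default 16 MB WAL segment size and the default 8 kB WAL block size (the log does not state either), and the helper names are made up for illustration. Under those assumptions, the page reported in the PANIC (0/030B2000, offset 729088 in segment 000000020000000000000003) is the page that contains 0/030B20E8, where the walreceiver reported end of WAL on timeline 1.

```python
# Assumed defaults; a cluster initialized with a different WAL segment size
# or built with a different WAL block size would need other values.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024
WAL_BLOCK_SIZE = 8192


def parse_lsn(lsn: str) -> int:
    """Convert an LSN written as 'hi/lo' (hexadecimal) to a 64-bit position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def format_lsn(pos: int) -> str:
    """Format a 64-bit position the way the server log above prints LSNs."""
    return "%X/%08X" % (pos >> 32, pos & 0xFFFFFFFF)


def locate(lsn: str, timeline: int, seg_size: int = WAL_SEGMENT_SIZE):
    """Return (segment file name, byte offset within that segment)."""
    pos = parse_lsn(lsn)
    segno = pos // seg_size
    segs_per_xlogid = 0x100000000 // seg_size
    name = "%08X%08X%08X" % (
        timeline,
        segno // segs_per_xlogid,
        segno % segs_per_xlogid,
    )
    return name, pos % seg_size


def page_start(lsn: str, block_size: int = WAL_BLOCK_SIZE) -> str:
    """Return the LSN of the first byte of the WAL page containing lsn."""
    pos = parse_lsn(lsn)
    return format_lsn(pos - pos % block_size)


if __name__ == "__main__":
    # Page named in the PANIC message, on timeline 2.
    print(locate("0/030B2000", 2))
    # Page containing the end-of-WAL position reported for timeline 1.
    print(page_start("0/030B20E8"))
```

Running it prints the segment name and offset from the PANIC line, followed by 0/030B2000 as the start of the page holding 0/030B20E8.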
