Hi Team,

I found an issue with a PostgreSQL 16.9 Patroni setup: replication to our
standby node and to our disaster recovery (DR) site broke with the error below.
It looks like WAL corruption, and the affected WAL was later archived as well.


CONTEXT:  WAL redo at 184F3/F248B6F0 for Heap/LOCK: xmax: 2818115117, off:35, infobits: [LOCK_ONLY, EXCL_LOCK], flags: 0x00; blkref #0: rel 1663/33195/410203483, blk 25329
PANIC:  WAL contains references to invalid pages
CONTEXT:  WAL redo at 184F3/F248B6F0 for Heap/LOCK: xmax: 2818115117, off:35, infobits: [LOCK_ONLY, EXCL_LOCK], flags: 0x00; blkref #0: rel 1663/33195/410203483, blk 25329
WARNING:  page 25329 of relation base/33195/410203483 does not exist
INFO: no action. I am (pg-patroni-node1-0), a secondary, and following a leader (pg-patroni-node2-0)
[61]LOG:  terminating any other active server processes
[61]LOG:  startup process (PID 72) was terminated by signal 6: Aborted
[61]LOG:  shutting down due to startup process failure
[61]LOG:  database system is shut down
INFO: establishing a new patroni heartbeat connection to postgres
INFO: Lock owner: pg-patroni-node2-0; I am pg-patroni-node1-0
WARNING: Retry got exception: connection problems
WARNING: Failed to determine PostgreSQL state from the connection, falling back to cached role
INFO: Error communicating with PostgreSQL. Will try again later
WARNING: Postgresql is not running.


The primary database was not impacted, but replication to the standby node and
the DR site broke. I tried to reinitialize from the latest pgBackRest backup
plus archived WAL, but that fails with the same error as soon as the corrupt
WAL/archive file is replayed. I had to reinitialize with pg_basebackup instead,
which took about 45 hours for this 40 TB database.

As I understand it, a transaction created a table, performed DML on it, and
then dropped the table (or the transaction may have been rolled back). That
appears to have caused a race condition when the WAL was generated, and replay
of that WAL then failed on the standby and the DR site.
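
One way to check this would be to see whether the relfilenode named in the
redo error still maps to a relation on the primary. Below is a minimal Python
sketch, not something I have run against this cluster. The helper names
parse_blkref and lookup_sql are made up for illustration, and it assumes the
"rel" triple in the CONTEXT line is tablespace OID / database OID / relfilenode.
It only builds the query text; pg_filenode_relation() returns NULL when no
relation is associated with the given values, which would be consistent with
the table having been dropped.

```python
import re

# Matches the block reference in a WAL redo CONTEXT line, for example:
#   "blkref #0: rel 1663/33195/410203483, blk 25329"
# \s* tolerates a missing space after "rel" (seen in wrapped log output).
BLKREF_RE = re.compile(r"rel\s*(\d+)/(\d+)/(\d+),\s*blk\s+(\d+)")


def parse_blkref(line):
    """Return tablespace/database/relfilenode/block from a redo CONTEXT line,
    or None if the line carries no block reference."""
    match = BLKREF_RE.search(line)
    if match is None:
        return None
    tablespace, database, relfilenode, block = (int(g) for g in match.groups())
    return {
        "tablespace": tablespace,
        "database": database,
        "relfilenode": relfilenode,
        "block": block,
    }


def lookup_sql(ref):
    """Build a query that maps the relfilenode back to a relation name.
    It must be run while connected to the database whose OID is ref['database']."""
    return (
        "SELECT pg_filenode_relation({tablespace}, {relfilenode}) AS relation;"
    ).format(**ref)


if __name__ == "__main__":
    context = (
        "CONTEXT:  WAL redo at 184F3/F248B6F0 for Heap/LOCK: ... "
        "blkref #0: rel 1663/33195/410203483, blk 25329"
    )
    ref = parse_blkref(context)
    print(ref)
    print(lookup_sql(ref))
```

The database name for OID 33195 can be found with
"SELECT datname FROM pg_database WHERE oid = 33195;" before running the
generated query on the primary.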

This looks like a bug. Any suggestions for this scenario?

Thanks & Regards,
Ishan Joshi
