Hi guys. I’m in the process of migrating a PG 9.2 cluster to PG 14.

There are a lot of differences in the configuration files between PG 9.2 and PG 
14, and I have a question that I hope you’ll be able to help me with.

My servers are deployed in AWS on EC2 instances. I use /pgsql to store PG 
data and /data to store PG logs, WAL files, etc. My /pgsql/14/main/pg_wal 
directory is a symlink to /data/postgresql/pg_xlogs (I did this to minimize 
I/O on the /pgsql EBS volume).

The restore command in postgresql.conf is restore_command = 'cp 
/data/wal_archive/%f %p'. /data/wal_archive is the directory the master 
ships the WAL files to.
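
To make sure I am reading the placeholders correctly, here is a minimal Python sketch of how I understand the substitution in restore_command: %f becomes the name of the file to fetch from the archive, %p the destination path (relative to the data directory), and %% a literal percent sign. The function name expand_restore_command and the destination pg_wal/RECOVERYXLOG are illustrative assumptions on my part, not PostgreSQL's actual code.

```python
def expand_restore_command(template: str, wal_file: str, dest_path: str) -> str:
    """Expand %f, %p and %% in a restore_command template (illustrative only)."""
    replacements = {"f": wal_file, "p": dest_path, "%": "%"}
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        # A '%' followed by a known letter is a placeholder; anything else is copied as-is.
        if ch == "%" and i + 1 < len(template) and template[i + 1] in replacements:
            out.append(replacements[template[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# The segment name is taken from my log output; the destination path is an assumed example.
cmd = expand_restore_command(
    "cp /data/wal_archive/%f %p",
    "00000001000006FB000000DA",
    "pg_wal/RECOVERYXLOG",
)
print(cmd)
```

If that reading is right, the command only ever copies from /data/wal_archive into a path under pg_wal, so the symlink should not matter to cp itself.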

—————————————————————————————————————————————————————————————————

When I deployed the slave instance and started the recovery process, I got 
these messages:

2021-12-11 02:11:52 UTC [22700]: [3-1] user=,db=,app=,client= LOG:  starting 
PostgreSQL 14.1 (Ubuntu 14.1-2.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by 
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
2021-12-11 02:11:52 UTC [22700]: [4-1] user=,db=,app=,client= LOG:  listening 
on IPv4 address "0.0.0.0", port 5432
2021-12-11 02:11:52 UTC [22700]: [5-1] user=,db=,app=,client= LOG:  listening 
on IPv6 address "::", port 5432
2021-12-11 02:11:52 UTC [22700]: [6-1] user=,db=,app=,client= LOG:  listening 
on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2021-12-11 02:11:52 UTC [22702]: [1-1] user=,db=,app=,client= LOG:  database 
system was interrupted; last known up at 2021-12-10 14:57:44 UTC
2021-12-11 02:11:52 UTC [22702]: [2-1] user=,db=,app=,client= LOG:  creating 
missing WAL directory "pg_wal/archive_status"
cp: cannot stat '/data/wal_archive/00000002.history': No such file or directory
2021-12-11 02:11:52 UTC [22702]: [3-1] user=,db=,app=,client= LOG:  entering 
standby mode
2021-12-11 02:11:52 UTC [22702]: [4-1] user=,db=,app=,client= LOG:  invalid 
primary checkpoint record
2021-12-11 02:11:52 UTC [22702]: [5-1] user=,db=,app=,client= PANIC:  could not 
locate a valid checkpoint record

—————————————————————————————————————————————————————————————————

However, the WAL files were present in the /data/wal_archive/ directory.

When I moved the same WAL files to /pgsql/14/main/pg_wal/, it started working:

2021-12-11 02:15:35 UTC [23103]: [3-1] user=,db=,app=,client= LOG:  starting 
PostgreSQL 14.1 (Ubuntu 14.1-2.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by 
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit
2021-12-11 02:15:35 UTC [23103]: [4-1] user=,db=,app=,client= LOG:  listening 
on IPv4 address "0.0.0.0", port 5432
2021-12-11 02:15:35 UTC [23103]: [5-1] user=,db=,app=,client= LOG:  listening 
on IPv6 address "::", port 5432
2021-12-11 02:15:35 UTC [23103]: [6-1] user=,db=,app=,client= LOG:  listening 
on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2021-12-11 02:15:35 UTC [23105]: [1-1] user=,db=,app=,client= LOG:  database 
system was interrupted; last known up at 2021-12-10 14:57:44 UTC
cp: cannot stat '/data/wal_archive/00000002.history': No such file or directory
2021-12-11 02:15:35 UTC [23105]: [2-1] user=,db=,app=,client= LOG:  entering 
standby mode
2021-12-11 02:15:35 UTC [23105]: [3-1] user=,db=,app=,client= LOG:  database 
system was not properly shut down; automatic recovery in progress
2021-12-11 02:15:35 UTC [23105]: [4-1] user=,db=,app=,client= LOG:  redo starts 
at 6FB/D9000028
2021-12-11 02:15:35 UTC [23105]: [5-1] user=,db=,app=,client= LOG:  consistent 
recovery state reached at 6FB/DA000000
2021-12-11 02:15:35 UTC [23103]: [7-1] user=,db=,app=,client= LOG:  database 
system is ready to accept read-only connections
cp: cannot stat '/data/wal_archive/00000001000006FB000000DA': No such file or 
directory
2021-12-11 02:15:35 UTC [23113]: [1-1] user=,db=,app=,client= LOG:  started 
streaming WAL from primary at 6FB/DA000000 on timeline 1
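
To double-check which segment files recovery needs, I worked out the file names from the LSNs in the log above. This is a rough Python sketch that assumes the default 16 MB WAL segment size (worth confirming with SHOW wal_segment_size); the helper name wal_segment_name is my own.

```python
import os

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default segment size


def wal_segment_name(timeline: int, lsn: str, segment_size: int = WAL_SEGMENT_SIZE) -> str:
    """Return the WAL segment file name that contains the given LSN (e.g. '6FB/D9000028')."""
    high_hex, low_hex = lsn.split("/")
    position = (int(high_hex, 16) << 32) | int(low_hex, 16)
    segment_number = position // segment_size
    segments_per_xlogid = 0x100000000 // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segment_number // segments_per_xlogid,
        segment_number % segments_per_xlogid,
    )


# LSNs and timeline taken from the log above; directories are the ones from my setup.
for lsn in ("6FB/D9000028", "6FB/DA000000"):
    name = wal_segment_name(1, lsn)
    for directory in ("/data/wal_archive", "/pgsql/14/main/pg_wal"):
        present = os.path.exists(os.path.join(directory, name))
        print(f"{lsn} -> {name} in {directory}: {present}")
```

The second name matches the file that cp complains about in the log (00000001000006FB000000DA), so the mapping looks consistent with what the server is asking for.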

—————————————————————————————————————————————————————————————————

Why? Why is PG looking for the WAL files in the “wrong” directory? What am I 
missing here?

Thanks in advance.
Lucas
