Let me preface this by saying that I've set up warm standby instances quite a few times. I think I sort of hopefully know what I'm doing. pg_start_backup('stuff'), tar data directory, pg_stop_backup(), copy data directory to warm standby server, extract in data directory, etc.
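To spell out the sequence I mean, here is a rough sketch of those steps as a script. The data directory, host name, backup label and /tmp path are placeholders made up for illustration, not values from our setup, and the run helper only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the base-backup steps described above. PGDATA, STANDBY and LABEL
# are hypothetical placeholders, not values from the real servers.
PGDATA="${PGDATA:-/path/to/data/directory}"
STANDBY="${STANDBY:-standby.example.com}"
LABEL="${LABEL:-base_backup}"
DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

base_backup() {
    # 1. Tell the master that a base backup is starting.
    run psql -d postgres -c "SELECT pg_start_backup('$LABEL');"
    # 2. Archive the data directory while the server keeps running.
    run tar -czf /tmp/data.tar.gz -C "$PGDATA" .
    # 3. Tell the master that the base backup is finished.
    run psql -d postgres -c "SELECT pg_stop_backup();"
    # 4. Copy the archive to the standby's home directory.
    run scp /tmp/data.tar.gz "$STANDBY:"
}

base_backup
```

The archive is written outside the data directory here so that the tarball does not end up inside itself; that is a choice made for this sketch.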
We have two CentOS 5 boxes that we're trying to set up as a master -> warm standby. Both have postgres 8.4.4 installed from source. The master's postgres instance has been there for a while (a couple of months or something). I am very, very sorry if I'm missing something really simple, but I just can't seem to figure out what I'm doing wrong.

Here's the process I'm following:

==master==
$ psql
postgres=# select pg_start_backup('<today's date>')
postgres=# \q
$ cd /path/to/data/directory
$ tar cvzf data.tar.gz *
$ scp data.tar.gz <server>:~/

==warm standby==
$ cd /path/to/data/directory
$ tar xvf ~/data.tar.gz
<create recovery.conf file with restore_command line and modify postgresql.conf to disable wal archiving>
$ pg_ctl -D /path/to/data/directory start

Here's the output after trying to start the backup instance with pg_ctl (ignoring the line about postmaster.pid already existing):

server starting
FATAL: incorrect checksum in control file

Here's the output from pg_controldata:

$ pg_controldata `pwd`
WARNING: Calculated CRC checksum does not match value stored in file.
Either the file is corrupt, or it has a different layout than this program
is expecting. The results below are untrustworthy.
pg_control version number:            843
Catalog version number:               200904091
Database system identifier:           5473004134245625319
Database cluster state:               in production
pg_control last modified:             Wed 31 Dec 1969 05:00:00 PM MST
Latest checkpoint location:           B000020/0
Prior checkpoint location:            A000020/0
Latest checkpoint's REDO location:    B000020/1
Latest checkpoint's TimeLineID:       0
Latest checkpoint's NextXID:          57905/32791
Latest checkpoint's NextOID:          1
Latest checkpoint's NextMultiXactId:  0
Latest checkpoint's NextMultiOffset:  1282256808
Time of latest checkpoint:            Wed 31 Dec 1969 05:00:00 PM MST
Minimum recovery ending location:     0/4
Maximum data alignment:               0
Database block size:                  8192
Blocks per segment of large relation: 16777216
WAL block size:                       64
Bytes per WAL segment:                32
Maximum length of identifiers:        2000
Maximum columns in an index:          257
Maximum size of a TOAST chunk:        513657607
Date/time type storage:               floating-point numbers
Float4 argument passing:              by reference
Float8 argument passing:              by reference

Those timestamps are at the unix epoch - Jan 1 1970 ... what in the spinning marble?!

Yeah. I'm confused, my boss is confused... We're currently running a yum -y update on those boxes, but we'd still like to know what's going on, even if a full update fixes everything.

Any clues?