On Thu, Mar 5, 2015 at 10:55:28AM +0300, Vladimir Borodin wrote: > You are correct that a pg_controldata file is copied over that has > wal_level=minimal, but that should not be a problem. > > > I suppose, this is the root cause of why replica does not start as hot > standby. > It it enough to start it as warm standby, but not hot standby. > See CheckRequiredParameterValues function in xlog.c which is called inside of > StartupXLOG function.
Yes, you are correct. I spent all day building a test harness so I could automate this setup and test various failures. I was able to reproduce your failure, and you are correct that the proper fix is to set wal_level=hot_standby on the new master, and then start and stop the new cluster just before rsync. The root cause is that pg_upgrade calls pg_resetxlog -o on the new cluster _after_ the new cluster stopped for the final time, so rsync is copying the incorrect pg_controldata wal_level value. Also, even if pg_resetxlog preserved wal_level in the control file, there is no guarantee that the user configured the new cluster's wal_level for hot_standby anyway. What I have done is to update the pg_upgrade instructions to add this required step. Updated doc patch attached. (I also added the --delete flag to rsync.) Thanks so much for your detailed report. > But it could not be done with --size-only key, because control-file is > of fixed > size and rsync would skip it. Or may be everything should be copied > with > --size-only and control-file should be copied without this option. > > > Well, what happens is that there is no _new_ standby pg_controldata > file, so it is copied fully from the new master. Are you running initdb > to create the new standby --- you shouldn't be doing that as the rsync > will do that for you. > > > No, I don’t. The scenario of the problem with copying control-file was in case > when I: > 1. ran pg_upgrade on master and got control-file with "wal_level = minimal", > 2. did rsync --size-only to replica (and it got this control-file with > "wal_level = minimal"), > 3. started and stopped postgres on master to get «good» control-file with > "wal_level = hot_standby», > 4. did rsync --size-only to replica one more time. And this time control-file > is not copied because of the same size of control-file. > > Actually, if you don’t do step 2, everything works as expected. Sorry for > bothering you. Ah, yes, I think doing rsync twice is never a good suggestion. It can lead to too many failures. Doing the start/stop before rsync seems like the best solution. -- Bruce Momjian <br...@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml new file mode 100644 index 07ca0dc..e25e0d0 *** a/doc/src/sgml/backup.sgml --- b/doc/src/sgml/backup.sgml *************** tar -cf backup.tar /usr/local/pgsql/data *** 438,445 **** Another option is to use <application>rsync</> to perform a file system backup. This is done by first running <application>rsync</> while the database server is running, then shutting down the database ! server just long enough to do a second <application>rsync</>. The ! second <application>rsync</> will be much quicker than the first, because it has relatively little data to transfer, and the end result will be consistent because the server was down. This method allows a file system backup to be performed with minimal downtime. --- 438,447 ---- Another option is to use <application>rsync</> to perform a file system backup. This is done by first running <application>rsync</> while the database server is running, then shutting down the database ! server long enough to do an <command>rsync --checksum</>. ! (<option>--checksum</> is necessary because <command>rsync</> only ! has file modification-time granularity of one second.) The ! second <application>rsync</> will be quicker than the first, because it has relatively little data to transfer, and the end result will be consistent because the server was down. This method allows a file system backup to be performed with minimal downtime. diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml new file mode 100644 index e1cd260..56cb45d *** a/doc/src/sgml/pgupgrade.sgml --- b/doc/src/sgml/pgupgrade.sgml *************** NET STOP postgresql-8.4 *** 315,320 **** --- 315,324 ---- NET STOP postgresql-9.0 </programlisting> </para> + + <para> + Log-shipping standby servers can remain running until a later step. + </para> </step> <step> *************** pg_upgrade.exe *** 399,404 **** --- 403,535 ---- </step> <step> + <title>Upgrade any Log-Shipping Standby Servers</title> + + <para> + If you have Log-Shipping Standby Servers (<xref + linkend="warm-standby">), follow these steps to upgrade them (before + starting any servers): + </para> + + <procedure> + + <step> + <title>Install the new PostgreSQL binaries on standby servers</title> + + <para> + Make sure the new binaries and support files are installed on all + standby servers. + </para> + </step> + + <step> + <title>Make sure the new standby data directories do <emphasis>not</> + exist</title> + + <para> + Make sure the new standby data directories do <emphasis>not</> + exist or are empty. If <application>initdb</> was run, delete + the standby server data directories. + </para> + </step> + + <step> + <title>Install custom shared object files</title> + + <para> + Install the same custom shared object files on the new standbys + that you installed in the new master cluster. + </para> + </step> + + <step> + <title>Stop standby servers</title> + + <para> + If the standby servers are still running, stop them now using the + above instructions. + </para> + </step> + + <step> + <title>Verify Standby Servers</title> + + <para> + To prevent old standby servers from being modified, run + <application>pg_controldata</> against the primary and standby + clusters and verify that the <quote>Latest checkpoint location</> + values match in all clusters. (This requires the standbys to be + shut down after the primary.) + </para> + </step> + + <step> + <title>Save configuration files</title> + + <para> + Save any configuration files from the standbys you need to keep, + e.g. <filename>postgresql.conf</>, <literal>recovery.conf</>, + as these will be overwritten or removed in the next step. + </para> + </step> + + <step> + <title>Start and stop the new master cluster</title> + + <para> + In the new master cluster, change <varname>wal_level</> to + <literal>hot_standby</> in the <filename>postgresql.conf</> file + and then start and stop the cluster. + </para> + </step> + + <step> + <title>Run <application>rsync</></title> + + <para> + From a directory that is above the old and new database cluster + directories, run this for each slave: + + <programlisting> + rsync --archive --delete --hard-links --size-only old_pgdata new_pgdata remote_dir + </programlisting> + + where <option>old_pgdata</> and <option>new_pgdata</> are relative + to the current directory, and <option>remote_dir</> is + <emphasis>above</> the old and new cluster directories on + the standby server. The old and new relative cluster paths + must match on the master and standby server. Consult the + <application>rsync</> manual page for details on specifying the + remote directory, e.g. <literal>standbyhost:/opt/PostgreSQL/</>. + <application>rsync</> will be fast when <application>pg_upgrade</>'s + <option>--link</> mode is used because it will create hard links + on the remote server rather than transferring user data. + </para> + + <para> + If you have tablespaces, you will need to run a similar + <application>rsync</> command for each tablespace directory. If you + have relocated <filename>pg_xlog</> outside the data directories, + <application>rsync</> must be run on those directories too. + </para> + </step> + + <step> + <title>Configure log-shipping to standby servers</title> + + <para> + Configure the servers for log shipping. (You do not need to run + <function>pg_start_backup()</> and <function>pg_stop_backup()</> + or take a file system backup as the slaves are still synchronized + with the master.) + </para> + </step> + + </procedure> + + </step> + + <step> <title>Restore <filename>pg_hba.conf</></title> <para> *************** pg_upgrade.exe *** 409,414 **** --- 540,554 ---- </step> <step> + <title>Start the new server</title> + + <para> + The new server can now be safely started, and then any + <application>rsync</>'ed standby servers. + </para> + </step> + + <step> <title>Post-Upgrade processing</title> <para> *************** psql --username postgres --file script.s *** 548,569 **** </para> <para> - A Log-Shipping Standby Server (<xref linkend="warm-standby">) cannot - be upgraded because the server must allow writes. The simplest way - is to upgrade the primary and use <command>rsync</> to rebuild the - standbys. You can run <command>rsync</> while the primary is down, - or as part of a base backup (<xref linkend="backup-base-backup">) - which overwrites the old standby cluster. - </para> - - <para> If you want to use link mode and you do not want your old cluster to be modified when the new cluster is started, make a copy of the old cluster and upgrade that in link mode. To make a valid copy of the old cluster, use <command>rsync</> to create a dirty copy of the old cluster while the server is running, then shut down ! the old server and run <command>rsync</> again to update the copy with any ! changes to make it consistent. You might want to exclude some files, e.g. <filename>postmaster.pid</>, as documented in <xref linkend="backup-lowlevel-base-backup">. If your file system supports file system snapshots or copy-on-write file copies, you can use that --- 688,702 ---- </para> <para> If you want to use link mode and you do not want your old cluster to be modified when the new cluster is started, make a copy of the old cluster and upgrade that in link mode. To make a valid copy of the old cluster, use <command>rsync</> to create a dirty copy of the old cluster while the server is running, then shut down ! the old server and run <command>rsync --checksum</> again to update the ! copy with any changes to make it consistent. (<option>--checksum</> ! is necessary because <command>rsync</> only has file modification-time ! granularity of one second.) You might want to exclude some files, e.g. <filename>postmaster.pid</>, as documented in <xref linkend="backup-lowlevel-base-backup">. If your file system supports file system snapshots or copy-on-write file copies, you can use that
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers