Re: [PATCHES] [BUGS] Incomplete docs for restore_command for hot standby
On Mon, 2008-02-25 at 17:56 +0600, Markus Bertheau wrote:
> 2008/2/22, Simon Riggs [EMAIL PROTECTED]:
> > If you have some suggested changes, I'd be happy to hear them. Probably
> > additions are better than just changes though.
>
> What about this:
>
> *** a/doc/src/sgml/backup.sgml
> --- b/doc/src/sgml/backup.sgml
> ***
> ...
>
> The FIXME of course needs replacement by someone in the know.

Doc patch edited to include all of Markus' points, tidy up some related
text and fix typos.

Good to apply to HEAD.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk

Index: doc/src/sgml/backup.sgml
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/doc/src/sgml/backup.sgml,v
retrieving revision 2.115
diff -c -r2.115 backup.sgml
*** doc/src/sgml/backup.sgml 7 Mar 2008 01:46:41 - 2.115
--- doc/src/sgml/backup.sgml 28 Mar 2008 13:08:38 -
***
*** 577,587
     <para>
      It is important that the archive command return zero exit status if and
      only if it succeeded. Upon getting a zero result,
!     <productname>PostgreSQL</> will assume that the WAL segment file has been
!     successfully archived, and will remove or recycle it.
!     However, a nonzero status tells
!     <productname>PostgreSQL</> that the file was not archived; it will try
!     again periodically until it succeeds.
     </para>

     <para>
--- 577,586
     <para>
      It is important that the archive command return zero exit status if and
      only if it succeeded. Upon getting a zero result,
!     <productname>PostgreSQL</> will assume that the file has been
!     successfully archived, and will remove or recycle it. However, a nonzero
!     status tells <productname>PostgreSQL</> that the file was not archived;
!     it will try again periodically until it succeeds.
     </para>

     <para>
***
*** 1001,1011
     <para>
      It is important that the command return nonzero exit status on failure.
!     The command <emphasis>will</> be asked for log files that are not present
      in the archive; it must return nonzero when so asked. This is not an
!     error condition. Be aware also that the base name of the <literal>%p</>
!     path will be different from <literal>%f</>; do not expect them to be
!     interchangeable.
     </para>

     <para>
--- 1000,1012
     <para>
      It is important that the command return nonzero exit status on failure.
!     The command <emphasis>will</> be asked for files that are not present
      in the archive; it must return nonzero when so asked. This is not an
!     error condition. Not all of the requested files will be WAL segment
!     files. You should also expect requests for files with a suffix of
!     <literal>.backup</> or <literal>.history</>. Also be aware also that
!     the base name of the <literal>%p</> path will be different from
!     <literal>%f</>; do not expect them to be interchangeable.
     </para>

     <para>
***
*** 1576,1594
     <para>
      The magic that makes the two loosely coupled servers work together is
!     simply a <varname>restore_command</> used on the standby that waits
!     for the next WAL file to become available from the primary. The
!     <varname>restore_command</> is specified in the
      <filename>recovery.conf</> file on the standby server. Normal recovery
      processing would request a file from the WAL archive, reporting failure
      if the file was unavailable. For standby processing it is normal for
!     the next file to be unavailable, so we must be patient and wait for
!     it to appear. A waiting <varname>restore_command</> can be written as
!     a custom script that loops after polling for the existence of the next
!     WAL file. There must also be some way to trigger failover, which should
!     interrupt the <varname>restore_command</>, break the loop and return
!     a file-not-found error to the standby server. This ends recovery and
!     the standby will then come up as a normal server.
     </para>

     <para>
--- 1577,1598
     <para>
      The magic that makes the two loosely coupled servers work together is
!     simply a <varname>restore_command</> used on the standby that,
!     when asked for the next WAL file, waits for it to become available from
!     the primary. The <varname>restore_command</> is specified in the
      <filename>recovery.conf</> file on the standby server. Normal recovery
      processing would request a file from the WAL archive, reporting failure
      if the file was unavailable. For standby processing it is normal for
!     the next WAL file to be unavailable, so we must be patient and wait for
!     it to appear. For files ending in <literal>.backup</> or
!     <literal>.history</> there is no need to wait, though a non-zero return
!     code should also be returned in this case. A waiting
!     <varname>restore_command</> can be written as a custom script that loops
!
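
The behaviour the patch text describes can be sketched as a small script. This is only an illustration under stated assumptions; it is not pg_standby and it is not part of the patch. The archive directory, the trigger-file path, the polling interval and the names must_not_wait and restore are all hypothetical. Using a trigger file to signal failover is likewise just one possible choice. The script copies a requested file if it is present, returns nonzero at once for a missing .backup or .history file, and otherwise polls until the WAL file appears or failover is triggered.

```python
#!/usr/bin/env python3
"""Sketch of a waiting restore_command for a standby server (illustrative only)."""
import os
import shutil
import sys
import time

# Hypothetical settings; none of these values come from the thread.
ARCHIVE_DIR = "/mnt/server/archivedir"
TRIGGER_FILE = "/tmp/pgsql.trigger"
POLL_SECONDS = 5


def must_not_wait(file_name):
    """True for .backup and .history requests, which may never be satisfied."""
    return file_name.endswith((".backup", ".history"))


def restore(file_name, dest_path, archive_dir=ARCHIVE_DIR,
            trigger_file=TRIGGER_FILE, poll_seconds=POLL_SECONDS):
    """Return 0 after copying the file, or 1 if it is not available."""
    src = os.path.join(archive_dir, file_name)
    while True:
        if os.path.exists(src):
            shutil.copy(src, dest_path)
            return 0
        # Missing non-WAL file: report "not found" immediately.
        # Trigger file present: break the loop so that recovery ends.
        if must_not_wait(file_name) or os.path.exists(trigger_file):
            return 1
        time.sleep(poll_seconds)


# Runs only when invoked with exactly two arguments (%f and %p).
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(restore(sys.argv[1], sys.argv[2]))
```

Assuming the script is saved as a hypothetical /usr/local/bin/wait_restore.py, recovery.conf could reference it as restore_command = '/usr/local/bin/wait_restore.py %f %p'. The sketch does not check that a file has been completely written to the archive before copying it, which a production script would need to handle.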
Re: [PATCHES] [BUGS] Incomplete docs for restore_command for hot standby
Your patch has been added to the PostgreSQL unapplied patches list at:

    http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---

Markus Bertheau wrote:
> 2008/2/22, Simon Riggs [EMAIL PROTECTED]:
> > On Thu, 2008-02-21 at 08:01 +0600, Markus Bertheau wrote:
> > > Section 24.3.3.1 states about restore_command:
> > >
> > > The command will be asked for file names that are not present in the
> > > archive; it must return nonzero when so asked.
> > >
> > > Section 24.4.1 further states:
> > >
> > > The magic that makes the two loosely coupled servers work together is
> > > simply a restore_command used on the standby that waits for the next
> > > WAL file to become available from the primary.
> > >
> > > It is not clear from the first paragraph, whether the non-existing
> > > file that restore_command is being asked for is a not-yet-generated
> > > WAL file or something different. If it was a not-yet-generated WAL
> > > file, restore_command for replication would have to wait for it to
> > > appear. If it was something different, restore_command for
> > > replication would have to return an error right away. (Because else
> > > it would hang indefinitely, waiting for a file that is not going to
> > > appear). Yet I couldn't find hints in the documentation as to how
> > > these two cases can be detected by restore_command, i.e. how
> > > restore_command should tell a request for a WAL file from a request
> > > for a non-WAL file.
> >
> > The two sentences aren't mutually exclusive, especially when you
> > consider they are discussing two different use cases.
> >
> > Why not read up on pg_standby anyway?
>
> I read about pg_standby, but this is not about solving a particular
> problem but about missing information in the docs.
>
> > > Practice (http://archives.postgresql.org/sydpug/2006-10/msg1.php)
> > > shows that this is a problem, and people use unproved heuristics
> > > ('history' substring in the requested file name).
> >
> > Old email written during beta. Read at your own peril.
>
> The email may be old, but the problem at hand is still relevant.
>
> Additionally, 24.3.3 contains slightly misleading information:
>
> It is important that the command return nonzero exit status on failure.
> The command will be asked for log files that are not present in the
> archive; it must return nonzero when so asked. This is not an error
> condition.
>
> This suggests that all non-existing files that restore_command will be
> asked for are log files. One could therefore reasonably assume that
> restore_command for replication should wait on all non-existing files.
> 24.3.3.1 later corrects this by stating that not only log files may be
> requested, but nevertheless.
>
> > If you have some suggested changes, I'd be happy to hear them. Probably
> > additions are better than just changes though.
>
> What about this:
>
> *** a/doc/src/sgml/backup.sgml
> --- b/doc/src/sgml/backup.sgml
> ***
> *** 1001,1011 restore_command = 'cp /mnt/server/archivedir/%f %p'
>      <para>
>       It is important that the command return nonzero exit status on failure.
> !     The command <emphasis>will</> be asked for log files that are not present
> !     in the archive; it must return nonzero when so asked. This is not an
> !     error condition. Be aware also that the base name of the <literal>%p</>
> !     path will be different from <literal>%f</>; do not expect them to be
> !     interchangeable.
>      </para>
>
>      <para>
> --- 1001,1011
>      <para>
>       It is important that the command return nonzero exit status on failure.
> !     The command <emphasis>will</> be asked for log and other files that are
> !     not present in the archive; it must return nonzero when so asked. This is
> !     not an error condition. Be aware also that the base name of the
> !     <literal>%p</> path will be different from <literal>%f</>; do not expect
> !     them to be interchangeable.
>      </para>
>
>      <para>
> ***
> *** 1576,1594 archive_command = 'local_backup_script.sh'
>      <para>
>       The magic that makes the two loosely coupled servers work together is
> !     simply a <varname>restore_command</> used on the standby that waits
> !     for the next WAL file to become available from the primary. The
> !     <varname>restore_command</> is specified in the
>       <filename>recovery.conf</> file on the standby server. Normal recovery
>       processing would request a file from the WAL archive, reporting failure
>       if the file was unavailable. For standby processing it is normal for
> !     the next file to be unavailable, so we must be patient and wait for
> !     it to appear. A waiting <varname>restore_command</> can be written as
> !     a custom script that loops after polling for the existence of the next
> !     WAL file. There must also be some way to trigger failover, which should
> !     interrupt the