On Wed, Nov 13, 2013 at 3:42 AM, Peter Eisentraut <pete...@gmx.net> wrote:
> When an external recovery command such as restore_command or
> archive_cleanup_command fails, it just reports "return code 34567" or
> something, but we have facilities to do decode this properly, so use
> them.

I think this is a very good idea, but you should go a bit further: document the special relationship restore_command has to special return codes. Currently, the documentation says:

"It is important that the archive command return zero exit status if and only if it succeeded. Upon getting a zero result, PostgreSQL will assume that the WAL segment file has been successfully archived, and will remove or recycle it. However, a nonzero status tells PostgreSQL that the file was not archived; it will try again periodically until it succeeds."

Yes, this concerns archive_command (where non-zero return codes really *are* never distinguished from one another), but nothing much is said anywhere else about the return code of restore_command specifically, so it's implied that it behaves as the exact inverse of archive_command. In reality, some special return codes have a significance to restore_command: they make recovery abort, because they're taken as proxies for various failures after which it isn't sensible to continue recovery.

We're talking about the difference between recovery aborting and recovery having conceptually "reached the end of the WAL stream", so it's very surprising that this isn't documented currently.

-- 
Peter Geoghegan
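
To make the distinction above concrete, here is a minimal sketch in C of how a raw wait status from a restore_command-style shell command might be decoded and classified. The enum and function names are hypothetical and are not PostgreSQL identifiers. The rule it encodes (death by signal or an exit code above 125 aborts recovery, any other non-zero exit code means "file not available") is my understanding of the behaviour and should be treated as an assumption to check against the recovery code, not as a statement of what the server does.

```c
#include <stdlib.h>     /* system() */
#include <sys/wait.h>   /* WIFEXITED, WEXITSTATUS, WIFSIGNALED */

/* Hypothetical outcome names, for illustration only. */
typedef enum
{
    RESTORE_OK,                 /* exit code 0: file was restored */
    RESTORE_FILE_UNAVAILABLE,   /* treated like reaching the end of WAL */
    RESTORE_ABORT_RECOVERY      /* failure too serious to continue */
} RestoreOutcome;

/*
 * Classify a raw wait status as returned by system() or waitpid().
 * ASSUMPTION: signals and exit codes above 125 are the "special" cases.
 */
RestoreOutcome
classify_restore_status(int status)
{
    if (status == -1)
        return RESTORE_ABORT_RECOVERY;      /* command could not be run */

    if (WIFSIGNALED(status))
        return RESTORE_ABORT_RECOVERY;      /* shell was killed by a signal */

    if (WIFEXITED(status))
    {
        int code = WEXITSTATUS(status);

        if (code == 0)
            return RESTORE_OK;
        if (code > 125)
            return RESTORE_ABORT_RECOVERY;  /* e.g. 126/127, or 128 + signal */
        return RESTORE_FILE_UNAVAILABLE;
    }

    return RESTORE_ABORT_RECOVERY;          /* unrecognized status: be safe */
}

/* Run a command through the shell and classify its wait status. */
RestoreOutcome
run_and_classify(const char *command)
{
    return classify_restore_status(system(command));
}
```

Under these assumptions, run_and_classify("exit 1") would be read as "no more WAL to restore", while run_and_classify("exit 127") (the shell's "command not found") would abort recovery. That is exactly the kind of difference the documentation ought to spell out.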