The PostgreSQL documentation gives cp (on UNIX/Linux) or copy (on Windows) as an example archive_command. However, cp/copy does not sync the copied data to disk. As a result, a completed WAL segment can be lost in the following sequence:

1. A WAL segment fills up.

2. The archiver process archives the just-filled WAL segment using archive_command. That is, cp/copy reads the WAL segment file from pg_xlog/ and writes it to the archive area. At this point, the WAL file is not necessarily persisted in the archive area yet, because cp/copy doesn't sync its writes.

3. The checkpoint processing removes the WAL segment file from pg_xlog/.

4. The OS crashes. The filled WAL segment no longer exists anywhere.

Considering PostgreSQL's "reliable" image and its widespread use in enterprise systems, I think something should be done. Could you give me your opinions on the right direction? Although the doc certainly hedges by saying "(This is an example, not a recommendation, and might not work on all platforms.)", it seems from the pgsql-xxx MLs that many people are following this example. Two options come to mind:

* Improve the example in the documentation.
But what command can we use to reliably sync just one file?

* Provide some command, say pg_copy, which copies a file synchronously by using fsync(), and describe it in the doc with something like "for simple use cases, you can use pg_copy as the standard reliable copy command."
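
To make the second option concrete, here is a rough sketch of what such a synchronous copy might do. It is only an illustration under my own assumptions, not a proposed implementation: the function name sync_copy and the ".tmp" suffix are made up, and the directory fsync at the end assumes a POSIX system (it would need different handling on Windows).

```python
import os
import shutil


def sync_copy(src: str, dst: str) -> None:
    """Copy src to dst, returning only after the data has been fsync'ed.

    Hypothetical helper; refuses to overwrite an existing destination.
    """
    if os.path.exists(dst):
        raise FileExistsError(dst)

    tmp = dst + ".tmp"  # assumed temporary name, not a PostgreSQL convention
    with open(src, "rb") as fin, open(tmp, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()             # push Python's buffer to the OS
        os.fsync(fout.fileno())  # ask the OS to write the file data to disk

    # Expose the file under its final name only after its contents are synced.
    os.rename(tmp, dst)

    # fsync the containing directory so the new directory entry is durable too
    # (POSIX-only assumption).
    dirfd = os.open(os.path.dirname(os.path.abspath(dst)), os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)
```

A small wrapper that passes the source path and the destination path to this function could then be named in archive_command in place of cp. Writing to a temporary name and renaming afterwards means a crash in the middle of the copy should not leave a partial file under the final name.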

Related to this topic, pg_basebackup doesn't fsync the backed-up files. I'm afraid this too differs from what users expect; I guess they would expect the backup to be safely available once pg_basebackup completes, even if the machine then crashes.
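
As a stopgap on the user side, the backup directory could be synced by hand after pg_basebackup finishes. The following is a minimal sketch under my own assumptions, not something pg_basebackup does: fsync_tree is a made-up name, and it assumes a POSIX system where fsync() is accepted on a read-only descriptor and on directories.

```python
import os


def fsync_tree(root: str) -> int:
    """fsync every regular file and directory under root.

    Hypothetical helper; returns the number of files synced.
    """
    synced = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks; their targets are not handled here
            fd = os.open(path, os.O_RDONLY)
            try:
                os.fsync(fd)
            finally:
                os.close(fd)
            synced += 1

        # Sync the directory itself so its entries are durable as well.
        dirfd = os.open(dirpath, os.O_RDONLY)
        try:
            os.fsync(dirfd)
        finally:
            os.close(dirfd)
    return synced
```

Calling fsync_tree on the backup target directory after the backup completes would cover the files inside it; the parent directory of the backup would still need its own fsync.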

