On 12/01/2015 10:44 PM, Peter Eisentraut wrote:
On 11/27/15 8:18 AM, Michael Paquier wrote:
On Fri, Nov 27, 2015 at 8:17 PM, Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:
So, what's going on? The problem is that while the rename() is atomic, it's
not guaranteed to be durable without an explicit fsync on the parent
directory. And by default we only do fdatasync on the recycled segments,
which may not force fsync on the directory (and ext4 does not do that,
Yeah, that seems to be the way the POSIX spec clears things.
"If _POSIX_SYNCHRONIZED_IO is defined, the fsync() function shall
force all currently queued I/O operations associated with the file
indicated by file descriptor fildes to the synchronized I/O completion
state. All I/O operations shall be completed as defined for
synchronized I/O file integrity completion."
If I understand that right, it is guaranteed that the rename() will be
atomic, meaning that there will be only one file even if there is a
crash, but that we need to fsync() the parent directory as mentioned.

I don't see anywhere in the spec that a rename needs an fsync of the
 directory to be durable. I can see why that would be needed in
practice, though. File system developers would probably be able to
give a more definite answer.

Yeah, POSIX is the smallest common denominator. In this case the spec seems not to require this durability guarantee (rename without fsync on directory), which allows a POSIX-compliant filesystem.

At least that's my conclusion from reading https://lwn.net/Articles/322823/

However, as I explained in the original post, it's more complicated as this only seems to be problem with fdatasync. I've been unable to reproduce the issue with wal_sync_method=fsync.


Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to