On 04/24/2011 10:06 PM, Daniel Farina wrote:
> On Thu, Apr 21, 2011 at 8:51 PM, Greg Smith <g...@2ndquadrant.com> wrote:
>> There's still the "fsync'd a data block but not the directory entry yet"
>> issue as fall-out from this too.  Why doesn't PostgreSQL run into this
>> problem?  Because the exact code sequence used is this one:
>>
>> open
>> write
>> fsync
>> close
>>
>> And Linux shouldn't ever screw that up, or the similar rename path.  Here's
>> what the close man page says, from http://linux.die.net/man/2/close :
>
> Theodore Ts'o addresses this *exact* sequence of events, and suggests
> if you want that rename to definitely stick that you must fsync the
> directory:
>
> http://www.linuxfoundation.org/news-media/blogs/browse/2009/03/don%E2%80%99t-fear-fsync

Not exactly. That's talking about the sequence used for creating a file, plus a rename. When new WAL files are being created, I believe the ugly part of this is avoided. The path when WAL files are recycled using rename does seem to be the one with the most likely edge case.
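
For reference, the "fsync the directory" suggestion applied to a rename might look roughly like the sketch below. This is not PostgreSQL's code; durable_rename is a hypothetical helper name, and it assumes a POSIX system where a directory can be opened read-only and passed to fsync, with both paths in the same directory.

```python
import os


def durable_rename(old_path, new_path):
    """Rename old_path to new_path, then fsync the containing directory.

    Hypothetical helper for illustration only. Assumes both paths are in
    the same directory and that the platform allows fsync on a directory
    file descriptor (true on Linux, not on Windows).
    """
    os.rename(old_path, new_path)

    # The rename changes the directory, not the file, so it is the
    # directory that has to be flushed for the new name to be durable.
    dir_path = os.path.dirname(os.path.abspath(new_path))
    dir_fd = os.open(dir_path, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Without the directory fsync, the file's contents may be safely on disk while the new name is not, which is the edge case in question for the recycle path.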

The difficult case Ts'o's discussion is trying to satisfy involves creating a new file and then swapping it for an old one atomically. PostgreSQL never does exactly that. It creates new files, pads them with zeros, and then starts writing to them; it also renames old files that are already of the correct length. Combined with the fact that there are always fsyncs after writes to the files, this case really isn't exactly the same as any of the others people are complaining about.
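
To make the pattern concrete, here is a rough sketch of "create, pad with zeros, fsync, then write into the existing space with an fsync after each write". It is an illustration under stated assumptions, not the actual server code: create_zero_filled and write_and_sync are hypothetical names, and the 8192-byte block size is an arbitrary choice for the example.

```python
import os


def create_zero_filled(path, size, block=8192):
    """Create a new file of exactly `size` bytes of zeros and fsync it.

    Hypothetical helper. O_EXCL makes the call fail if the file already
    exists, so this never replaces an old file in place.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        zeros = bytes(block)
        remaining = size
        while remaining > 0:
            # os.write may write fewer bytes than requested, so count
            # what was actually written.
            written = os.write(fd, zeros[:min(block, remaining)])
            remaining -= written
        os.fsync(fd)
    finally:
        os.close(fd)


def write_and_sync(path, offset, data):
    """open / write / fsync / close into space that is already allocated."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        view = memoryview(data)
        while view:
            n = os.write(fd, view)
            view = view[n:]
        os.fsync(fd)
    finally:
        os.close(fd)
```

Because the file is already at its full length before any real data goes in, later writes only overwrite existing blocks; they do not change the file's size or create a new directory entry.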

--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
