At 12:38 AM -0500 2/20/05, Tom Lane wrote:
Dominic Giampaolo <[EMAIL PROTECTED]> writes:
 I believe that what the above comment refers to is the fact that
 fsync() is not sufficient to guarantee that your data is on stable
 storage and on MacOS X we provide a fcntl(), called F_FULLFSYNC,
 to ask the drive to flush all buffered data to stable storage.

I've been looking for documentation on this without a lot of luck ("man fcntl" on OS X 10.3.8 has certainly never heard of it). It's not completely clear whether this subsumes fsync() or whether you're supposed to fsync() and then use the fcntl.

My understanding is that you're supposed to fsync() and then use the fcntl, but I'm not the filesystems expert. (Dominic, who wrote the original message that I forwarded, is.)
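
For illustration, here is a minimal sketch of that ordering in C. It assumes the sequence described above (fsync() first, then the fcntl) is the right one; the helper name flush_to_stable_storage is hypothetical, and the #ifdef lets the same code fall back to plain fsync() on systems that do not define F_FULLFSYNC.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from any Apple or PostgreSQL source.
 * Step 1: fsync() pushes the file's dirty data from the kernel to the drive.
 * Step 2: where F_FULLFSYNC is defined (Mac OS X), ask the drive to flush
 *         its own buffered data to stable storage.
 * Returns 0 on success, -1 on failure with errno set by the failing call.
 */
static int flush_to_stable_storage(int fd)
{
    if (fsync(fd) != 0)
        return -1;

#ifdef F_FULLFSYNC
    /* Assumption: the fcntl is issued after fsync(), as described above. */
    if (fcntl(fd, F_FULLFSYNC) == -1)
        return -1;
#endif

    return 0;
}
```

A caller would use this wherever it currently calls fsync() on data that must survive power loss. Because the drive flush is heavy, it is probably best kept for those few points rather than used for every write.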


I've filed a bug report asking for better documentation about this to be placed in the fsync man page. <radar://4012378>


Also, isn't it fundamentally at the wrong level?  One would suppose that
the drive flush operation is going to affect everything the drive
currently has queued, not just the one file.  That makes it difficult
if not impossible to use efficiently.

I think the intent is to make the fcntl more precise over time, as hardware gains the ability to support that.


One of the advantages Apple has is the ability to set very specific requirements for our hardware. So if a block-specific flush command becomes part of the ATA spec, Apple can require vendors to support it, and support it correctly, before using those drives.

On the other hand, as Dominic described, once the hardware is external (like a FireWire enclosure), we lose that leverage.


At 12:42 PM -0500 2/20/05, Greg Stark wrote:
Dominic Giampaolo <[EMAIL PROTECTED]> writes:

 > In most cases you do not need such a heavy handed operation and fsync() is
 > good enough.

Really? Can you think of a single application for which this definition of fsync is useful?

Kernel buffers are transparent to the application, just as the disk buffer is.
It doesn't matter to an application whether the data is sitting in a kernel
buffer, or a buffer in the disk, it's equivalent. If fsync doesn't guarantee
the writes actually end up on non-volatile disk then as far as the application
is concerned it's just an expensive noop.

I think the intent of fsync() is closer to what you describe, but the convention is that fsync() hands responsibility to the disk hardware. That's how every other Unix seems to handle fsync() too. This gives you good performance, and if you combine a smart fsync()ing application with reliable storage hardware (like an XServe RAID that battery-backs its own write caches), you get the best combination.


If you know you have unreliable hardware and critical reliability requirements, then you can use the fcntl, which seems to offer more control than other OSes give.

-pmb
