On 10/26/10 21:17, Chuck Swiger wrote:
On Oct 26, 2010, at 11:33 AM, Marc G. Fournier wrote:
Someone recently posted on one of the PostgreSQL blogs about fsync on 
Linux/Windows/Mac OS X, but did not comment on any of the BSDs. 
The post covers how fsync works on the various OSs, and I am curious 
whether this is something that also afflicts us:

http://rhaas.blogspot.com/2010/10/wal-reliability.html

From reading our man page, I see no warnings similar to what the other OSs
have, specifically:

Mac OS X: For applications that require tighter guarantees about the
          integrity of their data, Mac OS X provides the F_FULLFSYNC fcntl

Linux: If the underlying hard disk has write caching enabled, then the
       data may not really be on permanent storage when fsync() /
       fdatasync() return.

So, do we hide the fact, or are we, in fact, not afflicted by this?


Whether the data actually gets written and the on-disk cache itself flushed 
seems to depend on a sysctl called hw.ata.wc for FreeBSD or the dkctl setting 
in NetBSD; write-caching seems to always default to on because otherwise people 
scream bloody murder about the factor of ten reduction in write performance 
with it off.  Further, by default (i.e., FFSv2 with soft updates), data changes 
are synced out when you do an fsync(), but metadata changes are done 
asynchronously, which is exactly what Mac OS X does.

In other words, if you have write-caching on, no effort is made to invoke ATA_FLUSHCACHE 
or SCSI "SYNCHRONIZE CACHE" to make sure that your disk has actually written 
the bits to permanent storage.

To clarify: all this is in case write-caching happens on disk drives or on disk controllers.

For a long time now, the common way to deploy servers has been to use a disk controller with RAID capabilities and its own RAM cache, backed by a battery or a capacitor. This controller in turn switches the on-drive write caches off. Every RAID controller I've seen has a toggle for the on-drive write caches, and it was always off by default (though it doesn't hurt to check).

To emulate this with desktop drives, as cswiger said, hw.ata.wc should be turned off, with the expected influence on drive performance.
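A quick way to see what a given box is doing is to read the sysctl. The following is only a sketch: the helper names parse_wc and ata_write_cache_enabled are made up for illustration, and it assumes the sysctl(8) binary is on PATH and that hw.ata.wc reports 1 for "on" and 0 for "off". On systems without that sysctl (or with a different disk driver) it simply reports "unknown".

```python
import subprocess


def parse_wc(text):
    """Map sysctl output to True (cache on), False (off), or None (unknown)."""
    value = text.strip()
    if value == "1":
        return True
    if value == "0":
        return False
    return None


def ata_write_cache_enabled():
    """Query hw.ata.wc; returns None where the sysctl is missing or fails."""
    try:
        result = subprocess.run(
            ["sysctl", "-n", "hw.ata.wc"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # No sysctl binary available at all.
        return None
    if result.returncode != 0:
        return None
    return parse_wc(result.stdout)


if __name__ == "__main__":
    state = ata_write_cache_enabled()
    labels = {True: "enabled", False: "disabled", None: "unknown"}
    print("ATA write cache:", labels[state])
```

As far as I know, hw.ata.wc is a boot-time tunable, so turning it off would mean a line like hw.ata.wc="0" in /boot/loader.conf rather than a runtime sysctl call; treat that as an assumption and check ata(4) for your release.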

All this is valid for UFS. ZFS, on the other hand, *should* use BIO_FLUSH where appropriate, so it should be safer with desktop drives. On the other hand, ZFS is so complex that when an error occurs it is hard to say what caused it.
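From the application side, the calls discussed in this thread look roughly like the sketch below. It is an illustration, not what PostgreSQL actually does: durable_write is a made-up helper, and it uses F_FULLFSYNC only where Python's fcntl module exposes it (Mac OS X, as far as I know). Everywhere else it falls back to plain fsync(), which, as described above, says nothing about whether the drive's own write cache has been flushed.

```python
import fcntl
import os


def durable_write(path, data):
    """Write data to path and ask the OS to push it to stable storage.

    Returns a string naming the strongest sync call that was issued.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]

        # Flushes the OS buffers for this file to the device.
        os.fsync(fd)

        # Only defined on platforms that have it; asks the drive itself
        # to flush its cache.
        full_fsync = getattr(fcntl, "F_FULLFSYNC", None)
        if full_fsync is not None:
            fcntl.fcntl(fd, full_fsync)
            return "fsync+F_FULLFSYNC"
        return "fsync"
    finally:
        os.close(fd)
```

When this returns "fsync" only, whether the data survives a power cut still depends on the drive or controller cache settings described earlier in the thread.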


_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"
