On Jul 15, 2010, at 8:16 PM, Scott Carey wrote:

> On Jul 15, 2010, at 12:35 PM, Ben Chobot wrote:
>
>> On Jul 15, 2010, at 9:30 AM, Scott Carey wrote:
>>
>>>> Many raid controllers are smart enough to always turn off write caching on
>>>> the drives, and also disable the feature on their own buffer without a
>>>> BBU. Add a BBU, and the cache on the controller starts getting used, but
>>>> *not* the cache on the drives.
>>>
>>> This does not make sense.
>>> Write caching on all hard drives in the last decade are safe because they
>>> support a write cache flush command properly. If the card is "smart" it
>>> would issue the drive's write cache flush command to fulfill an fsync() or
>>> barrier request with no BBU.
>>
>> You're missing the point. If the power dies suddenly, there's no time to
>> flush any cache anywhere. That's the entire point of the BBU - it keeps the
>> RAM powered up on the raid card. It doesn't keep the disks spinning long
>> enough to flush caches.
>
> If the power dies suddenly, then the data that is in the OS RAM will also be
> lost. What about that?
>
> Well it doesn't matter because the DB is only relying on data being persisted
> to disk that it thinks has been persisted to disk via fsync().

Right, we agree that only what has been fsync()'d has a chance to be safe....

> The data in the disk cache is the same thing as RAM. As long as fsync()
> works _properly_ which is true for any file system + disk combination with a
> damn (not HFS+ on OSX, not FAT, not a few other things), then it will tell
> the drive to flush its cache _before_ fsync() returns. There is NO REASON
> for a raid card to turn off a drive cache unless it does not trust the drive
> cache. In write-through mode, it should not return to the OS with a fsync,
> direct write, or other "the OS thinks this data is persisted now" call until
> it has flushed the disk cache. That does not mean it has to turn off the
> disk cache.

...and here you are also right that a write-through write cache is safe,
with or without a battery. A write-through cache is a win for things that
don't often fsync, but my understanding is that with a database you end up
fsyncing all the time, and every one of those fsyncs still has to wait for
the data to reach the disk, which makes a write-through cache not worth very
much. The only way to get good *database* performance out of spinning media
is with a write-back cache, and the only way to make that safe is to hook up
a BBU.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
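
P.S. For anyone who wants to see what "fsyncing all the time" costs on their own hardware, here is a rough sketch in Python. The function names (`durable_append`, `time_commits`) and the file name are made up for illustration, and this is not how PostgreSQL actually writes its WAL; it only mimics the pattern of a small write followed by an fsync() on every commit. I'd expect the per-call time to be much higher on a bare spinning disk than behind a controller with a battery-backed write-back cache, but I haven't attached any numbers because they depend entirely on the setup.

```python
import os
import tempfile
import time


def durable_append(fd, payload):
    """Append payload, then block until fsync() reports it has been flushed."""
    # os.write may in principle write fewer bytes than requested; for small
    # payloads on a regular file this sketch assumes a full write.
    os.write(fd, payload)
    os.fsync(fd)


def time_commits(path, n=50, payload=b"x" * 128):
    """Return the average seconds per write+fsync over n hypothetical commits."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(n):
            durable_append(fd, payload)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed / n


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        per_commit = time_commits(os.path.join(d, "commit_log.bin"))
        print(f"average seconds per fsync'd write: {per_commit:.6f}")
```

Run it against a directory on the volume you care about (the temporary directory above is only a default) and compare the result with and without the controller's write-back cache enabled.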