On Jul 15, 2010, at 8:16 PM, Scott Carey wrote:

> On Jul 15, 2010, at 12:35 PM, Ben Chobot wrote:
> 
>> On Jul 15, 2010, at 9:30 AM, Scott Carey wrote:
>> 
>>>> Many raid controllers are smart enough to always turn off write caching on 
>>>> the drives, and also disable the feature on their own buffer without a 
>>>> BBU. Add a BBU, and the cache on the controller starts getting used, but 
>>>> *not* the cache on the drives.
>>> 
>>> This does not make sense.
>>> Write caching on all hard drives in the last decade is safe because they 
>>> support a write cache flush command properly.  If the card is "smart" it 
>>> would issue the drive's write cache flush command to fulfill an fsync() or 
>>> barrier request with no BBU.
>> 
>> You're missing the point. If the power dies suddenly, there's no time to 
>> flush any cache anywhere. That's the entire point of the BBU - it keeps the 
>> RAM powered up on the raid card. It doesn't keep the disks spinning long 
>> enough to flush caches.
> 
> If the power dies suddenly, then the data that is in the OS RAM will also be 
> lost.  What about that? 
> 
> Well it doesn't matter because the DB is only relying on data being persisted 
> to disk that it thinks has been persisted to disk via fsync().

Right, we agree that only what has been fsync()'d has a chance to be safe....

> The data in the disk cache is the same thing as RAM.  As long as fsync() 
> works _properly_, which is true for any file system + disk combination worth 
> a damn (not HFS+ on OSX, not FAT, not a few other things), then it will tell 
> the drive to flush its cache _before_ fsync() returns.  There is NO REASON 
> for a raid card to turn off a drive cache unless it does not trust the drive 
> cache.  In write-through mode, it should not return to the OS with a fsync, 
> direct write, or other "the OS thinks this data is persisted now" call until 
> it has flushed the disk cache.  That does not mean it has to turn off the 
> disk cache.

...and here you are also right in that a write-through write cache is safe, 
with or without a battery. A write-through cache is a win for things that don't 
often fsync, but my understanding is that with a database, you end up fsyncing 
all the time, which makes a write-through cache not worth very much. The only 
way to get good *database* performance out of spinning media is with a 
write-back cache, and the only way to make that safe is to hook up a BBU.
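
As a rough illustration of why the fsync rate matters, here is a minimal 
Python sketch that times a loop of small writes, each followed by fsync(). 
The names fsync_rate and rotation_ceiling are made up for this example, and 
the "one commit per platter rotation" ceiling is a back-of-envelope 
assumption, not a measurement. The temporary directory may sit on tmpfs or an 
SSD, so point the path at the volume you actually want to test.

```python
import os
import tempfile
import time


def fsync_rate(path, count=200, payload=b"x" * 512):
    """Append `count` small records, calling fsync() after each one.

    Returns the number of write+fsync cycles completed per second.
    (Hypothetical helper for illustration only.)
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, payload)
            os.fsync(fd)  # block until the OS reports the data as flushed
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return count / elapsed if elapsed > 0 else float("inf")


def rotation_ceiling(rpm):
    """Assumed rough bound on serialized commits/sec: one per rotation."""
    return rpm / 60.0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        rate = fsync_rate(os.path.join(tmp, "fsync_test.dat"))
    print("measured fsync rate: %.0f/s" % rate)
    print("7200 rpm ceiling:    %.0f/s" % rotation_ceiling(7200))
```

If the measured rate on a single spinning disk comes out far above the 
rotation ceiling, something between fsync() and the platter is probably 
acknowledging writes from a cache, which is fine with a BBU and risky 
without one.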


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
