Bruce Momjian wrote:
I always assumed SCSI disks had a write-through cache and therefore
didn't need a drive cache flush command.

There's more detail on all this mess at http://wiki.postgresql.org/wiki/SCSI_vs._IDE/SATA_Disks and it includes this perception, which I've recently come to believe isn't actually correct anymore. As with the IDE crowd, it looks like one day somebody said "hey, we lose every write-heavy benchmark badly because we only have a write-through cache", and that principle fell by the wayside. What has been true, and I'm starting to think this is what we've all been observing rather than a write-through cache, is that the proper cache flushing commands have been there in working form for so much longer that it's more likely your SCSI driver and drive do the right thing if the filesystem asks them to. SCSI SYNCHRONIZE CACHE has a much longer and prouder history than IDE's FLUSH_CACHE and SATA's FLUSH_CACHE_EXT.
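To make "if the filesystem asks them to" concrete, here is a minimal sketch of the application side of that chain, in Python. The helper name durable_write is made up for illustration. The only thing the program can do is call fsync(); whether that turns into a SYNCHRONIZE CACHE or FLUSH_CACHE command at the drive depends on the filesystem, mount options, driver and drive firmware, so treat this as a request for durability, not proof of it.

```python
import os


def durable_write(path, data):
    """Write bytes to path, then ask the OS to push them to stable storage.

    Hypothetical helper for illustration only. os.fsync() asks the kernel
    to flush the file; whether the drive's own write cache also gets
    flushed is up to the filesystem, driver and drive.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        offset = 0
        # os.write() may write fewer bytes than requested, so loop.
        while offset < len(data):
            offset += os.write(fd, data[offset:])
        # The flush request: without this the data may sit in OS and
        # drive caches for an unbounded time.
        os.fsync(fd)
    finally:
        os.close(fd)
```

This is roughly what a database relies on at commit time: the call should not return until the data would survive a power loss, which only holds if every layer underneath honors the flush.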

It's also worth noting that many current SAS drives, the current SCSI incarnation, are basically SATA drives with a bridge chipset stuck onto them, or with just the interface board swapped out. This is one reason why top-end SAS capacities lag behind those of consumer SATA drives. They use the consumers as beta testers to get the really fundamental firmware issues sorted out, and once things are stable they start stamping out the version with the SAS interface instead. (Note that there's a parallel manufacturing approach that makes much smaller SAS drives, the 2.5" server models or those at higher RPMs, that doesn't go through this path. Those are also the really expensive models, due to economy of scale issues.) The idea that these would have fundamentally different write cache behavior doesn't really follow from that development model.

At this point, there are only two common differences between "consumer" and "enterprise" hard drives of the same size and RPM, in cases where directly matching models exist:

1) You might get SAS instead of SATA as the interface, which provides the more mature command set I was talking about above--and therefore may give you a sane write-back cache with proper flushing, which is all the database really expects.

2) The timeouts when there's a read/write problem are tuned down in the enterprise version, to be more compatible with RAID setups where you want to push the drive off-line when this happens rather than presuming you can fix it. Consumers would prefer that the drive spend a lot of time doing heroics to try and save their sole copy of the apparently missing data.

You might get a slightly higher grade of parts if you're lucky too; I wouldn't count on it though. That seems to be saved for the high RPM or smaller size drives only.
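If you want to see what cache mode the kernel thinks a given drive is in, rather than guessing from the interface type, something like the sketch below may help. It assumes a reasonably recent Linux kernel that exposes the queue/write_cache attribute under /sys/block, which reports either "write back" or "write through"; the function names are made up for illustration. Note this shows what the kernel believes about the drive, not whether the firmware actually honors flush commands.

```python
from pathlib import Path


def write_cache_mode(device, sysfs_root="/sys/block"):
    """Return the kernel-reported cache mode for a block device, or None.

    Assumes a Linux sysfs layout with <root>/<device>/queue/write_cache.
    Expected values are "write back" or "write through"; None means the
    attribute is missing or unreadable.
    """
    attr = Path(sysfs_root) / device / "queue" / "write_cache"
    try:
        return attr.read_text().strip()
    except OSError:
        return None


def relies_on_flush(mode):
    """True when durability depends on cache flush commands working."""
    return mode == "write back"


if __name__ == "__main__":
    # "sda" is only an example device name.
    mode = write_cache_mode("sda")
    print("sda:", mode, "| relies on flush:", relies_on_flush(mode))
```

A drive reporting "write back" is fine for a database as long as the flush commands work all the way down; that's the sane write-back case from point 1.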

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance