On Mon, Dec 20, 2010 at 5:59 PM, George Henke/NYLIC <[email protected]> wrote:
> The IO Supervisor has not kept up with the hardware.
>
> It still thinks of a disk device as a "spinning platter" when in fact it is a
> rank of RAID devices striped over numerous HDs and cached in a disk
> controller from where it is actually being read thereby permitting multiple
> IOs to the same device number..

The architecture guarantees that the I/Os to the device are serialized;
that is, the second queued I/O only starts when the first one completes.
This architecture is exploited by the OS and applications to ensure that
data on disk is in a consistent state. Ever heard of shops where 5000
PROFS users had to go through "fsck" on their CMS disk after a power
failure? ;-)

Sometimes that guarantee is not needed. Often when two completely
unrelated I/Os go for different data that happens to reside on the same
real volume, you couldn't care less. Or when the OS does not provide
such a guarantee anyway (for example with "lazy write"), you can't
exploit the hardware guarantee.

When the channel program guarantees that they are really unrelated (so
not writing or reading the same data), we can leave it to the I/O
subsystem to change the order if it makes sense.

Whether it makes sense is not easy to tell. It makes a lot of sense for
your hardware vendor. It makes sense when you do single-threaded lab
benchmarks that need to saturate the I/O subsystem. There are a lot of
cases where it does not make sense (like when SFS does its own smart
things to spread I/O). If someone has relevant data, I'm always
interested to see whether it makes sense...

Rob
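
To make the difference between the two behaviours described above a bit
more concrete, here is a minimal Python sketch. It is a toy model, not
how CP or the control unit actually schedules anything. The names
ChannelProgram, related and schedule are made up for illustration, and
"unrelated" is assumed to mean simply "the track ranges do not overlap".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelProgram:
    """Hypothetical stand-in for a queued I/O and the extent it touches."""
    name: str
    first_track: int
    last_track: int  # inclusive


def related(a: ChannelProgram, b: ChannelProgram) -> bool:
    """Assumption: two I/Os are related when their track ranges overlap."""
    return a.first_track <= b.last_track and b.first_track <= a.last_track


def schedule(queue, allow_overlap):
    """Group queued I/Os into waves; I/Os in one wave may run together.

    allow_overlap=False models strict serialization: one I/O per wave,
    in queue order. allow_overlap=True lets an I/O join the earliest
    wave that comes after every related I/O queued before it.
    """
    waves = []
    for io in queue:
        if not allow_overlap:
            waves.append([io])
            continue
        slot = 0
        for i, wave in enumerate(waves):
            if any(related(io, other) for other in wave):
                slot = i + 1  # must wait for the related I/O to finish
        if slot == len(waves):
            waves.append([])
        waves[slot].append(io)
    return [[io.name for io in wave] for wave in waves]


if __name__ == "__main__":
    queue = [
        ChannelProgram("A", 0, 9),
        ChannelProgram("B", 100, 109),
        ChannelProgram("C", 5, 20),     # overlaps A, so it stays behind A
        ChannelProgram("D", 200, 210),
    ]
    print(schedule(queue, allow_overlap=False))
    print(schedule(queue, allow_overlap=True))
```

In the strict case every I/O gets its own wave. In the relaxed case A,
B and D can go together and only C waits for A. Whether that buys
anything in practice is exactly the open question in the note above.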
