Tom Livingston <[EMAIL PROTECTED]> writes:
> Last fall I set up a big file server as a personal project.  It was
> surprisingly easy for me to get going, but in the process I learned that
> linux-raid (maybe raid in general?) serializes its access to the disks...
> so if you have a five drive RAID5 set, you would read in raid-order instead
> of simultaneously reading from all the disks at once.  So, ok, it's still

Well, yes. If they're on the same bus, you have to serialize them
somehow, since they can't be talking over each other. 

Now, with SCSI and tagged command queuing, you can still issue
multiple commands for each of the drives and have them perform the I/O 
as fast as they can. The drives are working in parallel, but
the data into or out of the memory still has to be sent serially over
the bus. 
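The same idea - keeping several requests in flight so the devices can
overlap their physical work while the transfers are serialized - can be
illustrated at the application level. This is only an analogy, not TCQ
itself; the file names and sizes below are made up for the sketch:

```python
# Sketch: issue several reads at once instead of one after another,
# analogous to how tagged command queuing keeps each drive busy on its
# own request while the bus serializes the actual data transfers.
import concurrent.futures
import os
import tempfile

def read_file(path):
    # Each worker issues its own read; the kernel and drives can
    # overlap the physical I/O behind these calls.
    with open(path, "rb") as f:
        return f.read()

# Scratch files standing in for the stripe members of the array.
tmpdir = tempfile.mkdtemp()
paths = []
for i in range(5):
    p = os.path.join(tmpdir, "member%d.dat" % i)
    with open(p, "wb") as f:
        f.write(bytes([i]) * 4096)
    paths.append(p)

# All five reads are outstanding at once rather than strictly in order.
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(read_file, paths))

print([len(r) for r in results])  # → [4096, 4096, 4096, 4096, 4096]
```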

Even if you split the drives to multiple buses, the I/O _still_ needs
to be fundamentally serialized because there's only one memory bus and 
only one device can be using it at once. 

Since the buses are much faster than the disk spindles, you get to
send stuff into the drive buffers and have the physical drive access
overlap, though. And even when a disk isn't transferring stuff, it's
still working, because there are all the seeks etc to be taken into
account as well.

> I have an ide based raid system, and visually it appears to be true.  10
> drives stacked in a case with external access lights show the first drive
> set being hit, then the second and so on.  It zips down the case fast, but
> it's noticeably going one way, and not all at once or randomly.

Pretty common with RAID, in fact the pattern I would expect. On the
other hand, I have a system with 5 Ultra2 SCSI disks in a RAID-5, and
the normal access pattern is that all of the drives light up almost
simultaneously every second or so. That, to me, seems like more of a
waste - the bus traffic comes in bursts, whereas if it were continuous
the peak load for the same amount of bandwidth wouldn't be as high. Still,
since the RAID system does issue commands to all drives in parallel,
and they all have a command queue that can hold multiple I/O requests
at once, I have to trust that the access settles into the most
efficient pattern. The stress-test bandwidth I see from the array
certainly shows more than 3 times the capacity of a single drive, so
obviously it's working.

> SCSI also seems to suffer from this fate, as people report benchmarks
> that don't show anything close to the improvement one might expect from
> disks being read in parallel.

That's more an issue of bad configuration. You have to analyse what
you're going to use the array for, and configure it accordingly. A
RAID-5 setup with large chunk size will give you high bandwidth for a
single stream (this would be good for a video editing workstation, for 
example), while a RAID-0+1 setup would not show as high bandwidth for
a single file, but would allow more efficient access to many files at
once (such as in a large mail, news or web server).
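With Linux software RAID the chunk size is picked when the array is
created. A hedged sketch with mdadm (device names and chunk sizes here
are illustrative assumptions, not from the original post; mdadm's
level 10 is the close equivalent of a 0+1 layout):

```shell
# Illustrative only: /dev/md0, /dev/md1 and the device names are assumed.
# Large chunks favour high bandwidth for a single sequential stream,
# e.g. the video-editing RAID-5 case:
mdadm --create /dev/md0 --level=5 --raid-devices=5 --chunk=256 \
      /dev/sd[bcdef]1

# Smaller chunks over a striped-mirror (RAID-10) layout favour many
# concurrent small accesses, e.g. a mail, news or web server:
mdadm --create /dev/md1 --level=10 --raid-devices=4 --chunk=64 \
      /dev/sd[ghij]1
```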

-- 
Osma Ahvenlampi
