> If you run two bonnies you will see that your read performance gets better
> (well the sum of the read performance will be superior to that of one disk).
The question in that case is: is that because of a superior RAID-1 transfer rate,
or because 2 bonnies running on the same single disk would produce additional
seeks, lowering single-disk performance?
> RAID-1 will distribute reads to the two disks, but it's not a gain if you
> only read one consecutive piece of data.
Why ?
If I want to have a big block of data (say 1MB), couldn't the requests be
spread on both mirrored disks to effectively double transfer rate ?
> If the two disks should share the reading, they should seek all the time
> (disk A should skip the blocks disk B just read). That doesn't improve
> read performance.
Assume you have to read data that is spread over 100 cylinders on mirrored disks.
If you use 1 disk, you have to do an initial seek to the first cylinder and then
99 seeks to the other cylinders. That results in seek time O(100).
If you use 2 mirrored disks, you would have to do an initial seek on both
disks, but that could happen concurrently, as could all 99 other seeks. The
problem is that (with a low request size) you would read only half of the data
on each cylinder (one disk one half, the other disk the other half), while each
disk still has to rotate past the sectors it skips. So if the data blocks
are randomly positioned on tracks and cylinders, this would result in half the
transfer rate per disk * 2 disks = the same transfer rate as a single disk.
This seems to have happened in our tests.
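
To make the argument above concrete, here is a rough sketch in Python. It is
only a toy model, not a measurement: the function names and the parameters
seek_ms and cyl_read_ms (time for one full pass over the data on a cylinder)
are my own assumptions for illustration, and the numbers are made up.

```python
def read_time_single(cylinders, seek_ms, cyl_read_ms):
    """One disk: one seek per cylinder, then one full pass over its data."""
    return cylinders * (seek_ms + cyl_read_ms)


def read_time_mirrored_interleaved(cylinders, seek_ms, cyl_read_ms):
    """Two mirrored disks, small requests alternating between them.

    Both disks seek concurrently, so the seek cost per cylinder is unchanged.
    Each disk transfers only half of the blocks, but it still has to rotate
    past the blocks the other disk is reading, so the pass over the cylinder
    takes as long as reading all of it.
    """
    transfer_ms = cyl_read_ms / 2  # time spent actually transferring data
    skipped_ms = cyl_read_ms / 2   # time spent rotating past skipped blocks
    # The disks work in parallel, so elapsed time is the time of one disk.
    return cylinders * (seek_ms + transfer_ms + skipped_ms)


# Hypothetical numbers: 100 cylinders, 2 ms per seek, 8 ms per cylinder pass.
single = read_time_single(100, seek_ms=2.0, cyl_read_ms=8.0)
mirrored = read_time_mirrored_interleaved(100, seek_ms=2.0, cyl_read_ms=8.0)
print(single, mirrored)
```

In this model both cases take the same elapsed time, which is just the
"half transfer rate * 2" reasoning written down.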
IF one could spread requests so that they will match geometry, one could issue
requests track-wise:
disk A reads ONLY cyl x head 2*n
disk B reads ONLY cyl x head 2*n+1
Problem: physical disk geometry today is
- unknown to the software / SCSI controller
- not simple (more sectors on outer tracks than on inner tracks)
So I think this would be quite hard to fully exploit via software.
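
Just to illustrate the track-wise idea, a small Python sketch. It assumes
exactly what is NOT available in practice: a known, uniform geometry with a
fixed number of heads per cylinder. The names disk_for_track and split_tracks
are hypothetical.

```python
def disk_for_track(cylinder, head):
    """Return 0 for disk A or 1 for disk B: even heads to A, odd heads to B."""
    return head % 2


def split_tracks(cylinders, heads):
    """Build a per-disk list of (cylinder, head) tracks to read.

    Assumes a known, uniform geometry, which real disks do not expose.
    """
    plan = {0: [], 1: []}
    for cyl in range(cylinders):
        for head in range(heads):
            plan[disk_for_track(cyl, head)].append((cyl, head))
    return plan


# Hypothetical disk with 2 cylinders and 4 heads.
plan = split_tracks(cylinders=2, heads=4)
print(plan[0])
print(plan[1])
```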
But this means:
If one uses quite small requests spread over two mirrored disks, it doesn't get
faster, because "skipping" sectors lowers the data transfer rate.
But that should get better (although never optimal) if the request size is about
the physical track size, so that less sector skipping occurs (but more head
switches occur, and a switch from head 1 to 2 takes about the same time as one
from head 1 to 3). It should also be OK if the request size is about the
cylinder size, assuming that a seek to the next cylinder isn't much faster than
a seek that skips one cylinder.
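
The request-size reasoning can be put into a rough Python sketch. This is a
simplified model under my own assumptions, not a description of what the RAID
driver does: each disk alternately reads one chunk and skips one chunk, and
skipping costs either the rotation time over the chunk or a fixed
repositioning time (head switch or short seek), whichever is smaller. The
name mirrored_speedup and the parameters rate_kb_per_ms and reposition_ms are
hypothetical, and so are the numbers.

```python
def mirrored_speedup(chunk_kb, rate_kb_per_ms, reposition_ms):
    """Estimated sequential-read speedup of 2 mirrored disks over 1 disk.

    Model assumption: a disk reads one chunk, then skips the chunk that the
    other disk reads. Skipping costs the smaller of "rotate past it" and
    "reposition" (head switch or short seek).
    """
    read_ms = chunk_kb / rate_kb_per_ms
    skip_ms = min(read_ms, reposition_ms)
    # One disk alone needs 2 * read_ms for two chunks; the mirrored pair
    # delivers two chunks in read_ms + skip_ms.
    return 2 * read_ms / (read_ms + skip_ms)


# Hypothetical disk: 20 KB/ms media rate, 5 ms to reposition.
for chunk_kb in (4, 32, 128, 1024, 8192):
    print(chunk_kb, round(mirrored_speedup(chunk_kb, 20.0, 5.0), 2))
```

In this model the speedup stays at 1 as long as a chunk is read faster than
the disk can reposition, and only approaches (never reaches) 2 for chunks that
are large compared to the repositioning time.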
So how do I change the request size given to the disk subsystem? Is that the
same as the "chunk size" in the case of RAID-1?
If yes, I'm still wondering: I tried 4KB, 32KB and 128KB - no big difference.
Maybe I should get some physical geometry data of my disks and try to calculate
some reasonable value (as far as possible).
> However, if concurrent reads take place, you will see a performance gain
> from the read distribution.
Yes, because 2 hdds can seek twice as often as 1 can. But you don't need RAID-1
for that - just using 2 separate disks for different tasks already does that.
> Remember, an N disk RAID-0 is N times more prone to failure than one single
> drive. This gets ugly when you have many disks, especially because you'll
> likely lose the entire filesystem if a disk goes down.
Yes, I know ...
BTW: this is also something to think about with RAID-5: the more disks you use,
the more likely one (or even two) of them will fail. Making a single RAID-5 out
of many disks is of course MUCH safer than making a RAID-0 out of them,
but it is still not a good idea. One should only make medium disk-count RAID-5
arrays (a low count wastes too much space and also has low performance, a high
count is more likely to have single or double disk failures [and maybe has too
much performance for your SCSI channels ;-] ).
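
A small Python sketch to put numbers on this. It assumes that each disk fails
independently with the same probability p within some period and that nothing
gets replaced or rebuilt during that period. Both are simplifications, and the
function names and the value of p are made up for illustration.

```python
def p_raid0_loss(n, p):
    """Probability that an n-disk RAID-0 loses data: at least 1 disk fails."""
    return 1 - (1 - p) ** n


def p_raid5_loss(n, p):
    """Probability that an n-disk RAID-5 loses data: at least 2 disks fail."""
    none_fail = (1 - p) ** n
    exactly_one_fails = n * p * (1 - p) ** (n - 1)
    return 1 - none_fail - exactly_one_fails


# Hypothetical per-disk failure probability of 3% in the period considered.
p = 0.03
for n in (3, 5, 8, 16):
    print(n, round(p_raid0_loss(n, p), 4), round(p_raid5_loss(n, p), 4))
```

For small p the RAID-0 value is roughly n * p, which is the "N times more
prone to failure" rule of thumb from the quote. The RAID-5 value is much
smaller, but it also grows with the number of disks.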
Thomas
--
Thomas Waldmann (com_ma, Computer nach Masz)
email: [EMAIL PROTECTED] www: www.com-ma.de
Please be patient if sending me email, response may be slow.