On Thu, Jul 29, 1999 at 03:53:49AM -0700, Tim Moore wrote:
> > Can anyone explain why a software raid5 array of N disks has
> > significantly lower read performance than a raid0 array of N-1 disks?
>
> There's no parity calculation overhead in RAID0.
I don't see why there should be any parity calculation in raid5
for reads when there are no failures. Just read the data.
Unless, for some reason, md spends time verifying the parity,
but that seems unnecessary unless there is some reason to believe
things are messed up (e.g., you've had a failure).
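To illustrate the point, here is a toy sketch (in Python, emphatically not
the md driver's actual code) of the "algorithm 0" (left-asymmetric) layout
my array uses: every data chunk maps directly to exactly one disk, and a
non-degraded read never has to touch the parity chunk at all.

```python
# Toy model of RAID5 "algorithm 0" (left-asymmetric) chunk placement.
# Illustration only: a healthy read maps each logical chunk to one
# data disk; the parity disk for that stripe is simply skipped.

def raid5_map(chunk_index, ndisks):
    """Map a logical chunk to (stripe, data_disk, parity_disk)."""
    stripe = chunk_index // (ndisks - 1)        # ndisks-1 data chunks per stripe
    pos = chunk_index % (ndisks - 1)            # position within the stripe
    parity_disk = (ndisks - 1) - (stripe % ndisks)   # parity rotates backwards
    data_disk = pos if pos < parity_disk else pos + 1  # skip the parity disk
    return stripe, data_disk, parity_disk

if __name__ == "__main__":
    for chunk in range(16):
        stripe, data, parity = raid5_map(chunk, 8)
        print("chunk %2d -> stripe %d, disk %d (parity on disk %d)"
              % (chunk, stripe, data, parity))
```

Writes are a different story, of course: there the parity chunk must be
recomputed, which is where RAID5 pays its tax.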
> > I'm using 8 Seagate ST317242A drives in UDMA-66 mode, with 4 Promise
> > Ultra-66 cards, one drive per channel. This is with linux-2.2.10
> > + 2.2.10.uniform-ide-6.20.eridanus.patch from http://www.kernel.dk/ide,
> > + raid0145-19990724-2.2.10.bz2 and raidtools-19990724-0.90.tar.bz2.
>
> Would you mind publishing your U/66 init setup and motherboard
> hardware? I
> don't know that anyone's published a working U/66 on pre 2.3.x kernels.
There's not much more to it than the list of patches I mentioned.
You have to turn on PDC20246/PDC20262 support when configuring the
kernel, and for >2 Ultra66 boards you also need to enable the
"Special UDMA Feature", because of Promise BIOS limitations.
My raidtab looks like this:
raiddev /dev/md0
        raid-level              5
        nr-raid-disks           8
        nr-spare-disks          0
        persistent-superblock   1
        chunk-size              32
        device                  /dev/hde1
        raid-disk               0
        device                  /dev/hdg1
        raid-disk               1
        device                  /dev/hdi1
        raid-disk               2
        device                  /dev/hdk1
        raid-disk               3
        device                  /dev/hdm1
        raid-disk               4
        device                  /dev/hdo1
        raid-disk               5
        device                  /dev/hdq1
        raid-disk               6
        device                  /dev/hds1
        raid-disk               7
Here's my /proc/mdstat:
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 hds1[7] hdq1[6] hdo1[5] hdm1[4] hdk1[3] hdi1[2] hdg1[1] hde1[0]
117890752 blocks level 5, 32k chunk, algorithm 0 [8/8] [UUUUUUUU]
unused devices: <none>
And my df:
Filesystem 1k-blocks Used Available Use% Mounted on
/dev/hda2 1739751 1564182 85656 95% /
/dev/md0 117390352 2765284 108730532 2% /mnt/mnt
hdparm output (using hdparm-3.5i, but I didn't change anything):
/dev/hde:
multcount = 0 (off)
I/O support = 0 (default 16-bit)
unmaskirq = 0 (off)
using_dma = 1 (on)
keepsettings = 0 (off)
nowerr = 0 (off)
readonly = 0 (off)
readahead = 8 (on)
geometry = 33416/16/63, sectors = 33683328, start = 0
Model=ST317242A, FwRev=3.09, SerialNo=7BP0180T
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=33416/16/63, TrkSize=0, SectSize=0, ECCbytes=0
BuffType=0(?), BuffSize=512kB, MaxMultSect=16, MultSect=off
DblWordIO=no, maxPIO=2(fast), DMA=yes, maxDMA=2(fast)
CurCHS=33416/16/63, CurSects=33683328, LBA=yes, LBAsects=33683328
tDMA={min:120,rec:120}, DMA modes: mword0 mword1 mword2
IORDY=on/off, tPIO={min:240,w/IORDY:120}, PIO modes: mode3 mode4
UDMA modes: mode0 mode1 mode2 mode3 *mode4
Drive Supports : ATA-1 ATA-2 ATA-3 ATA-4
I played around a bit with setting things like multcount, unmaskirq, etc.,
but I don't recall that it made much difference.
The machine has a Tyan Tiger 100 mainboard and 512MB of PC-100 memory.
> > On a Pentium II/400 I get ~60MB/s reading a file with raid0 on 6 drives,
> > but <40MB/s with raid5 on 8 drives.
>
> Sustained? How are you measuring? The ST317242A's are rated at a
> fairly typical 8.5 MBytes/sec sustained. Your numbers are pretty good.
Sustained, reading a large file (1 or 2-epsilon GB). I get similar
numbers whether I do it with dd or with Bonnie. One problem I'm having with
measurements is a lack of repeatability. I see about 10% variation from
run to run. For the ST317242A, Seagate simply claims >8.5MB/s.
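For what it's worth, the dd measurement is easy to reproduce; the sketch
below does the equivalent in Python (the path is a placeholder for any
large file on the array). Note that the file has to be much larger than
RAM, or you're just measuring the page cache.

```python
# Rough sustained-read benchmark, equivalent to timing
# "dd if=FILE of=/dev/null bs=1024k".  The path is a placeholder.
import time

def read_throughput(path, bufsize=1024 * 1024):
    """Read the whole file sequentially; return MB/s."""
    total = 0
    start = time.time()
    with open(path, "rb") as f:
        while True:
            buf = f.read(bufsize)
            if not buf:
                break
            total += len(buf)
    elapsed = max(time.time() - start, 1e-9)  # guard against a zero interval
    return total / (1024 * 1024) / elapsed

if __name__ == "__main__":
    print("%.1f MB/s" % read_throughput("/mnt/mnt/bigfile"))
```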
One thing I have observed in comparing various recent cheap Seagate ATA
drives against fancy SCSI drives (Seagate Cheetah 9LP and IBM Ultrastar
18ZX) is greater variation in sustained transfer rate from the inner
to the outer cylinders. I've measured >14MB/s sustained at the fast end
of the ST317242A; the slow end is much closer to the claimed 8.5MB/s.
On the fancy SCSI drives I see much less variation, roughly 12-15MB/s.
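The zone effect is easy to see by timing raw reads at the two ends of a
drive. A sketch (device name is just an example; any seekable file or
block device works, with appropriate permissions):

```python
# Compare sustained read rate at two offsets of a device or file,
# e.g. the outer (fast) vs. inner (slow) zones of a disk.
# "/dev/hde" below is an example device name, not a requirement.
import os, time

def rate_at(path, offset, nbytes=64 * 1024 * 1024, bufsize=1024 * 1024):
    """Read nbytes starting at byte offset; return MB/s."""
    got = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        start = time.time()
        while got < nbytes:
            buf = os.read(fd, min(bufsize, nbytes - got))
            if not buf:           # hit end of device/file
                break
            got += len(buf)
        elapsed = max(time.time() - start, 1e-9)
    finally:
        os.close(fd)
    return got / (1024 * 1024) / elapsed

if __name__ == "__main__":
    size = 33683328 * 512                     # LBAsects * 512, per hdparm -i
    print("outer: %.1f MB/s" % rate_at("/dev/hde", 0))
    print("inner: %.1f MB/s" % rate_at("/dev/hde", size - 64 * 1024 * 1024))
```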
Jan Edler
NEC Research Institute