On Fri, Jul 30, 1999 at 02:01:19AM -0700, Tim Moore wrote:
> Take any drive and divide it up evenly into 6-8 partitions.  Perform a
> simple dd(1) or hdparm(8) read test from each and watch the sustained
> speed drop by about 20% from the outside to the inside most partition.
> 
> It's a 5400 RPM drive and will not deliver more than ~10MB/s.

Rotational speed is not the sole determining factor.
If the linear density per track goes up while the rotational speed
stays the same, the sustained throughput goes up too.
This is why I think today's drives don't really need UDMA-66,
but tomorrow's drives will.

Here are some raid0 examples from my test system.
In summary, there's a fair amount of measurement variation, but I see
something like 9-15MB/s on a single drive, depending on which end
of the drive is being accessed.  I get similar results from dd
and from hdparm -t.  When I put raid0 on 8 of these drives,
I get 28-35MB/s accessing /dev/md*, depending on whether I stripe
the fast or slow end of the devices.  But when I access a file
on a mounted filesystem on those same md* devices, I get 38MB/s to 60MB/s.

Some possible questions:
1) Why do I get about the same performance with 8 drives as with 6?
   There is no evidence of CPU saturation.  I changed from a p2/400
   to a p3/450 (12.5% clock increase), and got maybe 3% improvement.
   One hypothesis is that the overhead of copying from kernel to user
   dominates.  I will try to find out (e.g., try the raw io patches
   or other hacks).
2) Why does reading a file go so much faster than reading /dev/md*?
   The same thing does not seem true for /dev/hd*.
   I still guess it's some difference in readahead, but I have not
   verified that.
3) Why is raid0 on N disks faster than raid5 on N+1 disks for reading,
   when there are no failures?  This is the question I asked before,
   and still seems to be open.  I wonder if the answer is related to
   my question 2.

Anyway, here is the boring stuff.
  
# ./hdparm -i /dev/hde

/dev/hde:

 Model=ST317242A, FwRev=3.09, SerialNo=7BP0180T
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
 RawCHS=33416/16/63, TrkSize=0, SectSize=0, ECCbytes=0
 BuffType=0(?), BuffSize=512kB, MaxMultSect=16, MultSect=off
 DblWordIO=no, maxPIO=2(fast), DMA=yes, maxDMA=2(fast)
 CurCHS=33416/16/63, CurSects=33683328, LBA=yes, LBAsects=33683328
 tDMA={min:120,rec:120}, DMA modes: mword0 mword1 mword2 
 IORDY=on/off, tPIO={min:240,w/IORDY:120}, PIO modes: mode3 mode4 
 UDMA modes: mode0 mode1 mode2 mode3 *mode4 
 Drive Supports : ATA-1 ATA-2 ATA-3 ATA-4 

# fdisk -l /dev/hde

Disk /dev/hde: 16 heads, 63 sectors, 33416 cylinders
Units = cylinders of 1008 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hde1             1      2081   1048792+  83  Linux
/dev/hde2          2082     31335  14744016   83  Linux
/dev/hde3         31336     33416   1048824   83  Linux
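As a sanity check on the fdisk output, the Blocks column is just
cylinders times 1008 sectors, halved to get 1K blocks (hde1 also loses
the first 63-sector track to the partition table, hence its trailing '+'):

```shell
# cyl_blocks START END: 1K blocks spanned by a cylinder range,
# per the "Units = cylinders of 1008 * 512 bytes" line above.
cyl_blocks () {
    echo $(( ($2 - $1 + 1) * 1008 / 2 ))
}
cyl_blocks 2082 31335   # hde2: prints 14744016
cyl_blocks 31336 33416  # hde3: prints 1048824
```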
# time -f%e dd bs=32k count=32k of=/dev/null if=/dev/hde1
32768+0 records in
32768+0 records out
66.50
# echo scale=2\; 1024 / 66.5 | bc
15.39
# time -f%e dd bs=32k count=32k of=/dev/null if=/dev/hde3
32768+0 records in
32768+0 records out
111.42
# echo scale=2\; 1024 / 111.42 | bc
9.19
# ./hdparm -tT /dev/hde[13]

/dev/hde1:
 Timing buffer-cache reads:   128 MB in  1.12 seconds =114.29 MB/sec
 Timing buffered disk reads:  64 MB in  4.21 seconds =15.20 MB/sec

/dev/hde3:
 Timing buffer-cache reads:   128 MB in  1.00 seconds =128.00 MB/sec
 Timing buffered disk reads:  64 MB in  6.85 seconds = 9.34 MB/sec


Ok, so all that establishes the basic speed of a single disk, reading
both inner and outer cylinders, and it shows that hdparm -t reports
similar numbers to dd.


# cat /proc/mdstat
Personalities : [raid0]
read_ahead 1024 sectors
md2 : active raid0 hds3[7] hdq3[6] hdo3[5] hdm3[4] hdk3[3] hdi3[2] hdg3[1] hde3[0] 
8389632 blocks 32k chunks
md1 : active raid0 hds2[7] hdq2[6] hdo2[5] hdm2[4] hdk2[3] hdi2[2] hdg2[1] hde2[0] 
117951488 blocks 32k chunks
md0 : active raid0 hds1[7] hdq1[6] hdo1[5] hdm1[4] hdk1[3] hdi1[2] hdg1[1] hde1[0] 
8389632 blocks 32k chunks
unused devices: <none>
# time -f%e dd bs=32k count=32k of=/dev/null if=/dev/md0
32768+0 records in
32768+0 records out
31.07
# echo scale=2\; 1024 / 31.07 | bc
32.95
# time -f%e dd bs=32k count=32k of=/dev/null if=/dev/md2
32768+0 records in
32768+0 records out
36.16
# echo scale=2\; 1024 / 36.16 | bc
28.31
# ./hdparm -t /dev/md[02]

/dev/md0:
 Timing buffered disk reads:  64 MB in  1.83 seconds =34.97 MB/sec

/dev/md2:
 Timing buffered disk reads:  64 MB in  2.26 seconds =28.32 MB/sec



So all that establishes that md gives me only 28-35MB/s when reading
directly.  Now, let's look at file performance.



# mount
/dev/hda2 on / type ext2 (rw)
none on /proc type proc (rw)
none on /dev/pts type devpts (rw,mode=0622)
/dev/md0 on /mnt/mnt type ext2 (rw)
/dev/md2 on /mnt/mnt2 type ext2 (rw)
# time -f%e dd bs=32k count=32k of=/dev/null if=/mnt/mnt/junkusr.tar
32768+0 records in
32768+0 records out
20.70
# echo scale=2\; 1024 / 20.70 | bc
49.46
# time -f%e dd bs=32k count=32k of=/dev/null if=/mnt/mnt2/junkusr.tar
32768+0 records in
32768+0 records out
17.12
# echo scale=2\; 1024 / 17.12 | bc
59.81




That shows that file performance is much higher, but also, curiously,
that the filesystem on /dev/md2 is the faster one, even though it sits
on the slower inner cylinders, as the /dev/hde[13] tests above
indicate.  Strange.

Jan Edler
NEC Research Institute
