Hello,

I did some measurements on the impact that unaligned partitions/slices
have on the "new" hard drives that use 4kB sectors on disk and export
them as 512B sectors. [1]
My tests were done on a Western Digital WD10EARS. [2]


CONCLUSION:
Having unaligned partitions/slices on those disks leads to a noticeable
performance penalty under real-world workloads.


IMPLICATIONS:
1. The rounding of unit sizes to cylinder boundaries by disklabel has
   to be evaluated.
2. A FAQ entry for the "advanced format" disks is needed to tell people
   to set the XP jumper. (more on that later)
   If disklabel is not modified, that entry would also have to explain
   the alignment implications and "how to use a calculator".


TEST RESULTS:

- sequential write/read speeds ---

  dd bs     | aligned   | unaligned | wd10eads*
            |           |           |
   4k write |  97433116 |  86349673 |  80762241  (bytes/sec)
  64k w     | 101273894 |  85616298 |  81234814
   1m w     |  98291974 |  79201231 |  83113302
            |           |           |
   4k read  | 103706513 | 104434701 |  82723667
  64k r     | 105136468 | 104453140 |  85552816
   1m r     | 104228605 | 104921901 |  85650289

  (* wd10eads is the predecessor of the wd10ears, with 32MB of
  cache and the usual 512B on-disk sectors. That disk is in a different
  system! That system is not idle, so actual numbers might be higher.)
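
To put the sequential numbers into relative terms, here is a small Python
sketch that computes how much slower the unaligned runs are. The figures
are copied from the table above; the function name penalty_percent is
just an illustrative choice.

```python
# Throughput in bytes/sec, copied from the table above: (aligned, unaligned).
writes = {
    "4k": (97433116, 86349673),
    "64k": (101273894, 85616298),
    "1m": (98291974, 79201231),
}
reads = {
    "4k": (103706513, 104434701),
    "64k": (105136468, 104453140),
    "1m": (104228605, 104921901),
}


def penalty_percent(aligned, unaligned):
    """Percentage by which the unaligned run is slower than the aligned one."""
    return (1.0 - unaligned / aligned) * 100.0


for label, table in (("write", writes), ("read", reads)):
    for bs, (aligned, unaligned) in table.items():
        print(f"{bs:>3} {label:<5}: {penalty_percent(aligned, unaligned):6.2f}% slower unaligned")
```

Sequential writes come out roughly 11% to 19% slower when unaligned, while
the reads stay within about 1% of each other.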


- extracting a source tree ---

  aligned   :  6m26.31s
  unaligned : 14m30.30s


- build kernel / make obj / make build ---

             | aligned   | unaligned
  kernel     |  2m27.94s |  2m48.12s
  make obj   |  0m28.51s |  1m01.41s
  make build | 36m07.27s | 70m51.58s
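
For the extraction and build timings, the slowdown factors can be worked
out with a short Python sketch like the one below. The times are copied
from the two tables above; to_seconds and slowdown are hypothetical helper
names, and the parser only handles the "NmSS.SSs" form that time prints here.

```python
import re


def to_seconds(t):
    """Convert a time string such as '36m07.27s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", t)
    if m is None:
        raise ValueError(f"unexpected time format: {t!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


# (aligned, unaligned) wall-clock times from the tables above.
timings = {
    "tar xzf": ("6m26.31s", "14m30.30s"),
    "kernel": ("2m27.94s", "2m48.12s"),
    "make obj": ("0m28.51s", "1m01.41s"),
    "make build": ("36m07.27s", "70m51.58s"),
}


def slowdown(aligned, unaligned):
    """Factor by which the unaligned run takes longer than the aligned one."""
    return to_seconds(unaligned) / to_seconds(aligned)


for name, (aligned, unaligned) in timings.items():
    print(f"{name:<10} {slowdown(aligned, unaligned):.2f}x")
```

The kernel build is only about 1.14x slower, while the extraction,
make obj and make build all take roughly twice as long (about 2.25x,
2.15x and 1.96x).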




EXPLANATIONS (or whatever :):

Those numbers are kinda scary.
Going by the earlier sequential rw tests I sent to m...@, I would not
have expected such bad results for the builds.

(Just to make it clear: if the partitions/slices are not aligned,
 the disk has to read every 4k sector it is only partially overwriting
 before it can actually write it. The 64MB of cache helps to alleviate
 that up to a point.)
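
As a rough illustration of that read-before-write effect, the Python
sketch below counts how many physical 4k sectors a write only partially
covers. This is a simplified model, not the drive's actual firmware logic:
it assumes 8 logical 512B sectors per physical sector and that physical
sector boundaries sit at multiples of 8. The function name is hypothetical.

```python
LOGICAL_PER_PHYSICAL = 8  # 4096 / 512


def partial_physical_sectors(start_lba, count):
    """Number of physical 4k sectors that a write of `count` logical
    512B sectors starting at `start_lba` covers only partially.
    Each of those needs a read before it can be written back."""
    end_lba = start_lba + count  # exclusive
    first = start_lba // LOGICAL_PER_PHYSICAL
    last = (end_lba - 1) // LOGICAL_PER_PHYSICAL
    partial = 0
    for phys in range(first, last + 1):
        phys_start = phys * LOGICAL_PER_PHYSICAL
        phys_end = phys_start + LOGICAL_PER_PHYSICAL
        if start_lba > phys_start or end_lba < phys_end:
            partial += 1
    return partial


# A 4k write (8 logical sectors) at the unaligned and the aligned position.
print(partial_physical_sectors(63, 8))
print(partial_physical_sectors(64, 8))
```

In this model a 4k write starting at sector 63 touches two physical
sectors and fully covers neither of them, while the same write starting
at sector 64 covers exactly one physical sector and needs no read at all.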

This drive has an "XP legacy jumper". (Same as WD15EARS and WD20EARS.)
It is intended for Windows XP systems with a single partition
over the whole drive.
XP uses the same 63 sector offset as OpenBSD does.
Setting this jumper transparently aligns things so that the 63 sectors
end right in front of a 4k sector boundary.
When that jumper is set, slices inside the partition only have to be
a multiple of 8 sectors in size.
The issue is with disklabel rounding down to the nearest cylinder
boundary.
This messes up the nice multiplication by 1024, which would otherwise
lead to a size divisible by 8.
The rounding down is always done when using units, but not when
requesting a size without a unit, i.e. in sectors.
So slices can be aligned that way "by hand".
That rounding to cylinders is not needed, afaik.
So without it, a simple "rtfaq! set the damn jumper!" would be
enough to get the best performance out of such hard disks.
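
The rounding problem can be shown with a bit of arithmetic. The Python
sketch below assumes a simplified model in which a size given with a unit
is simply rounded down to a whole number of cylinders; the real disklabel
code may differ in detail. The 16065 sectors/cylinder value comes from the
disklabel output further down, and the 10 gigabyte request is just an
example value.

```python
SECTORS_PER_CYLINDER = 16065  # 255 tracks * 63 sectors, from the disklabel output
SECTORS_PER_4K = 8            # 4096 / 512


def sectors_from_gigabytes(gb):
    """Size in 512B sectors for a request given in gigabytes (1024-based)."""
    return gb * 1024 * 1024 * 1024 // 512


def round_down_to_cylinder(sectors):
    """Assumed rounding model: drop everything past the last full cylinder."""
    return sectors - sectors % SECTORS_PER_CYLINDER


requested = sectors_from_gigabytes(10)
rounded = round_down_to_cylinder(requested)

print(requested, requested % SECTORS_PER_4K)  # size as requested, remainder mod 8
print(rounded, rounded % SECTORS_PER_4K)      # size after rounding, remainder mod 8
```

Since 16065 leaves a remainder of 1 when divided by 8, a cylinder-rounded
size is a multiple of 8 only when the cylinder count happens to be one.
In the example, 20971520 sectors (divisible by 8) become 20964825 sectors
(remainder 1), so the next slice would start unaligned.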


Below you can find more info about my test setup and the test outputs.


Cheers,

- Robert



[1] http://www.wdc.com/advformat
[2] http://www.wdc.com/en/products/products.asp?driveid=763



TESTS:

"aligned"   == XP jumper set
"unaligned" == XP jumper NOT set*
  (* without the jumper,
     the partition/slices are off by one 512B sector.)
  

I installed a snapshot I had on hand (see dmesg) and went from there.
  (Fresh install without the jumper.)
The source tree used is -current from some hours ago.
I sync'ed before every test.
Disk layout and ramdisk were the same in both scenarios.


- dmesg ---
OpenBSD 4.6-current (GENERIC.MP) #40: Tue Dec 29 01:02:20 MST 2009
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 3488088064 (3326MB)
avail mem = 3388391424 (3231MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.5 @ 0xf0730 (61 entries)
bios0: vendor American Megatrends Inc. version "1104" date 09/11/2009
bios0: ASUSTeK Computer INC. P5QL-E
acpi0 at bios0: rev 2
acpi0: tables DSDT FACP APIC MCFG OEMB HPET OSFR
acpi0: wakeup devices P0P2(S4) P0P3(S4) P0P1(S4) UAR1(S4) PS2K(S4) PS2M(S4) EUSB(S4) USBE(S4) P0P5(S4) P0P6(S4) P0P7(S4) P0P8(S4) P0P9(S4) GBEC(S4) USB0(S4) USB1(S4) USB2(S4) USB3(S4) USB4(S4) USB5(S4) USB6(S4) P0P4(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Pentium(R) Dual-Core CPU E5200 @ 2.50GHz, 3325.54 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,EST,TM2,CX16,xTPR,NXE,LONG
cpu0: 2MB 64b/line 8-way L2 cache
cpu0: apic clock running at 266MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Pentium(R) Dual-Core CPU E5200 @ 2.50GHz, 3325.06 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,EST,TM2,CX16,xTPR,NXE,LONG
cpu1: 2MB 64b/line 8-way L2 cache
ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 20, 24 pins
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (P0P2)
acpiprt2 at acpi0: bus -1 (P0P3)
acpiprt3 at acpi0: bus 5 (P0P1)
acpiprt4 at acpi0: bus -1 (P0P5)
acpiprt5 at acpi0: bus 3 (P0P8)
acpiprt6 at acpi0: bus 2 (P0P9)
acpiprt7 at acpi0: bus 4 (P0P4)
acpicpu0 at acpi0
acpicpu1 at acpi0
aibs at acpi0 not configured
acpibtn0 at acpi0: PWRB
cpu0: unknown Enhanced SpeedStep CPU, msr 0x061a4c1f06004c1f
cpu0: using only highest and lowest power states
cpu0: Enhanced SpeedStep 3325 MHz: speeds: 15200, 1200 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel G45 Host" rev 0x02
ppb0 at pci0 dev 1 function 0 "Intel G45 PCIE" rev 0x02: apic 2 int 16 (irq 10)
pci1 at ppb0 bus 1
vga1 at pci1 dev 0 function 0 "ATI Radeon HD 4850" rev 0x00
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
azalia0 at pci1 dev 0 function 1 "ATI Radeon HD 48xx HD Audio" rev 0x00: apic 2 int 17 (irq 11)
azalia0: no supported codecs
azalia0: initialization failure, detaching
uhci0 at pci0 dev 26 function 0 "Intel 82801JI USB" rev 0x00: apic 2 int 16 (irq 10)
uhci1 at pci0 dev 26 function 1 "Intel 82801JI USB" rev 0x00: apic 2 int 21 (irq 14)
uhci2 at pci0 dev 26 function 2 "Intel 82801JI USB" rev 0x00: apic 2 int 18 (irq 15)
ehci0 at pci0 dev 26 function 7 "Intel 82801JI USB" rev 0x00: apic 2 int 18 (irq 15)
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
ppb1 at pci0 dev 28 function 0 "Intel 82801JI PCIE" rev 0x00: apic 2 int 17 (irq 11)
pci2 at ppb1 bus 4
ppb2 at pci0 dev 28 function 4 "Intel 82801JI PCIE" rev 0x00: apic 2 int 17 (irq 11)
pci3 at ppb2 bus 3
jmb0 at pci3 dev 0 function 0 "JMicron JMB363 IDE/SATA" rev 0x03
ahci0 at jmb0: apic 2 int 16 (irq 10), AHCI 1.0
scsibus0 at ahci0: 32 targets
pciide0 at jmb0: DMA, channel 0 wired to native-PCI, channel 1 wired to native-PCI
pciide0: using apic 2 int 16 (irq 10) for native-PCI interrupt
atapiscsi0 at pciide0 channel 0 drive 0
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0: <TOSHIBA, DVD-ROM SD-M1712, J004> ATAPI 5/cdrom removable
atapiscsi1 at pciide0 channel 0 drive 1
scsibus2 at atapiscsi1: 2 targets
cd1 at scsibus2 targ 0 lun 0: <_NEC, DVD_RW ND-3500AG, 2.18> ATAPI 5/cdrom removable
cd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2
cd1(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 2
pciide0: channel 1 disabled (no drives)
ppb3 at pci0 dev 28 function 5 "Intel 82801JI PCIE" rev 0x00: apic 2 int 16 (irq 10)
pci4 at ppb3 bus 2
ale0 at pci4 dev 0 function 0 "Attansic Technology L1E" rev 0xb0: AR8121, apic 2 int 17 (irq 11), address 00:22:15:00:12:34
atphy0 at ale0 phy 0: F1 10/100/1000 PHY, rev. 9
uhci3 at pci0 dev 29 function 0 "Intel 82801JI USB" rev 0x00: apic 2 int 23 (irq 3)
uhci4 at pci0 dev 29 function 1 "Intel 82801JI USB" rev 0x00: apic 2 int 19 (irq 5)
uhci5 at pci0 dev 29 function 2 "Intel 82801JI USB" rev 0x00: apic 2 int 18 (irq 15)
ehci1 at pci0 dev 29 function 7 "Intel 82801JI USB" rev 0x00: apic 2 int 23 (irq 3)
usb1 at ehci1: USB revision 2.0
uhub1 at usb1 "Intel EHCI root hub" rev 2.00/1.00 addr 1
ppb4 at pci0 dev 30 function 0 "Intel 82801BA Hub-to-PCI" rev 0x90
pci5 at ppb4 bus 5
"Creative Labs SoundBlaster Audigy LS" rev 0x00 at pci5 dev 1 function 0 not configured
"AT&T/Lucent FW322 1394" rev 0x70 at pci5 dev 3 function 0 not configured
pcib0 at pci0 dev 31 function 0 "Intel 82801JIR LPC" rev 0x00
ahci1 at pci0 dev 31 function 2 "Intel 82801JI AHCI" rev 0x00: apic 2 int 19 (irq 5), AHCI 1.2
scsibus3 at ahci1: 32 targets
sd0 at scsibus3 targ 0 lun 0: <ATA, WDC WD10EARS-00Y, 80.0> SCSI3 0/direct fixed
sd0: 953869MB, 512 bytes/sec, 1953525168 sec total
ichiic0 at pci0 dev 31 function 3 "Intel 82801JI SMBus" rev 0x00: apic 2 int 18 (irq 15)
iic0 at ichiic0
iic0: addr 0x1e 01=01 02=01 10=0f 11=01 12=01 13=0f 20=05 21=01 22=01 23=05 31=01 32=01 words 00=0001 01=0101 02=0100 03=0000 04=0000 05=0000 06=0000 07=0000
iic0: addr 0x20 01=80 02=17 03=7f 10=00 19=b0 20=20 21=00 25=20 26=b2 38=74 39=03 4a=64 6a=2c 78=02 79=08 7a=00 7b=00 7e=82 80=00 8b=31 8c=bb 96=8d 99=41 9a=98 9b=01 d0=00 d1=03 d2=72 d3=72 d4=03 d5=02 d6=01 d7=9b d8=6b d9=00 da=00 db=00 dc=00 dd=00 de=00 df=00 e0=00 e1=00 e2=10 e3=10 e4=10 e5=10 e6=10 e7=10 e8=10 e9=10 ea=10 ec=07 ee=00 f1=08 f5=02 f6=02 f9=00 fa=00 fb=50 words 00=ffff 01=8037 02=1766 03=7fff 04=ffff 05=ffff 06=ffff 07=ffff
spdmem0 at iic0 addr 0x50: 2GB DDR2 SDRAM non-parity PC2-6400CL5
spdmem1 at iic0 addr 0x52: 2GB DDR2 SDRAM non-parity PC2-6400CL5
usb2 at uhci0: USB revision 1.0
uhub2 at usb2 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb3 at uhci1: USB revision 1.0
uhub3 at usb3 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb4 at uhci2: USB revision 1.0
uhub4 at usb4 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb5 at uhci3: USB revision 1.0
uhub5 at usb5 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb6 at uhci4: USB revision 1.0
uhub6 at usb6 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb7 at uhci5: USB revision 1.0
uhub7 at usb7 "Intel UHCI root hub" rev 1.00/1.00 addr 1
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: <PC speaker>
spkr0 at pcppi0
lm0 at isa0 port 0x290/8: W83627DHG
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
mtrr: Pentium Pro MTRR support
uhidev0 at uhub7 port 1 configuration 1 interface 0 "Razer Razer Copperhead Laser Mouse" rev 1.10/21.00 addr 2
uhidev0: iclass 3/0
ums0 at uhidev0: 7 buttons, Z dir
wsmouse0 at ums0 mux 0
uhidev1 at uhub7 port 1 configuration 1 interface 1 "Razer Razer Copperhead Laser Mouse" rev 1.10/21.00 addr 2
uhidev1: iclass 3/1
ukbd0 at uhidev1: 8 modifier keys, 6 key codes
wskbd1 at ukbd0 mux 1
wskbd1: connecting to wsdisplay0
vscsi0 at root
scsibus4 at vscsi0: 256 targets
softraid0 at root
root on sd0a swap on sd0b dump on sd0b

(( fwiw, jacob: that azalia is just disabled.
   the iic0 should be an asus ai booster.
   still have to modify the live driver for that soundblaster 5.1 vx. ))


- fdisk ---
Disk: sd0       geometry: 121601/255/63 [1953525168 Sectors]
Offset: 0       Signature: 0xAA55
            Starting         Ending         LBA Info:
 #: id      C   H   S -      C   H   S [       start:        size ]
-------------------------------------------------------------------------------
 0: 00      0   0   0 -      0   0   0 [           0:           0 ] unused      
 1: 00      0   0   0 -      0   0   0 [           0:           0 ] unused      
 2: 00      0   0   0 -      0   0   0 [           0:           0 ] unused      
*3: A6      0   1   1 - 121600 254  63 [          63:  1953520002 ] OpenBSD


- disklabel ---
# /dev/rsd0c:
type: SCSI
disk: SCSI disk
label: WDC WD10EARS-00Y
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 121601
total sectors: 1953525168
rpm: 3600
interleave: 1
boundstart: 63
boundend: 1953520065
drivedata: 0 

16 partitions:
#                size           offset  fstype [fsize bsize  cpg]
  a:         20000000               63  4.2BSD   2048 16384    1 # /
  b:          2000000         20000063    swap                   
  c:       1953525168                0  unused                   
  d:         20000000         22000063  4.2BSD   2048 16384    1 # /usr
  e:         20000000         42000063  4.2BSD   2048 16384    1 # /usr/obj
  f:         20000000         62000063  4.2BSD   2048 16384    1 # /usr/src
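
The layout above can be checked for 4k alignment with a few lines of
Python. The sketch assumes that the jumper makes every logical sector land
one 512B sector later on the platter, which is how the "63 sectors in
front of a 4k boundary" behaviour is modelled here; the function name
misalignment is hypothetical.

```python
# (name, size, offset) in 512B sectors, copied from the disklabel output above.
partitions = [
    ("a", 20000000, 63),
    ("b", 2000000, 20000063),
    ("d", 20000000, 22000063),
    ("e", 20000000, 42000063),
    ("f", 20000000, 62000063),
]


def misalignment(offset, jumper_set):
    """Logical sectors past the previous 4k boundary (0 means aligned).
    Assumption: with the jumper set, logical sector n sits at position n + 1."""
    position = offset + 1 if jumper_set else offset
    return position % 8


for name, size, offset in partitions:
    print(name, size % 8, misalignment(offset, False), misalignment(offset, True))
```

All sizes are multiples of 8, so every slice start shares the remainder
of the 63 sector offset: 7 without the jumper (one sector short of the
next boundary) and 0 with it.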


- ramdisk ---
# fgrep ramdisk /etc/fstab
swap /ramdisk mfs rw,nodev,nosuid,-s=2000000 0 0



### aligned partition/slices ###

- sequential write/read ---
# dd if=/dev/zero of=/tmp/testfile.4k bs=4k count=524288
524288+0 records in
524288+0 records out
2147483648 bytes transferred in 22.040 secs (97433116 bytes/sec)
# dd if=/dev/zero of=/tmp/testfile.64k bs=64k count=32768
32768+0 records in
32768+0 records out
2147483648 bytes transferred in 21.204 secs (101273894 bytes/sec)
# dd if=/dev/zero of=/tmp/testfile.1m bs=1m count=2048
2048+0 records in
2048+0 records out
2147483648 bytes transferred in 21.848 secs (98291974 bytes/sec)
# dd of=/dev/null if=/tmp/testfile.4k bs=4k count=524288
524288+0 records in
524288+0 records out
2147483648 bytes transferred in 20.707 secs (103706513 bytes/sec)
# dd of=/dev/null if=/tmp/testfile.64k bs=64k count=32768
32768+0 records in
32768+0 records out
2147483648 bytes transferred in 20.425 secs (105136468 bytes/sec)
# dd of=/dev/null if=/tmp/testfile.1m bs=1m count=2048
2048+0 records in
2048+0 records out
2147483648 bytes transferred in 20.603 secs (104228605 bytes/sec)

- extract source tarball ---
# ls -l /ramdisk
total 298752
-rw-r--r--  1 root  wheel  152866732 Jan  6 19:19 src.tgz
# cd /usr/src
# time tar xzf /ramdisk/src.tgz
    6m26.31s real     0m3.72s user     0m7.49s system

- build kernel / make obj / make build ---
# cd /usr/src/sys/arch/amd64/compile/GENERIC.MP
# time ( make depend && make )
[ ... ]
    2m27.94s real     2m1.78s user     0m23.48s system

# cd /usr/src && time make obj
[ ... ]
    0m28.51s real     0m2.41s user     0m5.43s system 

# cd /usr/src && time make build
[ ... ]
   36m7.27s real    19m31.87s user     7m31.80s system



### unaligned partition/slices (no jumper) ###

- sequential write/read ---
# dd if=/dev/zero of=/tmp/testfile.4k bs=4k count=524288
524288+0 records in
524288+0 records out
2147483648 bytes transferred in 24.869 secs (86349673 bytes/sec)
# dd if=/dev/zero of=/tmp/testfile.64k bs=64k count=32768
32768+0 records in
32768+0 records out
2147483648 bytes transferred in 25.082 secs (85616298 bytes/sec)
# dd if=/dev/zero of=/tmp/testfile.1m bs=1m count=2048
2048+0 records in
2048+0 records out
2147483648 bytes transferred in 27.114 secs (79201231 bytes/sec)
# dd of=/dev/null if=/tmp/testfile.4k bs=4k count=524288
524288+0 records in
524288+0 records out
2147483648 bytes transferred in 20.562 secs (104434701 bytes/sec)
# dd of=/dev/null if=/tmp/testfile.64k bs=64k count=32768
32768+0 records in
32768+0 records out
2147483648 bytes transferred in 20.559 secs (104453140 bytes/sec)
# dd of=/dev/null if=/tmp/testfile.1m bs=1m count=2048
2048+0 records in
2048+0 records out
2147483648 bytes transferred in 20.467 secs (104921901 bytes/sec)

- extract source tarball ---
# ls -l /ramdisk
total 298752
-rw-r--r--  1 root  wheel  152866732 Jan  6 21:12 src.tgz
# cd /usr/src
# time tar xzf /ramdisk/src.tgz
   14m30.30s real     0m4.15s user     0m6.15s system

- build kernel / make obj / make build ---
# time ( make depend && make )
[ ... ]
    2m48.12s real     2m1.03s user     0m24.14s system

# cd /usr/src && time make obj
[ ... ]
    1m1.41s real     0m2.14s user     0m5.99s system

# cd /usr/src && time make build
[ ... ]
   70m51.58s real    19m31.95s user     7m27.84s system


### wd10eads (just for comparison) ###

- sequential write/read ---
# dd if=/dev/zero of=/wd10eads/testfile.4k bs=4k count=524288
524288+0 records in
524288+0 records out
2147483648 bytes transferred in 26.590 secs (80762241 bytes/sec)
# dd if=/dev/zero of=/wd10eads/testfile.64k bs=64k count=32768
32768+0 records in
32768+0 records out
2147483648 bytes transferred in 26.435 secs (81234814 bytes/sec)
# dd if=/dev/zero of=/wd10eads/testfile.1m bs=1m count=2048    
2048+0 records in
2048+0 records out
2147483648 bytes transferred in 25.838 secs (83113302 bytes/sec)
# dd of=/dev/null if=/wd10eads/testfile.4k bs=4k count=524288  
524288+0 records in
524288+0 records out
2147483648 bytes transferred in 25.959 secs (82723667 bytes/sec)
# dd of=/dev/null if=/wd10eads/testfile.64k bs=64k count=32768 
32768+0 records in
32768+0 records out
2147483648 bytes transferred in 25.101 secs (85552816 bytes/sec)
# dd of=/dev/null if=/wd10eads/testfile.1m bs=1m count=2048    
2048+0 records in
2048+0 records out
2147483648 bytes transferred in 25.072 secs (85650289 bytes/sec)

- extract source tarball ---
# cd /wd10eads/
# time tar xzf /ramdisk/src.tgz
    1m33.80s real     0m8.01s user     0m12.15s system

(( If you read this far, have a cookie
   and wonder with me about that quick extraction...
   The system this drive is in has the same board,
   but everything else is slower and it was not idle when measured...))
