I'll jump onto this thread here and post my own experience.

I recently decided to build a home NAS/iSCSI server based on Osol with ZFS. I 
wanted to reuse my 3 WD10EACS drives (which were serving a linux MD array at 
the time) but to improve performance I wanted a fourth drive in my new NAS. 
Also, I needed some place to dump the data off the old array before building 
the raidz.

Unfortunately, the WD10EACS is now out of stock, so I wound up getting a 
WD10EARS instead, figuring that the 64 MB cache would more than make up for 
the fact that this is a slightly different drive.

Initially, I was very disappointed with the performance I was seeing from my 
single-disk zpool, but I was also experimenting with dedupe at the time and 
chalked that up as the primary cause of the lousy performance. Only now that I 
have completed the migration to a 4-disk raidz1 do I realize that I in fact 
have a huge problem due to the 4K sector size of the EARS drive.

I am running opensolaris snv_134 and zpool version 22. 

When doing a dd test, this is what happens:
dd if=/dev/zero of=/space/ddtest bs=128k count=4096
^C2734+0 records in
2734+0 records out
358350848 bytes (358 MB) copied, 10.9503 s, 32.7 MB/s

                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0   60.6    0.0 4799.3  0.0  1.1    0.0   18.1   0  99 c7t3d0
    0.0  214.3    0.0 17644.0  0.0  0.4    0.0    1.9   1  35 c7t0d0
    0.0  220.3    0.0 18117.2  0.0  0.4    0.0    1.9   1  34 c7t1d0
    0.0  231.5    0.0 18681.2  0.0  0.4    0.0    1.6   1  36 c7t2d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0   69.0    0.0 5513.8  0.0  1.0    0.0   14.4   0  99 c7t3d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0   58.0    0.0 4490.9  0.0  1.1    0.0   19.2   0  99 c7t3d0
    0.0    2.0    0.0    3.0  0.0  0.0    0.0    0.1   0   0 c7t0d0
    0.0    2.0    0.0    3.0  0.0  0.0    0.0    0.1   0   0 c7t1d0
    0.0    2.0    0.0    3.0  0.0  0.0    0.0    0.2   0   0 c7t2d0

c7t3d0 is the EARS disk; the other three are EACS disks with 512-byte 
sectors and a 16 MB cache.

As you can see, the three old drives are basically coasting along while the 
4K drive is completely swamped with unaligned I/O.

This is with a jumper shorting pins 7&8; results were identical with the pins 
unjumpered.

And my initial results were even worse: until I set zfs_vdev_max_pending=1, 
the system was nearly frozen during long-running writes.
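For anyone wanting to reproduce that: the tunable can be set in /etc/system 
(the zfs module prefix below is the usual convention for this era of 
OpenSolaris; double-check against your build before relying on it):

```
* /etc/system fragment -- takes effect after a reboot.
* Limit the number of outstanding I/Os queued per vdev to 1.
set zfs:zfs_vdev_max_pending = 1
```

It can also be poked into the live kernel with mdb -kw, which is how I 
experimented before making it permanent.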

Obviously the root problem here is that the disk reports a logical sector 
size of 512 bytes regardless of whether the jumper is set, so ZFS aligns and 
sizes its writes on 512-byte boundaries. Any write that does not start and 
end on a 4K boundary forces the disk to read back the rest of the physical 
sector before rewriting it, which I guess is a very effective 
sequential-to-random conversion.
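To put a rough number on that conversion, here is the worst-case 
back-of-the-envelope arithmetic for one of the dd writes above (illustrative 
only, not a claim about what the driver actually issues):

```shell
# A 128 KiB record that starts on a 512-byte boundary which is NOT also a
# 4 KiB boundary straddles one extra physical sector, and every 4 KiB
# sector it touches must be read back before it can be rewritten.
record=131072   # bytes per write (dd bs=128k)
sector=4096     # physical sector size of the EARS drive
sectors_touched=$(( record / sector + 1 ))
echo "sectors read-modify-written per record: $sectors_touched"
```

So a single sequential 128K write can cost 33 extra sector reads, which 
lines up with the EARS drive sitting at 99% busy while barely moving 5 MB/s.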

Several threads hint that recent builds contain code to handle 4K drives, 
but if so, it clearly does not work out of the box.

Are there any tunable parameters to achieve a logical block size of 4K? Or 
should I run down to the local hardware emporium and get an older replacement 
while I still have the chance?
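For reference, the knob in question is the vdev "ashift" (the power-of-two 
alignment ZFS uses: 9 = 512 B, 12 = 4 KiB). A sketch of checking it, plus 
the pool-creation property that later illumos/OpenZFS builds expose -- as far 
as I can tell stock snv_134 does NOT support the latter, so treat it as an 
assumption about newer code:

```
# Inspect the ashift of the existing pool (named "space" here, per the
# /space/ddtest path above):
zdb -C space | grep ashift

# On builds that support the ashift property (later illumos/OpenZFS,
# not stock snv_134), a pool can be forced to 4 KiB alignment up front:
# zpool create -o ashift=12 space raidz c7t0d0 c7t1d0 c7t2d0 c7t3d0
```

If the existing pool reports ashift=9, there is no way to fix it in place; 
the pool has to be recreated with the right alignment.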
-- 
This message posted from opensolaris.org
_______________________________________________
storage-discuss mailing list
storage-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/storage-discuss
