On Mon, 3 May 2010, The Big T wrote:

Based on 50MB/s write per disk the stripe should be pulling ~700MB/s

I must be hitting a bottleneck or a wall here somewhere and would like to track
it down.

Watching the box's disk access LEDs during a long write is interesting as well... the access/write activity comes in little bursts: drive access for 2-3 seconds, 1 second of no access, 2-3 seconds of access, etc. Also, one of the drives is accessing while the others are not, which is odd... iostat doesn't report any odd writes on that drive either... which is kinda odd...

ZFS writes transaction groups ("TXG") in large bursts. After the data gets blasted to the hardware, ZFS then requests a cache flush of all the involved devices and this cache flush must complete before ZFS can transition to the next transaction group. If this cache flush takes a long time, then it would place a limit on performance. For example, you might have one drive which takes a long time to respond to that cache flush request. You might have one or two drives which are slow to accept writes in general.
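To make that effect concrete, here is a toy model (my own sketch, not actual ZFS code) where each transaction group commits at the pace of the slowest device's write-plus-flush; the TXG size, drive count, and timings are made-up numbers matching the ~700 MB/s figure above:

```python
# Toy model of TXG commit timing -- illustrative only, not real ZFS internals.
# Assumption: writes to all devices go out in parallel, and the next TXG
# cannot start until every device has acknowledged its cache flush.

def txg_commit_time(write_s, flush_s):
    """Seconds for one TXG: gated by the slowest write+flush pair."""
    return max(w + f for w, f in zip(write_s, flush_s))

def throughput_mb_s(txg_mb, write_s, flush_s):
    return txg_mb / txg_commit_time(write_s, flush_s)

TXG_MB = 128                            # hypothetical transaction group size
N = 14                                  # 14 drives at 50 MB/s -> ~700 MB/s ideal
per_drive_write = (TXG_MB / N) / 50.0   # ~0.18 s for each drive's share

healthy = throughput_mb_s(TXG_MB, [per_drive_write] * N, [0.005] * N)
# Same pool, but one drive takes 500 ms to honor the cache flush:
one_slow = throughput_mb_s(TXG_MB, [per_drive_write] * N,
                           [0.005] * (N - 1) + [0.5])

print(f"all drives healthy: {healthy:.0f} MB/s")
print(f"one slow flusher:   {one_slow:.0f} MB/s")
```

In this model a single drive that is slow to flush drags the whole stripe from near-ideal throughput down to a small fraction of it, even though the other thirteen drives are fine.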

I recommend running 'iostat -x 30' to see what the per-drive latencies look like under load. Look for drives with much longer service times than the others.
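As a rough sketch of how to spot that automatically, the snippet below parses Solaris-style 'iostat -x' output and flags drives whose service time (svc_t) is far above the median. The sample output and the 3x threshold are my own illustrative choices, not anything from a real run:

```python
import statistics

# Hypothetical sample of 'iostat -x' output (Solaris column layout);
# real output on your box will differ.
SAMPLE = """\
                 extended device statistics
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b
sd0       0.1   45.2    0.8 5721.3  0.0  0.3    6.1   0   9
sd1       0.0   44.9    0.0 5698.7  0.0  0.3    5.8   0   8
sd2       0.1   44.8    0.4 5702.2  0.0  2.1  412.6   0  97
"""

def slow_drives(text, factor=3.0):
    """Return (device, svc_t) pairs whose svc_t exceeds factor * median."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 10 fields and a numeric second column.
        if len(parts) == 10 and parts[1].replace('.', '', 1).isdigit():
            rows.append((parts[0], float(parts[7])))  # svc_t is the 8th column
    median = statistics.median(t for _, t in rows)
    return [(dev, t) for dev, t in rows if t > factor * median]

print(slow_drives(SAMPLE))   # sd2 stands out
```

In the sample, sd2's 412.6 ms service time (and 97 %b) is exactly the kind of outlier that would stall every TXG commit for the whole pool.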

If this is a zfs issue, then it is best discussed on the zfs-discuss list, and if you look at the archives for that list you will see many such discussions.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
_______________________________________________
storage-discuss mailing list
storage-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/storage-discuss
