On Fri, Dec 18, 2009 at 7:50 PM, Steffen Plotner <swplot...@amherst.edu> wrote:

> Hello,
>
> When driving reads or writes to ZFS continuously to a commercial backend
> storage (42 disk array with hardware raid controller exposing disk via fiber
> channel interface), I see time frames during which no IO takes place when it
> should. Reading the underlying raw disk device does not present this
> problem.
>
> The pauses begin after 3-4 minutes of IO have taken place, and then they
> appear after every 60 seconds. At that time the pauses last for about 3
> seconds and IO resumes. The server itself does nothing else during these
> times. The server is a Dell 1950 with dual quad core CPUs with 32GB of RAM.
> I have performed the same tests on a second piece of identical hardware with
> the same results.
>
> I have included a link to a graph that depicts what I see at the fiber
> channel interface of the server:
> http://www3.amherst.edu/~swplotner/comstar/debug/zfs_pauses.png
>
> Here is one of the pauses shown with zpool iostat:
> vg_satabeast8_vol0  1004G  3.08T  3.09K      0   120M      0
> vg_satabeast8_vol0  1004G  3.08T  3.09K      0   109M      0
> vg_satabeast8_vol0  1004G  3.08T  3.00K      0   105M      0
> vg_satabeast8_vol0  1004G  3.08T  3.02K      0   111M      0
> vg_satabeast8_vol0  1004G  3.08T  3.25K      0   118M      0
> vg_satabeast8_vol0  1004G  3.08T  3.17K      0   111M      0
> vg_satabeast8_vol0  1004G  3.08T  2.34K      0  86.5M      0
> vg_satabeast8_vol0  1004G  3.08T  3.19K      0   115M      0
> vg_satabeast8_vol0  1004G  3.08T  2.24K      0  67.2M      0
> vg_satabeast8_vol0  1004G  3.08T      0      0      0      0  <- these 3 lines are 1 pause in the graph (3 seconds worth)
> vg_satabeast8_vol0  1004G  3.08T      0      0      0      0  <- the graph shows 3 of those pauses.
> vg_satabeast8_vol0  1004G  3.08T    716      0  33.5M      0  <-
> vg_satabeast8_vol0  1004G  3.08T  3.36K      0   112M      0
> vg_satabeast8_vol0  1004G  3.08T  4.14K      0   113M      0
> vg_satabeast8_vol0  1004G  3.08T  3.82K      0   111M      0
> vg_satabeast8_vol0  1004G  3.08T  2.09K      0  72.8M      0
> vg_satabeast8_vol0  1004G  3.08T  3.27K      0   102M      0
> vg_satabeast8_vol0  1004G  3.08T  2.88K      0   102M      0
> vg_satabeast8_vol0  1004G  3.08T  3.14K      0   114M      0
> vg_satabeast8_vol0  1004G  3.08T  2.65K      0  97.6M      0
> vg_satabeast8_vol0  1004G  3.08T  2.93K      0   105M      0
>
> The pauses are of concern - actually - they are a problem since no IO is
> being processed for several seconds each minute and if they could be removed
> could make zfs really fast.

I'm not sure why you are seeing those, but we have a similar problem here,
although not as consistent as yours. Whenever a disk retries a read several
times to recover data from a bad block, ZFS hangs for as long as 3-5 minutes
until the disk gives up. I find that behavior understandable but wish it
wouldn't halt all I/O to the machine.

Perhaps you should look at what's happening on the Fibre Channel interface
that could be making ZFS wait on some operation. Just an idea; I'm not a ZFS
expert by any means.

-- 
Giovanni P. Tirloni
_______________________________________________
storage-discuss mailing list
storage-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/storage-discuss
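
To correlate the stalls with events on the Fibre Channel side, it may help to
pull the idle periods out of a captured zpool iostat log automatically rather
than reading them by eye. The following is only a rough sketch, not something
from the original posts: the names parse_value and find_pauses are made up for
illustration, the one-second sampling interval is an assumption based on the
"3 seconds worth" remark in the quoted output, and the field positions assume
the pool / alloc / free / read ops / write ops / read bw / write bw layout
seen in that output.

```python
# Unit suffixes as printed by zpool iostat, assumed here to be powers of 1024.
SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_value(text):
    """Convert a field such as '3.09K' or '716' to a float."""
    suffix = text[-1:].upper()
    if suffix in SUFFIXES:
        return float(text[:-1]) * SUFFIXES[suffix]
    return float(text)


def find_pauses(lines, interval=1):
    """Return (first_sample_index, duration_seconds) for each idle run.

    A sample counts as idle when both the read and write operation
    columns are zero. `interval` is the sampling interval in seconds
    (hypothetical default of 1, matching the quoted output).
    """
    pauses = []
    run_start = None
    sample = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # not a data row (blank line or short header)
        try:
            read_ops = parse_value(fields[3])
            write_ops = parse_value(fields[4])
        except ValueError:
            continue  # header or separator row
        if read_ops == 0 and write_ops == 0:
            if run_start is None:
                run_start = sample
        elif run_start is not None:
            pauses.append((run_start, (sample - run_start) * interval))
            run_start = None
        sample += 1
    if run_start is not None:
        # The log ended while the pool was still idle.
        pauses.append((run_start, (sample - run_start) * interval))
    return pauses
```

Feeding it the lines of a saved log (for example, the output of a one-second
zpool iostat run redirected to a file) yields the index of the sample where
each stall begins and how long it lasted, which can then be lined up against
timestamps from the FC switch or HBA logs. Note that a partially idle sample,
like the 716-operation line in the quoted output, is not counted as a pause.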