On Mon, Dec 17, 2007 at 04:07:47PM -0600, jtw wrote:
> Thanks for your help,
>
> Maybe in the documentation there could be a note that using "ms" devices on
> large numbers of luns is not recommended due to cpu spikes. We have added
> another system with a similar number of users on ma (mm,mr) devices and have
> seen a huge improvement.
>
> It seems like syncing the metadata on "ms" was taking too long and causing
> everything else to be delayed which then spikes the cpu.

I think your lockstat data doesn't support that conclusion, but I'm happy that you found a work-around.

When you mentioned the ms/md devices I was hoping I could write this off as a "small-blocks" problem (md devices use small blocks at the start of the file), but it really looks like you just have a whole lot of lease traffic happening on this machine.

On the other hand, you didn't show me your samtrace output and maybe that would show excessive "small-blocks" activity. I would expect this, given the md devices, and it could be why your other system is doing better than the first system. I hope the next patch for 4.6 will have some helpful updates to our "small-blocks" code.

So maybe this was a combination of heavy lease traffic and excessive "small-blocks" activity.

Dean
