On Mon, Dec 17, 2007 at 04:07:47PM -0600, jtw wrote:
> Thanks for your help,
> 
> Maybe in the documentation there could be a note that using "ms" devices on 
> large numbers of luns is not recommended due to cpu spikes.  We have added 
> another system with a similar number of users on ma (mm,mr) devices and have 
> seen a huge improvement.
> 
> It seems like syncing the metadata on "ms" was taking too long and causing 
> everything else to be delayed which then spikes the cpu.

I don't think your lockstat data supports that conclusion, but I'm happy that
you found a work-around.  When you mentioned the ms/md devices I was hoping I
could write this off as a "small-blocks" problem (md devices use small blocks
at the start of the file), but it really looks like you just have a whole lot
of lease traffic happening on this machine.

On the other hand, you didn't show me your samtrace output, and maybe that
would show excessive "small-blocks" activity.  I would expect such activity,
given the md devices, and it could be why your other system is doing better
than the first system.  I hope the next patch for 4.6 will have some helpful
updates to our "small-blocks" code.

So maybe this was a combination of heavy lease traffic and excessive
"small-blocks" activity.

Dean
