Thanks for your help,

Maybe the documentation could include a note that using "ms" devices on
large numbers of LUNs is not recommended due to CPU spikes.  We have added
another system with a similar number of users on "ma" (mm, mr) devices and
have seen a huge improvement.

It seems that syncing the metadata on "ms" was taking too long and delaying
everything else, which then spiked the CPU.

-Jeff


Dean Roehrich wrote:
> On Thu, Dec 13, 2007 at 03:08:31PM -0600, jtw wrote:
>> The QFS version in use is 4.6 (4.6.25)
> 
> Thanks, I've looked through your lockstat data.  You're hitting two locks (per
> filesystem) pretty hard.  The first controls the lease chain for that
> filesystem, and the second controls the list of outstanding client-to-server
> messages on that filesystem.  The locks are m_lease_mutex and m_cl_mutex, for
> anyone who knows them.
> 
> I'm not saying this explains the 100% system time on that CPU, but it's clear
> you're spending a lot of time on these two locks.
> 
> This is a problem area we know we have to improve.  Unfortunately, I think
> it's much bigger than a bugfix.
> 
> Dean
