Thanks for your help. Maybe the documentation could include a note that using "ms" devices with large numbers of LUNs is not recommended because of CPU spikes. We have added another system with a similar number of users on ma (mm, mr) devices and have seen a huge improvement.
It seems like syncing the metadata on "ms" was taking too long and delaying everything else, which then spiked the CPU.

-Jeff

Dean Roehrich wrote:
> On Thu, Dec 13, 2007 at 03:08:31PM -0600, jtw wrote:
>> The QFS version in use is 4.6 (4.6.25)
>
> Thanks, I've looked through your lockstat data. You're hitting two locks (per
> filesystem) pretty hard. The first controls the lease chain for that
> filesystem, and the second controls the list of outstanding client-to-server
> messages on that filesystem. The locks are m_lease_mutex and m_cl_mutex, for
> anyone who knows them.
>
> I'm not saying this explains the 100% system time on that CPU, but it's clear
> you're spending a lot of time on these two locks.
>
> This is a problem area we know we have to improve. Unfortunately, I think
> it's much bigger than a bugfix.
>
> Dean
