Re: [Lustre-discuss] High Load and high system CPU for mds

Oleg Drokin Mon, 01 Mar 2010 11:35:41 -0800

Hello!

On Feb 28, 2010, at 9:31 PM, huangql wrote:
> We have a problem: the MDS shows a high load value and system CPU goes up
> to 60% when running a chown command on a client. Strangely, the load value
> and system CPU did not drop back to normal levels once they got high. We
> could not even do anything on the clients and OSS. You can see the
> information from the top command as follows:


How many files did that chown command affect (was it a chown -R on some huge
directory tree)?
Essentially, chown (setattr) works in two steps: first it changes the
attributes on the MDS, then it queues an async RPC for every file object to
update the attributes on the OSTs. If many files are updated this way, a lot
of such messages get queued, and they are all sent at once with no rate
limiting.
This is consistent with what you are seeing here: ptlrpcd is busy
sending/receiving RPCs (ptlrpcd is the Lustre thread that handles sending and
completion of async RPCs), and the individual socklnd threads are also busy
processing network transfers. (I also think the code in lnet is not tuned to
process huge numbers of outstanding RPCs, which leads to additional CPU
overhead in that case.)
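
To make the scale of the problem concrete, here is a toy model of the
behaviour described above. It is only a sketch, not Lustre code: the function
names and the max_in_flight parameter are hypothetical and do not correspond
to any Lustre tunable. It assumes one async setattr RPC per file object, so a
striped file contributes one RPC per stripe.

```python
from collections import deque


def queued_setattr_rpcs(stripe_counts):
    """Number of async OST setattr RPCs queued for a chown.

    stripe_counts holds one entry per affected file: the number of
    OST objects backing that file (assumed: one RPC per object).
    """
    return sum(stripe_counts)


def peak_outstanding(num_rpcs, max_in_flight=None):
    """Drain a queue of RPCs in rounds; return (peak, rounds).

    max_in_flight=None models "everything is sent at once with no
    rate limiting". A positive integer models a hypothetical cap on
    the number of RPCs outstanding at any one time.
    """
    pending = deque(range(num_rpcs))
    peak = 0
    rounds = 0
    while pending:
        if max_in_flight is None:
            window = len(pending)
        else:
            window = min(max_in_flight, len(pending))
        for _ in range(window):
            pending.popleft()  # "send" one RPC and wait for completion
        peak = max(peak, window)
        rounds += 1
    return peak, rounds


if __name__ == "__main__":
    # Hypothetical tree: one million files, each with a single object.
    total = queued_setattr_rpcs([1] * 1_000_000)
    print("RPCs queued:", total)
    print("no limit   :", peak_outstanding(total))
    print("cap of 256 :", peak_outstanding(total, max_in_flight=256))
```

Without a cap, the peak number of outstanding RPCs equals the number of file
objects touched, which is the situation that keeps ptlrpcd and the socklnd
threads busy.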

So on the surface it looks like everything performs as expected, though
Lustre certainly could have behaved better.
How long did you wait with this high CPU utilization before deciding to
reboot, and how many files were affected by the chown?
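
If the exact number is not known, a rough count can be obtained by walking
the tree from a client. The following is a minimal sketch: the function name
is hypothetical, and it counts directory entries, not OST objects, so a file
striped over several OSTs would generate more RPCs than this count suggests.

```python
import os
import sys


def count_entries(root):
    """Count root plus every entry below it, without following symlinks.

    Symlinks to directories are counted as entries but not descended
    into, which mirrors the default behaviour of a recursive walk.
    """
    total = 1  # the root directory itself
    for _dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        total += len(dirnames) + len(filenames)
    return total


if __name__ == "__main__":
    # Usage: python count_entries.py <path to the tree that was chowned>
    print(count_entries(sys.argv[1]))
```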

Bye,
    Oleg
_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
