A guess for you to consider:

A very common cause of ksoftirqd load is a hypervisor putting memory pressure 
on a VM.  At least VMware, and I think KVM and others, use IRQs to implement 
some of their memory management, and that activity can show up as ksoftirqd 
CPU time like this.


That would of course mean it's not really the ptlrpc module; I'm not sure how 
carefully you verified that it is the cause.  (Your 'remove it, check, add 
it, check' method is sound, but if you only checked once or twice, you may 
have been misled by bad luck, or you could have been right at the limit of 
your available memory, so that loading the module tipped the balance.)
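
One way to make that check less dependent on luck is to measure softirq time
directly and repeat the measurement several times with the module loaded and
unloaded.  A minimal sketch, assuming a Linux /proc/stat where the seventh
value on the aggregate "cpu" line is cumulative softirq time; the helper names
softirq_ticks and softirq_delta and the 10-second default are my own
illustration, not anything provided by Lustre:

```shell
# Hypothetical helper: print the cumulative softirq time (in clock ticks)
# from the aggregate "cpu" line of a /proc/stat-style file.
# Field layout: cpu user nice system idle iowait irq softirq ...
softirq_ticks() {
    awk '$1 == "cpu" { print $8 }' "${1:-/proc/stat}"
}

# Hypothetical helper: print how many softirq ticks accumulated, summed over
# all CPUs, during an interval (default 10 seconds, an arbitrary choice).
softirq_delta() {
    interval="${1:-10}"
    stat_file="${2:-/proc/stat}"
    before=$(softirq_ticks "$stat_file")
    sleep "$interval"
    after=$(softirq_ticks "$stat_file")
    echo $((after - before))
}
```

Running "softirq_delta 10" a few times before and a few times after
'modprobe lustre', and comparing the two sets of numbers, should show whether
the difference is consistent or only occasional.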

________________________________
From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on behalf of 
Dilger, Andreas <andreas.dil...@intel.com>
Sent: Wednesday, September 27, 2017 11:50:03 AM
To: Hans Henrik Happe
Cc: Shehata, Amir; lustre-discuss; Olaf Weber
Subject: Re: [lustre-discuss] 2.10.0 CentOS6.9 ksoftirqd CPU load

On Sep 26, 2017, at 01:10, Hans Henrik Happe <ha...@nbi.dk> wrote:
>
> Hi,
>
> Did anyone else experience CPU load from ksoftirqd after 'modprobe
> lustre'? On an otherwise idle node I see:
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
>     9 root      20   0     0    0    0 S 28.5  0.0  2:05.58 ksoftirqd/1
>    57 root      20   0     0    0    0 R 23.9  0.0  2:22.91 ksoftirqd/13
>
> The sum of those two is about 50% CPU.
>
> I have narrowed it down to the ptlrpc module. When I remove that, it stops.
>
> I also tested the 2.10.1-RC1, which is the same.

If you can run "echo l > /proc/sysrq-trigger" it will report the processes
that are currently running on the CPUs of your system to the console (and
also /var/log/messages, if it can write everything in time).

You might need to do this several times to get a representative sample of
the ksoftirqd process stacks to see what they are doing that is consuming
so much CPU.
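
Taking those samples by hand is tedious, so a small loop can help.  A rough
sketch, assuming root access; sample_cpu_stacks is a hypothetical helper name,
and the count and interval defaults are arbitrary:

```shell
# Hypothetical helper: ask the kernel for a backtrace of the active CPUs
# several times in a row, pausing between requests.
#   $1 = number of samples (default 10, arbitrary)
#   $2 = seconds between samples (default 1, arbitrary)
#   $3 = trigger file (default /proc/sysrq-trigger; writing there needs root)
sample_cpu_stacks() {
    count="${1:-10}"
    interval="${2:-1}"
    trigger="${3:-/proc/sysrq-trigger}"
    i=0
    while [ "$i" -lt "$count" ]; do
        # "l" requests a backtrace of all active CPUs.
        echo l > "$trigger" || return 1
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, run "sample_cpu_stacks 10 1" as root, then look through the
console output or dmesg for the ksoftirqd backtraces.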

Alternatively, "echo t > /proc/sysrq-trigger" will report the stacks of all
processes to the console (and /var/log/messages), but there will be a lot of
them, and it has no better chance of catching what ksoftirqd is doing during
the roughly 25% of the time it is running.

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel Corporation

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org