Papp Tamás wrote:
> The logs are full with this:
> 
> Nov 19 20:03:32 node1 kernel: BUG: soft lockup - CPU#3 stuck for 10s! 
> [ll_ost_80:4894]
> Nov 19 20:03:32 node1 kernel: CPU 3:
<snip>
> Nov 19 20:03:34 node1 kernel: Lustre: Skipped 40339060 previous similar 
> messages 0; still busy with 3 active RPCs

We had the same problem with 1.8.x.x.

We set lnet.printk=0 on our OSS nodes and it has helped us dramatically 
- we have not seen the 'soft lockup' problem since.

sysctl -w lnet.printk=0

This will turn off console logging of all but 'emerg' messages from lnet.
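
If you want to be able to go back afterwards, here is a rough sketch of how one might record the old setting before clearing it. The helper name quiet_lnet_console, the SYSCTL override variable and the default backup path are my own inventions for illustration, not anything Lustre ships. It assumes the lnet module is already loaded, so that the lnet.printk key exists.

```shell
#!/bin/sh
# Hypothetical helper, not part of Lustre: save the current lnet.printk
# value to a file, then set the mask to 0 as described above.
# SYSCTL can be overridden (e.g. for a dry run); it defaults to sysctl.
SYSCTL="${SYSCTL:-sysctl}"

quiet_lnet_console() {
    # Assumed backup location; pass a different path as the first argument.
    backup_file="${1:-/tmp/lnet.printk.saved}"

    # Read the current value; bail out if the key is missing
    # (for example, when the lnet module is not loaded).
    old=$("$SYSCTL" -n lnet.printk) || return 1
    printf '%s\n' "$old" > "$backup_file"

    # Same command as above: set the console mask to 0.
    "$SYSCTL" -w lnet.printk=0
}

# Usage (as root, on an OSS node):
#   quiet_lnet_console /root/lnet.printk.saved
```

Note that a value set with sysctl -w does not survive a reboot, so it would need to be reapplied after the node comes back up.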

It would be interesting to know if this avoided the lockups for you, too.

Cheers,
Craig


_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
