Here at LLNL we know of at least one cause of the 100% CPU utilization
observed on LNET gigE<->elan routers under a heavy read load.  We
tracked the issue back to a tuning our e1000 network cards need to
prevent RX descriptor overflow.  While we mainly saw this on our
router nodes, I see no reason it couldn't manifest itself on a
heavily loaded TCP-attached client or server.  The details of this
bug, and a debug patch which will log the RX descriptor overflows
if they are occurring, can be found in bug 10077.

  https://bugzilla.lustre.org/show_bug.cgi?id=10077

  At LLNL we now make a habit of maxing out the RX descriptor
queue on any Lustre node using an e1000 adapter.  Add the
module option 'RxDescriptors=4096' to your modprobe.conf to make
the setting persistent, or use 'ethtool -G ethX rx 4096' to tune
it on the fly.
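
  For anyone who wants to script the on-the-fly tuning, here is a
rough sketch, not something we run in production.  It assumes the
usual 'ethtool -g' output layout (a "Pre-set maximums" section
followed by "Current hardware settings"), and the function names
rx_ring_sizes and maximize_rx_ring are made up for illustration.

```shell
# Print "<current> <max>" RX ring sizes, parsed from `ethtool -g`
# output supplied on stdin.
rx_ring_sizes() {
    awk '
        /^Pre-set maximums/          { section = "max" }
        /^Current hardware settings/ { section = "cur" }
        $1 == "RX:" {
            if (section == "max") max = $2
            if (section == "cur") cur = $2
        }
        END { print cur, max }
    '
}

# Raise the RX ring of interface $1 to the maximum the hardware
# reports, if it is not already there.  Needs root and a driver
# that supports ring resizing.
maximize_rx_ring() {
    iface=$1
    sizes=$(ethtool -g "$iface" | rx_ring_sizes)
    cur=${sizes% *}
    max=${sizes#* }
    if [ "$cur" -lt "$max" ]; then
        ethtool -G "$iface" rx "$max"
    fi
}
```

  Calling 'maximize_rx_ring ethX' should be equivalent to the
'ethtool -G ethX rx 4096' above on an adapter whose maximum is 4096.
The change does not survive a driver reload, so the modprobe.conf
option is still needed to make it stick.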

  I have not yet investigated whether the tg3 or bcm5700 drivers
suffer from similar issues.  They may.

-- 
Good luck,
Brian


> On 2006-11-15, at 12:36 , Peter Bojanic wrote:
> > Shane,
> >
> > Regarding the read performance issues you mentioned today...
>
> Some further news...
>
> - Sandia (Redstorm) reports 100% CPU utilization during reads on both
> Linux and Cray systems; they're running Unicos 1.4, based on Lustre
> 1.4.6
>
> ===> Lee, is there a Bugzilla reported by Sandia that describes this
> issue?
>
> - Indiana University reports a similar read performance drop-off but
> with Lustre 1.4.7.x. I've escalated the following Bugzilla to our IO
> team with the highest priority:
>
> Bug 11194: simultaneous reads of same one stripe file show decrease
> in performance
>
> ===> mjmac, can you please make sure we've got Cluster Surveys for
> both Indiana University and for Rizzo at ORNL so engineering can
> understand the scope of these systems
>
> Thanks,
> Peter
>
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
