Re: [Beowulf] Good IB network performance when using 7 cores, poor performance on all 8?

Brian Dobbins Thu, 24 Apr 2014 10:32:19 -0700

Hi Joe,

> Median changes by more than factor of 2. And the distribution tail is
> *huge*.
> FWIW: 6.2 was a terrible release.  If you have to use pure RHEL, get to
> 6.5+.  And there are many tunables you need to look at.
>


  Thanks for your reply - I may look into asking our IT squad to put 6.5 on
a set of nodes for testing, but playing with the tunables is probably the
first step.  I don't have root access, so I can't change settings myself,
but a few of the power options (e.g., /sys/module/pcie_aspm/parameters/policy)
already look like good candidates: that one is currently in a 'power save'
state on the poorly performing nodes, whereas the file doesn't even exist
on the 5.5 nodes.
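
  Reading that policy file needs no root access, so it can be compared
across nodes from a normal account.  Below is a minimal sketch in Python,
assuming the usual kernel layout in which the file lists every policy and
marks the active one with square brackets.  The function names are
hypothetical, chosen only for this illustration.

```python
import re
from pathlib import Path

# Path quoted in the message above; it is absent on kernels without this parameter.
ASPM_POLICY_PATH = Path("/sys/module/pcie_aspm/parameters/policy")


def active_aspm_policy(text):
    """Return the bracketed (active) policy name, or None if none is marked."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def read_aspm_policy(path=ASPM_POLICY_PATH):
    """Return the active policy on this node, or None if the file is missing."""
    try:
        return active_aspm_policy(path.read_text())
    except OSError:
        # Covers a missing file (as on the 5.5 nodes) or an unreadable one.
        return None


if __name__ == "__main__":
    print(read_aspm_policy())
```

  Running this on one good node and one bad node gives a quick
side-by-side of the setting; a result of None means the file could not be
read on that node.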


> Bigger view ... have you isolated a CPU for IB handling, so at 7 cores,
> your machine is full (1 for IB and 7 for apps), but at 8 cores you are
> contending for resources (8 for apps + 1 for IB)?
> Are you running the app with taskset (explicitly or implicitly)?
>

  In the test we're running, there isn't really any local processing
outside of the communication - each task, bound to its own core, is simply
sending messages in a giant loop.  While there are clearly 8 cores all
talking to 1 IB device, each one (I believe) mmaps its own range and
handles its own message processing.  Furthermore, this definitely worked
before, so it doesn't seem like a resource contention issue unless it's
something to do with mmap on the versions we're running.  I did double
check that processes aren't migrating between cores, though.
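
  One way to make that migration check repeatable is to sample the
"processor" field (field 39) of /proc/<pid>/stat, which records the CPU a
task last ran on, and compare it with the allowed CPU set.  This is only a
sketch of that idea, not the check actually used here; the function names
are hypothetical, and it inspects only the main thread of a process.

```python
import os
import time


def last_cpu_from_stat(stat_line):
    """Extract field 39 ('processor') from a /proc/<pid>/stat line."""
    # The command name (field 2) is parenthesised and may contain spaces,
    # so split only the text that follows the final ')'.
    rest = stat_line[stat_line.rindex(")") + 2:].split()
    return int(rest[36])  # rest[0] is field 3, so field 39 is rest[36]


def sample_cpus(pid, samples=20, interval=0.05):
    """Return the set of CPUs the task was last seen on across several samples."""
    seen = set()
    for _ in range(samples):
        with open(f"/proc/{pid}/stat") as f:
            seen.add(last_cpu_from_stat(f.read()))
        time.sleep(interval)
    return seen


if __name__ == "__main__" and os.path.exists("/proc/self/stat"):
    pid = os.getpid()
    print("allowed CPUs: ", sorted(os.sched_getaffinity(pid)))
    print("CPUs observed:", sorted(sample_cpus(pid, samples=5, interval=0.01)))
```

  For a task that is properly bound, the observed set should contain a
single CPU that also appears in the allowed set; more than one observed CPU
suggests the task is moving.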

  Mostly, I'm poking around kernel tunables right now and making a list of
things that might indicate the issue.  I'll also take a deeper look at
/proc/interrupts during a run soon.
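
  For the /proc/interrupts check, the useful number is usually the
per-CPU change over a run rather than the raw totals.  The sketch below
diffs two snapshots and sums the deltas for IRQ lines whose description
contains a given string.  The pattern "mlx4" is purely an assumption for
illustration (the HCA driver isn't named in this thread), and the function
names are hypothetical.

```python
import os
import time


def parse_interrupts(text):
    """Map each IRQ label to (per-CPU counts, description) from /proc/interrupts text."""
    lines = text.splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for tok in fields[:ncpu]:
            if not tok.isdigit():
                break
            counts.append(int(tok))
        # Whatever follows the counts is the controller/device description.
        table[label.strip()] = (counts, " ".join(fields[len(counts):]))
    return table


def per_cpu_delta(before, after, pattern):
    """Sum, per CPU, interrupts raised between two snapshots by matching IRQs."""
    old, new = parse_interrupts(before), parse_interrupts(after)
    totals = [0] * len(before.splitlines()[0].split())
    for label, (counts, desc) in new.items():
        if pattern not in desc or label not in old:
            continue
        for cpu, (a, b) in enumerate(zip(old[label][0], counts)):
            totals[cpu] += b - a
    return totals


if __name__ == "__main__" and os.path.exists("/proc/interrupts"):
    with open("/proc/interrupts") as f:
        first = f.read()
    time.sleep(0.5)  # in practice, the benchmark would run during this window
    with open("/proc/interrupts") as f:
        second = f.read()
    print(per_cpu_delta(first, second, "mlx4"))  # "mlx4" is an assumed pattern
```

  If nearly all of the delta lands on a core that is also running one of
the 8 tasks, that would be one thing worth adding to the list.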

  Thanks again,
  - Brian

_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
