Re: [Beowulf] Weird blade performs worse as more cpus are used?

Faraz Hussain Tue, 26 Sep 2017 18:30:09 -0700

The issue now seems to be resolved after I did a full power down (cold boot)! No idea what caused the issue in the first place.


Quoting Joe Landman <joe.land...@gmail.com>:

On 09/14/2017 11:34 AM, Faraz Hussain wrote:
Earlier I had posted about one of our blades running 30-50% slower than other ones despite having identical hardware and OS. I followed the suggestions and compared cpu temperature, memory, dmesg and sysctl. Everything looks the same.
I then used "perf stat" to compare speed of pigz (parallel gzip). The results are quite interesting. Using one cpu, the slow blade is as fast as the rest! But as I use more cpus, the speed decreases linearly from 3.1 GHz to 0.4 GHz. See snippets from "perf stat" command below. All tests were on /tmp to eliminate any NFS issue. And the same behavior is observed with any multi-threaded program.
What does numastat report? Is /tmp a ramdisk or tmpfs? Are the nodes/cpus otherwise idle? What does lscpu on a good/bad node report?
If it decreases on a 1/Ncpu curve, then you have a fixed-size resource bandwidth contention issue you are fighting. The question is what.
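
As a rough illustration of the 1/Ncpu point, here is a minimal Python sketch, not anything taken from the actual blades. It assumes a single shared resource of fixed capacity split evenly across threads; the function names, the tolerance, and every number used with it are hypothetical. The check relies on one property: if the per-thread rate falls as 1/N, then rate times N stays roughly constant.

```python
def per_thread_rate(n_threads, peak_per_thread, shared_capacity):
    """Rate each thread sees when all threads split one fixed-size resource.

    Assumption: the shared resource is divided evenly, and a thread can
    never exceed its own standalone peak.
    """
    return min(peak_per_thread, shared_capacity / n_threads)


def looks_like_one_over_n(samples, tolerance=0.15):
    """Return True if measured rates are consistent with a 1/N curve.

    samples: list of (n_threads, measured_per_thread_rate) pairs.
    On a 1/N curve the product n * rate is constant, so we test whether
    every product lies within `tolerance` (relative) of their mean.
    """
    products = [n * rate for n, rate in samples]
    mean = sum(products) / len(products)
    return all(abs(p - mean) <= tolerance * mean for p in products)
```

To use it, feed in (thread count, per-thread rate) pairs taken from "perf stat" runs at several thread counts. A True result points toward a fixed-size shared resource; a False result, such as a straight-line decline, suggests looking elsewhere.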
--

Joe Landman
e: joe.land...@gmail.com
t: @hpcjoe
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf




