On Friday, 11 August 2017 12:39:07 AM AEST Faraz Hussain wrote:

> I thought it may have to do with cpu scaling, i.e when the kernel
> changes the cpu speed depending on the workload. But we do not have
> that enabled on these machines.

Just to add to the excellent suggestions from others: have you compared
BIOS/UEFI settings and versions across these nodes to ensure they're identical?
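
As a rough illustration of that comparison, here is a minimal Python sketch
that flags nodes whose reported firmware version differs from the majority.
It assumes you have already collected one string per node (for example the
contents of /sys/class/dmi/id/bios_version, gathered however you normally run
commands across the cluster). The node names, version strings and function
names are made up for the example, and it only covers versions, not
individual BIOS/UEFI settings.

```python
from collections import defaultdict


def group_by_value(reported):
    """Map each distinct reported value to the sorted list of nodes reporting it."""
    groups = defaultdict(list)
    for node, value in reported.items():
        groups[value].append(node)
    return {value: sorted(nodes) for value, nodes in groups.items()}


def outliers(reported):
    """Return the nodes whose value differs from the most common one."""
    groups = group_by_value(reported)
    if len(groups) <= 1:
        return []  # every node agrees (or there is nothing to compare)
    majority = max(groups, key=lambda value: len(groups[value]))
    return sorted(
        node
        for value, nodes in groups.items()
        if value != majority
        for node in nodes
    )


# Hypothetical data: node name -> BIOS version string collected from that node.
example = {"node01": "2.1", "node02": "2.1", "node03": "1.9"}
print(outliers(example))
```

Any node it lists is worth a closer look before chasing other causes.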

Also remember that the kernel can enable C-states that hurt performance even
if they are disabled in the BIOS/UEFI. This was painfully apparent on our
first Sandy Bridge cluster, which almost failed the performance part of
acceptance testing until we tracked the problem down.

Now we boot all nodes with this in the kernel cmdline:

intel_idle.max_cstate=0 processor.max_cstate=1 intel_pstate=disable
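
If you want to confirm that every node really booted with those options, a
small check along these lines may help. It is only a sketch: the function
names are made up, the parsing is a naive whitespace split (it does not
handle quoted values containing spaces), and the expected values are simply
the ones quoted above, so adjust them for your own site.

```python
from pathlib import Path

# Parameters taken from the command line quoted above.
EXPECTED = {
    "intel_idle.max_cstate": "0",
    "processor.max_cstate": "1",
    "intel_pstate": "disable",
}


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def check_cmdline(cmdline, expected=EXPECTED):
    """Return {name: (wanted, found)} for every expected parameter that is
    missing from the command line or set to a different value."""
    params = parse_cmdline(cmdline)
    return {
        name: (wanted, params.get(name))
        for name, wanted in expected.items()
        if params.get(name) != wanted
    }


def check_this_node(path="/proc/cmdline"):
    """Run the check against the running kernel's command line (Linux only)."""
    return check_cmdline(Path(path).read_text())
```

On a node, an empty dict from check_this_node() means all three options are
present with the expected values; anything else lists what was wanted and
what was actually found.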

Best of luck!
Chris
-- 
 Christopher Samuel        Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
