On Friday, 11 August 2017 12:39:07 AM AEST Faraz Hussain wrote:
> I thought it may have to do with cpu scaling, i.e when the kernel
> changes the cpu speed depending on the workload. But we do not have
> that enabled on these machines.
Just to add to the excellent suggestions from others: have you compared BIOS/
UEFI settings & versions across these nodes to ensure they're identical?
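If it helps, the firmware version strings are exposed by the kernel under /sys/class/dmi/id, so a quick comparison can be scripted. What follows is only a rough sketch in Python: the helper names are made up for illustration, and it assumes you have already collected each node's values into a dict by whatever means you normally use (pdsh, ssh, your config management). It compares versions only; the settings themselves usually need the vendor's own tool.

```python
from collections import defaultdict

# Files under /sys/class/dmi/id worth comparing across nodes.
DMI_FIELDS = ("bios_vendor", "bios_version", "bios_date")


def read_local_dmi(base="/sys/class/dmi/id"):
    """Read the BIOS identification strings of the local node."""
    info = {}
    for field in DMI_FIELDS:
        try:
            with open(f"{base}/{field}") as fh:
                info[field] = fh.read().strip()
        except OSError:
            info[field] = None  # not exposed on this machine
    return info


def group_by_firmware(nodes):
    """Group node names by their (vendor, version, date) tuple."""
    groups = defaultdict(list)
    for name, info in sorted(nodes.items()):
        key = tuple(info.get(f) for f in DMI_FIELDS)
        groups[key].append(name)
    return dict(groups)


def odd_ones_out(nodes):
    """Return the nodes that are not in the largest firmware group."""
    groups = group_by_firmware(nodes)
    if not groups:
        return []
    majority = max(groups.values(), key=len)
    return sorted(n for members in groups.values()
                  if members is not majority for n in members)
```

Feed odd_ones_out() a dict of node name to the output of read_local_dmi() from each node, and whatever comes back is worth a closer look.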
Also remember that the kernel can enable C states that hurt performance even
if they are disabled in the BIOS/UEFI. This was painfully apparent on our
first Sandy Bridge cluster, which almost failed the performance part of
acceptance testing until the cause was found.
Now we boot all nodes with this in the kernel cmdline:
intel_idle.max_cstate=0 processor.max_cstate=1 intel_pstate=disable
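To confirm that a node really booted with those parameters, you can compare /proc/cmdline against the expected values. A minimal sketch in Python follows; the function names are hypothetical, and the simple whitespace split assumes none of the parameters on your command line contain quoted spaces.

```python
# Parameters from the kernel command line quoted above.
WANTED = {
    "intel_idle.max_cstate": "0",
    "processor.max_cstate": "1",
    "intel_pstate": "disable",
}


def parse_cmdline(text):
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' (e.g. 'ro') map to an empty string.
    Quoted values containing spaces are not handled.
    """
    params = {}
    for token in text.split():
        name, _, value = token.partition("=")
        params[name] = value
    return params


def check_cmdline(text, wanted=WANTED):
    """Return {name: (expected, actual)} for missing or wrong parameters."""
    params = parse_cmdline(text)
    return {name: (expected, params.get(name))
            for name, expected in wanted.items()
            if params.get(name) != expected}


def read_running_cmdline(path="/proc/cmdline"):
    """Return the command line the running kernel was booted with."""
    with open(path) as fh:
        return fh.read()
```

Running check_cmdline(read_running_cmdline()) on each node returns an empty dict when all three parameters are set as above, and otherwise lists what is missing or different.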
Best of luck!
Christopher Samuel Senior Systems Administrator
Melbourne Bioinformatics - The University of Melbourne
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing