You can try setting SLURM_HINT=nomultithread in the user's environment.  This
allows you to have multithreading turned on in the BIOS but not in use by
default with Slurm.  Just keep in mind it's a hint.

You would configure Slurm as if it had hyperthreading on, like so:

    NodeName=compute-[029-083] RealMemory=64000 Sockets=2 CoresPerSocket=12 ThreadsPerCore=2 State=UNKNOWN

Then when you wanted to test with hyperthreading you would use
--hint=multithread on the srun or sbatch command line.
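Concretely, a minimal sketch of that setup (the profile.d path and job-script
name below are placeholders, not taken from the message):

    # e.g. /etc/profile.d/slurm_hint.sh: set the hint in every user's environment
    export SLURM_HINT=nomultithread

    # Per-job override when you want to test with hyperthreading
    srun   --hint=multithread ./my_benchmark
    sbatch --hint=multithread job.sh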
-----------------------------------------------------------------------------------

On 05/18/16, Davide Vanzo <[email protected]> wrote:

The thing is that disabling HT via the OS or via the BIOS may not be the same,
as you can see in this thread:
https://software.intel.com/en-us/forums/software-tuning-performance-optimization-platform-monitoring/topic/480007

Moreover, I wouldn't be surprised if hwloc (which SLURM uses for affinity
binding) may be "insensitive" to OS-disabled HT.  However, when you disable it
via the BIOS there will be no ambiguity.

Davide
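(For future readers: these commands are not in the thread, just the standard
lscpu/hwloc/sysfs checks.)  A quick way to see what the OS, and therefore
hwloc, actually reports after an OS-level disable:

    lscpu | grep -i 'thread(s) per core'                              # threads per core seen by the kernel
    lstopo-no-graphics --no-io                                        # hwloc's view of cores vs. hardware threads
    cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list    # sibling threads of CPU 0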
On Wed, 2016-05-18 at 07:58 -0700, Jason Bacon wrote:

No, we opted against that in case we want to experiment with hyperthreading in
the future without having to reboot.

How might that affect SLURM?

Thanks,

    JB

On 05/18/16 09:24, Davide Vanzo wrote:

Jason,

Have you tried disabling HT from the BIOS instead of doing it from the OS?

Davide

On Wed, 2016-05-18 at 06:02 -0700, Jason Bacon wrote:
Just leaving a trail for future Googlers.  My colleague did an extensive
search for answers and came up empty.

We ran into an issue after disabling hyperthreading on one of our CentOS
clusters.  Here's the scenario:

-    Our compute nodes had hyperthreading enabled while we evaluated the
     costs and benefits.

-    SLURM was configured to schedule only one job per real core.  For
     example, nodes with 24 cores / 48 virtual are configured as follows:

         NodeName=compute-[029-083] RealMemory=64000 Sockets=2 CoresPerSocket=12 ThreadsPerCore=1 State=UNKNOWN

-    I added a command to /etc/rc.d/rc.local to disable hyperthreading on
     the next reboot (a sketch of such a command follows below).

-    No changes were made to slurm.conf.

-    After rebooting with hyperthreading disabled, certain jobs landing on
     the node would fail with the following error:

         slurmstepd: Failed task affinity setup

-    Restarting the scheduler cleared up the issue.

Does anybody know what would cause this?  My best hypothesis is that
slurmctld is caching some probed hardware info from slurmd that changed
when hyperthreading was disabled.

Cheers,

     Jason
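The rc.local command itself isn't shown above; a minimal sketch of the usual
approach is to offline every secondary hardware thread via sysfs and then
restart the Slurm daemons so they re-probe the topology (the sysfs paths are
standard Linux; the systemctl lines assume a systemd-based system):

    # Take every hardware thread that is not the first sibling of its core offline
    for cpudir in /sys/devices/system/cpu/cpu[0-9]*; do
        cpu=${cpudir##*cpu}
        first=$(cut -d, -f1 "$cpudir/topology/thread_siblings_list" | cut -d- -f1)
        if [ "$cpu" != "$first" ]; then
            echo 0 > "$cpudir/online"
        fi
    done

    # Then restart the daemons so slurmd/slurmctld pick up the new core count
    systemctl restart slurmd        # on the compute node
    systemctl restart slurmctld     # on the controller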
