There is a bug in the task affinity code that produces incorrect CPU 
bindings for jobs with cpus-per-task > 1 when the second distribution 
method is "block" (srun option -m xxxxx:block).  In the example below, 
each task requests 3 CPUs but is bound to more than 3, and the two 
tasks' CPU lists overlap:

slurm.conf settings:
SelectType=select/cons_res
SelectTypeParameters=CR_Core,CR_CORE_DEFAULT_DIST_BLOCK
TaskPlugin=task/affinity
TaskPluginParam=sched,cores
NodeName=n8 NodeHostname=bones NodeAddr=bones Sockets=2 CoresPerSocket=4 ThreadsPerCore=1 Procs=8
PartitionName=bones-only    Nodes=n8  State=UP

[sulu] (slurm) etc> srun -p bones-only -c 3 -n 2 -m block:block -l cat /proc/self/status | grep Cpus_allowed_list | sort
0: Cpus_allowed_list:   0-4,6
1: Cpus_allowed_list:   0-3,6

The attached patch fixes the problem in 2.2.1.  Here are the results after 
the patch is applied:

[sulu] (slurm) etc> srun -p bones-only -c 3 -n 2 -m block:block -l cat /proc/self/status | grep Cpus_allowed_list | sort
0: Cpus_allowed_list:   0,2,4
1: Cpus_allowed_list:   1,3,6
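
For anyone who wants to check their own output, here is a rough Python 
sketch that compares the two results above.  It is not part of the patch.  
The function names (parse_cpu_list, check_binding) are made up for this 
illustration, and the criterion it applies (each task gets exactly 
cpus-per-task CPUs and no two tasks share a CPU) is only my reading of 
the two outputs, not a statement of what the task/affinity plugin 
guarantees in general.

```python
def parse_cpu_list(text):
    """Expand a Linux CPU list string such as '0-4,6' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def check_binding(lines, cpus_per_task):
    """Return a list of problems found in 'srun -l' Cpus_allowed_list lines.

    Assumed criterion (hypothetical, see lead-in): every task is bound to
    exactly cpus_per_task CPUs and no CPU is shared between tasks.
    """
    masks = {}
    for line in lines:
        # Each line looks like '0: Cpus_allowed_list:   0-4,6'
        task, _, cpu_list = line.split(":")
        masks[int(task)] = parse_cpu_list(cpu_list)

    problems = []
    for task, cpus in sorted(masks.items()):
        if len(cpus) != cpus_per_task:
            problems.append(
                f"task {task}: {len(cpus)} CPUs, expected {cpus_per_task}"
            )

    tasks = sorted(masks)
    for i, a in enumerate(tasks):
        for b in tasks[i + 1:]:
            shared = masks[a] & masks[b]
            if shared:
                problems.append(f"tasks {a} and {b} share CPUs {sorted(shared)}")
    return problems


before = ["0: Cpus_allowed_list:   0-4,6", "1: Cpus_allowed_list:   0-3,6"]
after = ["0: Cpus_allowed_list:   0,2,4", "1: Cpus_allowed_list:   1,3,6"]

for label, lines in (("before patch", before), ("after patch", after)):
    print(label, check_binding(lines, cpus_per_task=3) or "OK")
```

With the unpatched output this reports 6 and 5 CPUs for tasks 0 and 1 
plus the shared CPUs; with the patched output it reports no problems.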

Regards,
Martin

Attachment: task_affinity_2.2.1.patch
Description: Binary data
