Unfortunately, my previous patch for this problem breaks the 
--ntasks-per-node option.  Here is a corrected version.
Regards,
Martin

Index: src/plugins/select/cons_res/job_test.c
===================================================================
RCS file: /cvsroot/slurm/slurm/src/plugins/select/cons_res/job_test.c,v
retrieving revision 1.1.1.28
diff -u -r1.1.1.28 job_test.c
--- src/plugins/select/cons_res/job_test.c      3 Mar 2011 19:18:10 -0000      1.1.1.28
+++ src/plugins/select/cons_res/job_test.c      8 Apr 2011 22:08:29 -0000
@@ -512,7 +512,8 @@
        } else {
                j = avail_cpus / cpus_per_task;
                num_tasks = MIN(num_tasks, j);
-               avail_cpus = num_tasks * cpus_per_task;
+               if (job_ptr->details->ntasks_per_node)
+                   avail_cpus = num_tasks * cpus_per_task;
        }
 
        if ((job_ptr->details->ntasks_per_node &&

Martin Perry/US/BULL
04/08/2011 02:15 PM

To: [email protected]
Cc: [email protected], "[email protected]" <[email protected]>
Subject: cons_res core allocation problem

With cons_res, the default method for allocating cores within nodes is 
cyclic allocation across sockets, as shown in the following examples.

Node bones (n8) CPU layout (Slurm numbering):
Socket 0: CPU_IDs 0,1,2,3
Socket 1: CPU_IDs 4,5,6,7

SelectType=select/cons_res
SelectTypeParameters=CR_Core

[sulu] (slurm) etc> srun -p bones-only -n6 -c1 scontrol --details show job 
| grep CPU_IDs
     ...
     Nodes=n8 CPU_IDs=0-2,4-6 Mem=0

[sulu] (slurm) etc> srun -p bones-only -n3 -c2 scontrol --details show job 
| grep CPU_IDs
     ...
     Nodes=n8 CPU_IDs=0-2,4-6 Mem=0

However, for certain combinations of node layout and --cpus-per-task > 1, 
the default is not honored.  Slurm uses block allocation instead:

[sulu] (slurm) etc> srun -p bones-only -n2 -c3 scontrol --details show job 
| grep CPU_IDs
     ...
     Nodes=n8 CPU_IDs=0-5 Mem=0

The problem appears to be in function _allocate_cores in 
src/plugins/select/cons_res/job_test.c: it sometimes returns an incorrect 
value for the number of CPUs that can be used on the node. The patch below 
fixes the problem in Slurm 2.2.4.

Regards,
Martin

Index: src/plugins/select/cons_res/job_test.c
===================================================================
RCS file: /cvsroot/slurm/slurm/src/plugins/select/cons_res/job_test.c,v
retrieving revision 1.1.1.28
diff -u -r1.1.1.28 job_test.c
--- src/plugins/select/cons_res/job_test.c      3 Mar 2011 19:18:10 -0000      1.1.1.28
+++ src/plugins/select/cons_res/job_test.c      8 Apr 2011 20:37:29 -0000
@@ -508,11 +508,10 @@
                num_tasks = MIN(num_tasks, job_ptr->details->ntasks_per_node);
 
        if (cpus_per_task < 2) {
-               avail_cpus = num_tasks;
+               num_tasks = avail_cpus;
        } else {
                j = avail_cpus / cpus_per_task;
                num_tasks = MIN(num_tasks, j);
-               avail_cpus = num_tasks * cpus_per_task;
        }
 
        if ((job_ptr->details->ntasks_per_node &&
