hi all,
Many time, there are no difference between "proc" and "slot". But when
you use "mpirun -cpus-per-proc X", slot have X procs.
On orte/mca/rmaps/base/rmaps_base_common_mappers.c, there are a
confusion between proc and slot. this little error impact mapping action:
On OMPI last version with 32 cores compute node:
salloc -n 8 -c 8 mpirun -bind-to-core -bycore ./a.out
[rank:0]<stdout>: host:compute18
[rank:1]<stdout>: host:compute19
[rank:2]<stdout>: host:compute18
[rank:3]<stdout>: host:compute19
[rank:4]<stdout>: host:compute18
[rank:5]<stdout>: host:compute19
[rank:6]<stdout>: host:compute18
[rank:7]<stdout>: host:compute19
with patch:
[rank:0]<stdout>: host:compute18
[rank:1]<stdout>: host:compute18
[rank:2]<stdout>: host:compute18
[rank:3]<stdout>: host:compute18
[rank:4]<stdout>: host:compute19
[rank:5]<stdout>: host:compute19
[rank:6]<stdout>: host:compute19
[rank:7]<stdout>: host:compute19
Can you say, if my patch is correct ?
Thanks you
Damien
diff -r 97ad060b8e48 orte/mca/rmaps/base/rmaps_base_common_mappers.c
--- a/orte/mca/rmaps/base/rmaps_base_common_mappers.c Thu Oct 14 11:05:54
2010 +0200
+++ b/orte/mca/rmaps/base/rmaps_base_common_mappers.c Mon Oct 18 13:57:22
2010 +0200
@@ -191,7 +191,8 @@
if (0 == node->slots_alloc) {
num_procs_to_assign = 1;
} else {
- num_possible_procs = node->slots_alloc /
jdata->map->cpus_per_rank;
+ //In rmaps_base_common_mappers 'num_possible_procs' define
number of ranks
+ num_possible_procs = node->slots_alloc;
if (0 == num_possible_procs) {
num_procs_to_assign = 1;
} else {
@@ -199,7 +200,8 @@
}
}
} else {
- num_possible_procs = (node->slots_alloc - node->slots_inuse) /
jdata->map->cpus_per_rank;
+ //In rmaps_base_common_mappers 'num_possible_procs' define number
of ranks
+ num_possible_procs = (node->slots_alloc - node->slots_inuse);
if (0 == num_possible_procs) {
num_procs_to_assign = 1;
} else {
diff -r 97ad060b8e48 orte/mca/rmaps/base/rmaps_base_support_fns.c
--- a/orte/mca/rmaps/base/rmaps_base_support_fns.c Thu Oct 14 11:05:54
2010 +0200
+++ b/orte/mca/rmaps/base/rmaps_base_support_fns.c Mon Oct 18 13:57:22
2010 +0200
@@ -339,7 +339,7 @@
ORTE_JOBID_PRINT(jdata->jobid), current_node->name));
/* Be sure to demarcate the slots for this proc as claimed from the node */
- current_node->slots_inuse += cpus_per_rank;
+ current_node->slots_inuse += 1;
/* see if this node is oversubscribed now */
if (current_node->slots_inuse > current_node->slots) {