Folks,
currently, the default mapping policy on master is different from the one in v2.x.
my preliminary question is: when will the master mapping policy land
in the release branch? v2.0.0? v2.x? v3.0.0?
here are some commands and their output (both n0 and n1 have 16 cores
each; mpirun runs on n0).
first, let's force 2 slots per node via the --host parameter, and play
with the mapping:
[gilles@n0 ~]$ mpirun --tag-output --host n0:2,n1:2 -np 4 hostname | sort
[1,0]<stdout>:n0
[1,1]<stdout>:n0
[1,2]<stdout>:n1
[1,3]<stdout>:n1
[gilles@n0 ~]$ mpirun --tag-output --host n0:2,n1:2 -np 4 --map-by socket hostname | sort
[1,0]<stdout>:n0
[1,1]<stdout>:n0
[1,2]<stdout>:n1
[1,3]<stdout>:n1
/* so far so good: the default mapping is --map-by socket, and the
mapping looks correct to me */
[gilles@n0 ~]$ mpirun --tag-output --host n0:2,n1:2 -np 4 --map-by node hostname | sort
[1,0]<stdout>:n0
[1,1]<stdout>:n1
[1,2]<stdout>:n0
[1,3]<stdout>:n1
/* mapping looks correct to me too */
now let's force 4 slots per node:
[gilles@n0 ~]$ mpirun --tag-output --host n0:4,n1:4 -np 4 --map-by node hostname | sort
[1,0]<stdout>:n0
[1,1]<stdout>:n1
[1,2]<stdout>:n0
[1,3]<stdout>:n1
/* same output as previously; looks correct to me */
[gilles@n0 ~]$ mpirun --tag-output --host n0:4,n1:4 -np 4 --map-by socket hostname | sort
[1,0]<stdout>:n0
[1,1]<stdout>:n0
[1,2]<stdout>:n0
[1,3]<stdout>:n0
/* all tasks run on n0, even though i explicitly requested --map-by socket;
that looks wrong to me */
[gilles@n0 ~]$ mpirun --tag-output --host n0:4,n1:4 -np 4 hostname | sort
[1,0]<stdout>:n0
[1,1]<stdout>:n0
[1,2]<stdout>:n0
[1,3]<stdout>:n0
/* same output as previously, which makes sense since the default
mapping policy is --map-by socket;
but all tasks run on n0, which still looks wrong to me */
if i do not force the number of slots, i get the same output (16 cores
are detected on each node) regardless of the --map-by socket option.
it seems --map-by core is used, no matter what we pass on the command line.
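fwiw, rather than inferring the mapping from the hostname output, the mapper's decision can be displayed directly; this is how i would double check (assuming the --display-map and --report-bindings options behave on master as they do in the release series):

```shell
# show the computed process map and the actual bindings,
# instead of guessing the layout from hostname output
# (--display-map / --report-bindings assumed unchanged on master)
mpirun --display-map --report-bindings \
       --host n0:4,n1:4 -np 4 --map-by socket hostname
```

this requires a working MPI installation on both nodes, of course, so i cannot paste the output here.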
in the last cases, is running all tasks on one node the intended behavior?
if so, which mapping option can be used to run the first 2 tasks on the
first node and the last 2 tasks on the second node?
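for what it is worth, i would naively expect the ppr (processes per resource) mapping policy to produce that layout; this is a hypothetical workaround based on the v1.8+ man page, which i have not verified against the new master policy:

```shell
# hypothetical workaround: explicitly map 2 processes per node,
# so ranks 0-1 land on n0 and ranks 2-3 on n1
# (ppr:N:node syntax assumed from the released man pages, untested on master)
mpirun --tag-output --host n0:4,n1:4 -np 4 --map-by ppr:2:node hostname | sort
```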
Cheers,
Gilles