You are welcome to raise the question of default mapping behavior on master yet
again, but please do so on a separate thread so we can make sense of it.
Note that I will not be making any further modifications to that behavior, so
if someone feels strongly that they want it to change, please go ahead
Thanks Nathan,
sorry for the confusion, what I observed was a consequence of something
else ...
mpirun --host n0,n1 -np 4 a.out
/* n0 and n1 have 16 cores each */
runs 4 instances of a.out on n0 (and nothing on n1)
If I run with -np 32, then 16 tasks run on each node.
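FWIW, if the intent is to spread the tasks across both nodes, mapping by
node should do it (assuming the usual mpirun option; untested here):

    mpirun --host n0,n1 --map-by node -np 4 a.out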
With v2.x, add_procs is always called at least once. This is how we set up
shared-memory communication. It will then be invoked on demand for non-local
peers with the reachability argument set to NULL (because the bitmask
doesn't provide any benefit when adding only 1 peer).
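For reference, the callback prototype looks roughly like this (paraphrased
from opal/mca/btl/btl.h; check the header for the exact signature):

    typedef int (*mca_btl_base_module_add_procs_fn_t)(
        struct mca_btl_base_module_t *btl,
        size_t nprocs,
        struct opal_proc_t **procs,
        struct mca_btl_base_endpoint_t **endpoints,
        struct opal_bitmap_t *reachable   /* may be NULL on the lazy path */
    );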
-Nathan
On Tue, May 17, 2016
Sounds like something has been broken - what Jeff describes is the intended
behavior
> On May 16, 2016, at 8:00 AM, Gilles Gouaillardet
> wrote:
>
> Jeff,
>
> this is not what I observed
> (tcp btl, 2 to 4 nodes with one task per node, cutoff=0)
> the add_procs
Jeff,
this is not what I observed
(tcp btl, 2 to 4 nodes with one task per node, cutoff=0)
the add_procs of the tcp btl is invoked once with the 4 tasks.
I checked the sources and found the cutoff only controls whether the modex is
invoked once for all tasks at init time, or on demand.
Cheers,
Gilles
On Monday,
We changed the way BTL add_procs is invoked on master and v2.x for scalability
reasons.
In short: add_procs is only invoked the first time you talk to a given peer.
The cutoff switch is an override to that -- if the size of MPI_COMM_WORLD is
less than the cutoff, we revert to the old behavior of
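Conceptually, the dispatch is something like this (illustrative pseudo-C,
not the literal ompi source; mpi_add_procs_cutoff is the real MCA parameter
name, the other identifiers here are made up):

    if (ompi_comm_size(MPI_COMM_WORLD) < add_procs_cutoff) {
        /* old behavior: one bulk add_procs() with every peer and a
         * reachability bitmap for the btl to fill in */
        btl->btl_add_procs(btl, nprocs, all_procs, endpoints, reachable);
    } else {
        /* new behavior: add_procs() is deferred until the first send
         * to a given peer, and reachability is passed as NULL */
        btl->btl_add_procs(btl, 1, &peer, &endpoint, NULL);
    }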
It seems I misunderstood some things ...
add_procs is always invoked, regardless of the cutoff value.
The cutoff is used to retrieve process info via the modex on demand vs. at
init time.
Someone please correct me and/or elaborate if needed.
Cheers,
Gilles
On Monday, May 16, 2016, Gilles
I cannot reproduce this behavior.
Note that mca_btl_tcp_add_procs is invoked once per tcp component (e.g.,
once per physical NIC), so you might want to explicitly select one NIC:
mpirun --mca btl_tcp_if_include xxx ...
My printf output is the same regardless of the mpi_add_procs_cutoff value.
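For example (the interface name eth0 below is illustrative; substitute one
that exists on your nodes):

    mpirun --mca btl tcp,self --mca btl_tcp_if_include eth0 -np 2 a.out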
Sorry, I accidentally pressed 'Send' before I was done writing the last
mail. What I wanted to ask was: what is the parameter mpi_add_procs_cutoff,
and why does setting it seem to make a difference in the code path but not
in the end result of the program? How would it help me debug my problem?
Thank
Hello Gilles
Setting -mca mpi_add_procs_cutoff 1024 indeed makes a difference to the
output, as follows:
With -mca mpi_add_procs_cutoff 1024:
reachable = 0x1
(Note that add_procs was called once and the value of 'reachable' is
correct.)
Without -mca mpi_add_procs_cutoff 1024
reachable =
At first glance, that seems a bit odd...
Are you sure you are printing the reachable bitmap correctly?
I would suggest you add some instrumentation to understand what happens
(e.g., a printf before opal_bitmap_set_bit(), and at any place that could
prevent it from being called).
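Something like this, for instance (a rough sketch only; it assumes the
opal_bitmap API from opal/class/opal_bitmap.h, and that 'reachable' and the
loop index 'i' are in scope inside mca_btl_tcp_add_procs()):

    if (NULL != reachable) {
        /* mark peer i reachable and trace it */
        int rc = opal_bitmap_set_bit(reachable, i);
        fprintf(stderr, "tcp add_procs: set bit %d in reachable (rc=%d)\n",
                (int) i, rc);
    } else {
        /* reachable can legitimately be NULL on the lazy path */
        fprintf(stderr, "tcp add_procs: reachable bitmap is NULL\n");
    }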
one more thing ...
now, master
Hello Gilles
Thanks for jumping in to help again. Actually, I had already tried some of
your suggestions before asking for help.
I have several interconnects that can run both the openib and tcp BTLs. To
simplify things, I explicitly selected TCP:
mpirun -np 2 -hostfile ~/hostfile -mca pml ob1 -mca
Did you check the add_procs callbacks?
(e.g., mca_btl_tcp_add_procs() for the tcp btl)
This is where the reachable bitmap is set, and I guess this is what you are
looking for.
Keep in mind that if several BTLs can be used, the one with the highest
exclusivity is used
(e.g. tcp is never used if
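Roughly (an illustrative pseudo-C sketch, not the literal mca_bml_r2
source; btl_exclusivity is a real field of mca_btl_base_module_t):

    /* when two btl modules both reach the same peer, the bml keeps the
     * one advertising the higher btl_exclusivity */
    if (new_btl->btl_exclusivity > selected_btl->btl_exclusivity) {
        selected_btl = new_btl;   /* e.g. openib displaces tcp */
    }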
Hello all
I have been struggling with this issue for a while and figured it might be
a good idea to ask for help.
Where (in the code path) is the connectivity map created?
I can see that it is *used* in mca_bml_r2_endpoint_add_btl(), but obviously
I am not setting it up right, because this