On Wednesday, September 21, 2016, Gilles Gouaillardet
<gilles.gouaillar...@gmail.com> wrote:

> George,
>
> let's consider the case where "lo" is *not* excluded via the
> btl_tcp_if_exclude MCA param
> (if i understand correctly, the following is also true if "lo" is
> included via the btl_tcp_if_include MCA param)
>
> currently, and because of/thanks to the test that is done "deep inside"
> 1) on a disconnected laptop, mpirun --mca btl tcp,self ... fails with
> 2 tasks or more because tasks cannot reach each other
> 2) on a (connected) cluster, "lo" is never used and mpirun --mca btl
> tcp,self ... does not hang when tasks are running on two nodes or more
>
> with your proposal :
> 3) on a disconnected laptop, mpirun --mca btl tcp,self ... works with
> any number of tasks, because "lo" is used by btl/tcp
> 4) on a (connected) cluster, "lo" is used and mpirun --mca btl
> tcp,self ... will very likely hang when tasks are running on two nodes
> or more
>
> am i right so far ?


No, you are missing the fact that thanks to our if_exclude (which contains
127.0.0.0/8 by default) we will never use lo, not even with my patch.
Thus, local interfaces will remain out of reach for most users, with the
exception of those who manually force the inclusion of lo via if_include.
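
To make the filtering concrete, here is a minimal Python sketch of the
selection rule described above. It is not the btl/tcp code: the function
name usable_interfaces, the sample interface table, and the simplified rule
(an include list wins when given, otherwise the exclude list applies, with
127.0.0.0/8 as its default) are assumptions for illustration, and it models
the behaviour with the hardcoded 127.x.x.x check removed.

```python
import ipaddress

# Default exclude list as discussed in this thread (assumed for the sketch).
DEFAULT_IF_EXCLUDE = ["127.0.0.0/8"]


def _matches(name, addr, spec):
    """Return True if an interface matches one entry (name or CIDR)."""
    if "/" in spec:
        network = ipaddress.ip_network(spec, strict=False)
        return ipaddress.ip_address(addr) in network
    return name == spec


def usable_interfaces(interfaces, if_include=None, if_exclude=None):
    """Filter a {name: IPv4 address} table; return the surviving names, sorted."""
    if if_include:
        # An explicit include list is the only thing that counts.
        keep = [n for n, a in interfaces.items()
                if any(_matches(n, a, s) for s in if_include)]
    else:
        # A user-supplied exclude list replaces the default entirely.
        exclude = DEFAULT_IF_EXCLUDE if if_exclude is None else if_exclude
        keep = [n for n, a in interfaces.items()
                if not any(_matches(n, a, s) for s in exclude)]
    return sorted(keep)


if __name__ == "__main__":
    node = {"lo": "127.0.0.1", "eth0": "192.168.1.10", "ib0": "10.0.0.10"}
    print(usable_interfaces(node))
    print(usable_interfaces(node, if_exclude=["ib0"]))
    print(usable_interfaces(node, if_include=["lo"]))
```

With the sample table, the default keeps eth0 and ib0. Passing only ib0 as
the exclude list replaces the default, so lo comes back unless it is listed
as well.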

On a cluster where a user explicitly enables lo, there will be some hiccups
during startup. However, as Paul states, we explicitly discourage people from
doing that in the README. Moreover, the connection over lo will eventually
time out, at which point lo will be dropped and all pending communications
will be redirected through another TCP interface.
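
The fallback can be pictured with a short Python sketch. This is only an
illustration of the idea, not how btl/tcp implements it: the function
send_with_failover and the try_connect callback (which returns False when a
connection attempt times out) are hypothetical names.

```python
def send_with_failover(interfaces, pending, try_connect):
    """Try interfaces in order; drop each one whose connection times out.

    Returns the interface finally used and the pending messages routed to it.
    """
    remaining = list(interfaces)
    while remaining:
        iface = remaining[0]
        if try_connect(iface):
            # Everything still pending goes through the working interface.
            return iface, [(iface, msg) for msg in pending]
        # Connection timed out: drop this interface and try the next one.
        remaining.pop(0)
    raise ConnectionError("no usable TCP interface left")


if __name__ == "__main__":
    # Across two nodes, lo can never reach the peer; eth0 can.
    reachable = lambda iface: iface != "lo"
    print(send_with_failover(["lo", "eth0"], ["msg0", "msg1"], reachable))
```

The cost of enabling lo on a cluster is the time spent waiting for that
first timeout, which is where the startup hiccups come from.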

Cheers,
George.


> my concern is 4)
> as Paul pointed out, we can consider this is not an issue since this
> is a user/admin mistake, and we do not care whether this is an honest
> one or not. that being said, this is not very friendly since something
> that is working fine today will (likely) start hanging when your patch
> is merged.
>
> my suggestion differs since it is basically 2) and 3), which can be
> seen as the best of both worlds
>
> makes sense ?
>
> as a side note, there were some discussions about automatically adding
> the self btl,
> and even offering a user friendly alternative to --mca btl xxx
> (for example --networks shm,infiniband. today Open MPI does not
> provide any alternative to btl/self. also infiniband can be used via
> btl/openib, mtl/mxm or libfabric, which makes it painful to
> blacklist). i cannot remember the outcome of the discussion (if any).
>
> Cheers,
>
> Gilles
>
> On Thu, Sep 22, 2016 at 4:57 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
> > Gilles,
> >
> > I don't understand how your proposal is any different than what we have
> > today. I quote "If [locality flag is set], then we could keep a hard coded
> > test so 127.x.y.z address (and IPv6 equivalent) are never used (even if
> > included or not excluded) for inter node communication". We already have a
> > hardcoded test to prevent 127.x.y.z addresses from being used. In fact we
> > have 2 tests, one because this address range is part of our default
> > if_exclude, and then a second test (that only does something useful in case
> > you manually added lo* to if_include) deep inside the IP matching logic.
> >
> >   George.
> >
> >
> > On Wed, Sep 21, 2016 at 12:36 PM, Gilles Gouaillardet
> > <gilles.gouaillar...@gmail.com> wrote:
> >>
> >> George,
> >>
> >> i got that, and i consider my suggestion as an improvement to your
> >> proposal.
> >>
> >> if i want to exclude ib0, i might want to
> >> mpirun --mca btl_tcp_if_exclude ib0 ...
> >>
> >> to me, this is an honest mistake, but with your proposal, i would be
> >> screwed when
> >> running on more than one node because i should have
> >> mpirun --mca btl_tcp_if_exclude ib0,lo ...
> >>
> >> and if this parameter is set by the admin in the system-wide config,
> >> then this configuration must be adapted by the admin, and that could
> >> generate some confusion.
> >>
> >> my suggestion simply adds a "safety net" to your proposal
> >>
> >> for the sake of completeness, i do not really care whether there should
> >> be a safety net or not if localhost is explicitly included via the
> >> btl_tcp_if_include MCA parameter
> >>
> >> a different and safe/friendly proposal is to add a new
> >> btl_tcp_if_exclude_localhost MCA param, which is true by default, so
> >> you would simply force it to false if you want to MPI_Comm_spawn or
> >> use the tcp btl on your disconnected laptop.
> >>
> >> as a side note, this reminds me that btl/openib is used by default
> >> for intra node communication between two tasks from different jobs
> >> (neither sm nor vader can be used yet, and btl/openib has a higher
> >> exclusivity than btl/tcp). my first impression is that i am not so
> >> comfortable with that, and we could add yet another MCA parameter so
> >> btl/openib disqualifies itself for intra node communications.
> >>
> >>
> >> Cheers,
> >>
> >> Gilles
> >>
> >> On Thu, Sep 22, 2016 at 12:56 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
> >> > My proposal is not about adding new ways of deciding what is local and
> >> > what is not. I proposed to use the corresponding MCA parameters to allow
> >> > the user to decide. More specifically, I want to be able to change the
> >> > exclude and include MCA parameters to enable TCP over local addresses.
> >> >
> >> > George
> >> >
> >> >
> >> > On Sep 21, 2016 4:32 PM, "Gilles Gouaillardet"
> >> > <gilles.gouaillar...@gmail.com> wrote:
> >> >>
> >> >> George,
> >> >>
> >> >> Is proc locality already set at that time ?
> >> >>
> >> >> If yes, then we could keep a hard coded test so 127.x.y.z address (and
> >> >> IPv6 equivalent) are never used (even if included or not excluded) for
> >> >> inter node communication
> >> >>
> >> >> Cheers,
> >> >>
> >> >> Gilles
> >> >>
> >> >> "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote:
> >> >> >On Sep 21, 2016, at 10:56 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
> >> >> >>
> >> >> >> No, because 127.x.x.x is by default part of the exclude, so it will
> >> >> >> never get into the modex. The problem today is that even if you
> >> >> >> manually remove it from the exclude and add it to the include, it
> >> >> >> will not work, because of the hardcoded checks. Once we remove those
> >> >> >> checks, things will work the way we expect: interfaces are removed
> >> >> >> because they don't match the provided addresses.
> >> >> >
> >> >> >Gotcha.
> >> >> >
> >> >> >> I would have agreed with you if the current code was making a better
> >> >> >> decision of what is local and what is not. But it is not; it simply
> >> >> >> removes all 127.x.x.x interfaces (opal/util/net.c:222). Thus, the
> >> >> >> only thing the current code does is prevent a power-user from using
> >> >> >> the loopback (despite it being explicitly enabled via the
> >> >> >> corresponding MCA parameters).
> >> >> >
> >> >> >Fair enough.
> >> >> >
> >> >> >Should we have a keyword that can be used in the
> >> >> > btl_tcp_if_include/exclude (e.g., "local") that removes all local-only
> >> >> > interfaces?  I.e., all 127.x.x.x/8 interfaces *and* all local-only
> >> >> > interfaces (e.g., bridging interfaces to local VMs and the like)?
> >> >> >
> >> >> >We could then replace the default "127.0.0.0/8" value in
> >> >> > btl_tcp_if_exclude with this token, and therefore actually exclude the
> >> >> > VM-only interfaces (which have caused some users problems in the past).
> >> >> >
> >> >> >--
> >> >> >Jeff Squyres
> >> >> >jsquy...@cisco.com
> >> >> >For corporate legal information go to:
> >> >> > http://www.cisco.com/web/about/doing_business/legal/cri/
> >> >> >
> >> >> >_______________________________________________
> >> >> >devel mailing list
> >> >> >devel@lists.open-mpi.org
> >> >> >https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
>