Gilles,

I don't understand how your proposal is any different than what we have
today. I quote "If [locality flag is set], then we could keep a hard coded
test so 127.x.y.z address (and IPv6 equivalent) are never used (even if
included or not excluded) for inter node communication". We already have a
hardcoded test to prevent 127.x.y.z addresses from being used. In fact we
have two tests: one because this address range is part of our default
if_exclude, and a second one (which only does something useful if you
manually added lo* to if_include) deep inside the IP matching logic.
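
To make the two tests concrete, here is a rough sketch of the logic in
Python. It is only an illustration, not the Open MPI code: the function
names are hypothetical, the exclude list is assumed to hold CIDR ranges only
(the real parameter also accepts interface names), and the default value is
the "127.0.0.0/8" mentioned in this thread.

```python
import ipaddress

# Assumed default exclude list, taken from the value quoted in this thread.
DEFAULT_IF_EXCLUDE = ["127.0.0.0/8"]


def excluded_by_list(addr, if_exclude):
    """Test 1: the address falls inside one of the excluded CIDR ranges."""
    ip = ipaddress.ip_address(addr)
    for entry in if_exclude:
        net = ipaddress.ip_network(entry)
        # Only compare addresses and networks of the same IP version.
        if ip.version == net.version and ip in net:
            return True
    return False


def is_loopback(addr):
    """Test 2: hard-coded loopback check (127.0.0.0/8 or IPv6 ::1)."""
    return ipaddress.ip_address(addr).is_loopback


def usable_today(addr, if_exclude=DEFAULT_IF_EXCLUDE):
    """Both tests apply, so loopback is rejected whatever the user sets."""
    return not excluded_by_list(addr, if_exclude) and not is_loopback(addr)


def usable_without_hardcoded_check(addr, if_exclude=DEFAULT_IF_EXCLUDE):
    """Only the exclude list decides, so the user can opt in to loopback."""
    return not excluded_by_list(addr, if_exclude)
```

With an empty exclude list, usable_today("127.0.0.1", if_exclude=[]) is
still False because of the second test, while
usable_without_hardcoded_check("127.0.0.1", if_exclude=[]) is True.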

  George.


On Wed, Sep 21, 2016 at 12:36 PM, Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:

> George,
>
> i got that, and i consider my suggestion as an improvement to your
> proposal.
>
> if i want to exclude ib0, i might want to
> mpirun --mca btl_tcp_if_exclude ib0 ...
>
> to me, this is an honest mistake, but with your proposal, i would be
> screwed when
> running on more than one node because i should have
> mpirun --mca btl_tcp_if_exclude ib0,lo ...
>
> and if this parameter is set by the admin in the system-wide config,
> then this configuration must be adapted by the admin, and that could
> generate some confusion.
>
> my suggestion simply adds a "safety net" to your proposal
>
> for the sake of completeness, i do not really care whether there should
> be a safety net or not if localhost is explicitly included via the
> btl_tcp_if_include MCA parameter
>
> a different and safe/friendly proposal is to add a new
> btl_tcp_if_exclude_localhost MCA param, which is true by default, so
> you would simply force it to false if you want to MPI_Comm_spawn or
> use the tcp btl on your disconnected laptop.
>
> as a side note, this reminds me that btl/openib is used by default
> for intra node communication between two tasks from different jobs
> (neither sm nor vader can be used yet, and btl/openib has a higher
> exclusivity than btl/tcp). my first impression is that i am not so
> comfortable with that, and we could add yet another MCA parameter so
> btl/openib disqualifies itself for intra node communications.
>
>
> Cheers,
>
> Gilles
>
> On Thu, Sep 22, 2016 at 12:56 AM, George Bosilca <bosi...@icl.utk.edu>
> wrote:
> > My proposal is not about adding new ways of deciding what is local and
> > what not. I proposed to use the corresponding MCA parameters to allow
> > the user to decide. More specifically, I want to be able to change the
> > exclude and include MCA to enable TCP over local addresses.
> >
> > George
> >
> >
> > On Sep 21, 2016 4:32 PM, "Gilles Gouaillardet"
> > <gilles.gouaillar...@gmail.com> wrote:
> >>
> >> George,
> >>
> >> Is proc locality already set at that time ?
> >>
> >> If yes, then we could keep a hard coded test so 127.x.y.z address (and
> >> IPv6 equivalent) are never used (even if included or not excluded) for
> >> inter node communication
> >>
> >> Cheers,
> >>
> >> Gilles
> >>
> >> "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote:
> >> >On Sep 21, 2016, at 10:56 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
> >> >>
> >> >> No, because 127.x.x.x is by default part of the exclude, so it will
> >> >> never get into the modex. The problem today, is that even if you
> >> >> manually remove it from the exclude and add it to the include, it will
> >> >> not work, because of the hardcoded checks. Once we remove those checks,
> >> >> things will work the way we expect, interfaces are removed because
> >> >> they don't match the provided addresses.
> >> >
> >> >Gotcha.
> >> >
> >> >> I would have agreed with you if the current code was making a better
> >> >> decision of what is local and what not. But it is not, it simply
> >> >> removes all 127.x.x.x interfaces (opal/util/net.c:222). Thus, the only
> >> >> thing the current code does is prevent a power-user from using the
> >> >> loopback (despite it being explicitly enabled via the corresponding
> >> >> MCA parameters).
> >> >
> >> >Fair enough.
> >> >
> >> >Should we have a keyword that can be used in the
> >> > btl_tcp_if_include/exclude (e.g., "local") that removes all local-only
> >> > interfaces?  I.E., all 127.x.x.x/8 interfaces *and* all local-only
> >> > interfaces (e.g., bridging interfaces to local VMs and the like)?
> >> >
> >> >We could then replace the default "127.0.0.0/8" value in
> >> > btl_tcp_if_exclude with this token, and therefore actually exclude the
> >> > VM-only interfaces (which have caused some users problems in the past).
> >> >
> >> >--
> >> >Jeff Squyres
> >> >jsquy...@cisco.com
> >> >For corporate legal information go to:
> >> > http://www.cisco.com/web/about/doing_business/legal/cri/
> >> >
> >> >_______________________________________________
> >> >devel mailing list
> >> >devel@lists.open-mpi.org
> >> >https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
