George,

OK then,
I recommend we explicitly state in the README that the loopback interface can
no longer be omitted from btl_tcp_if_exclude when running on multiple nodes.
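
A minimal sketch of the kind of check such a README note could point to, assuming the parameter value is a comma-separated list of interface names and CIDR subnets. The helper name mentions_loopback, the lo/lo0 naming convention, and the stripping of a leading "^" (as written in this thread) are assumptions made for illustration; none of this is Open MPI code.

```python
import ipaddress
import re

LOOPBACK_V4 = ipaddress.ip_network("127.0.0.0/8")
LOOPBACK_V6 = ipaddress.ip_network("::1/128")


def mentions_loopback(if_exclude: str) -> bool:
    """Return True if any entry names a loopback interface or overlaps a loopback subnet.

    Hypothetical helper: it only inspects the string, it does not query the system.
    """
    for raw in if_exclude.split(","):
        # Assumption: a leading "^" (as in "^ib0" from the thread) is ignored.
        entry = raw.strip().lstrip("^")
        if not entry:
            continue
        try:
            net = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            # Not a subnet, so treat it as an interface name (assumed lo, lo0, ...).
            if re.fullmatch(r"lo\d*", entry):
                return True
            continue
        loopback = LOOPBACK_V4 if net.version == 4 else LOOPBACK_V6
        if net.overlaps(loopback):
            return True
    return False


if __name__ == "__main__":
    for value in ("ib0", "^ib0", "ib0,lo", "127.0.0.0/8,sppp"):
        print(f"{value!r}: loopback covered = {mentions_loopback(value)}")
```

With a check like this, a value such as "ib0" would be flagged as not covering the loopback interface, while "ib0,lo" would pass.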

Cheers,

Gilles

On Thursday, September 22, 2016, George Bosilca <bosi...@icl.utk.edu> wrote:

> Thanks for clarifying, I now understand what your objection/suggestion
> was. We all misconfigured OMPI at least once, but that allowed us to learn
> how to do it right.
>
> Instead of adding extra protections for corner-cases, maybe we should fix
> our exclusivity flag so that the scenario you describe would not happen.
>
>   George.
>
> PS: "btl_tcp_if_exclude = ^ib0" qualifies as an honest mistake. I wouldn't
> dare propose a new MCA param to prevent this ...
>
>
> On Wed, Sep 21, 2016 at 10:54 PM, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com> wrote:
>
>> ok, i was not clear
>>
>> by "let's consider the case where "lo" is *not* excluded via the
>> btl_tcp_if_exclude MCA param" i really meant
>> "let's consider the case where the value of the btl_tcp_if_exclude MCA
>> param has been forced to a list of networks/interfaces that does not
>> contain any reference (e.g. name or subnet) to the loopback
>> interface"
>> /* in a previous example, i did mpirun --mca btl_tcp_if_exclude ^ib0 */
>>
>> my concern is that openmpi-mca-params.conf contains
>> btl_tcp_if_exclude = ^ib0
>>
>> then hiccups will start when Open MPI is updated, and i expect some
>> complaints.
>> of course we can reply that the doc should have been read and its advice
>> followed, so one cannot complain just because he has been lucky so
>> far.
>> or we can do things a bit differently so we do not run into this case
>>
>> /* if btl/self is excluded, the app will not start and it is trivial
>> to append to the error message a note asking to ensure btl/self was
>> not excluded.
>> in this case, i do not think we have a mechanism to issue a warning
>> message (e.g. "ensure lo is excluded") when hiccups occur. */
>>
>> Cheers,
>>
>> Gilles
>>
>> On Thu, Sep 22, 2016 at 9:54 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>> > On Wednesday, September 21, 2016, Gilles Gouaillardet
>> > <gilles.gouaillar...@gmail.com> wrote:
>> >>
>> >> George,
>> >>
>> >> let's consider the case where "lo" is *not* excluded via the
>> >> btl_tcp_if_exclude MCA param
>> >> (if i understand correctly, the following is also true if "lo" is
>> >> included via the btl_tcp_if_include MCA param)
>> >>
>> >> currently, and because of/thanks to the test that is done "deep inside"
>> >> 1) on a disconnected laptop, mpirun --mca btl tcp,self ... fails with
>> >> 2 tasks or more because tasks cannot reach each other
>> >> 2) on a (connected) cluster, "lo" is never used and mpirun --mca btl
>> >> tcp,self ... does not hang when tasks are running on two nodes or more
>> >>
>> >> with your proposal :
>> >> 3) on a disconnected laptop, mpirun --mca btl tcp,self ... works with
>> >> any number of tasks, because "lo" is used by btl/tcp
>> >> 4) on a (connected) cluster, "lo" is used and mpirun --mca btl
>> >> tcp,self ... will very likely hang when tasks are running on two nodes
>> >> or more
>> >>
>> >> am i right so far ?
>> >
>> >
>> > No, you are missing the fact that thanks to our if_exclude (which contains
>> > 127.0.0.0/8 by default) we will never use lo (not even with my patch).
>> > Thus, local interfaces will remain out of reach for most users, with the
>> > exception of those who manually force the inclusion of lo via if_include.
>> >
>> > On a cluster where a user explicitly enables lo, there will be some hiccups
>> > during startup. However, as Paul states, we explicitly discourage people
>> > from doing that in the README. Second, the connection over lo will
>> > eventually time out, lo will be dropped, and all pending communications
>> > will be redirected through another TCP interface.
>> >
>> > Cheers,
>> > George.
>> >
>> >
>> >>
>> >> my concern is 4)
>> >> as Paul pointed out, we can consider this is not an issue since this
>> >> is a user/admin mistake, and we do not care whether this is an honest
>> >> one or not. that being said, this is not very friendly since something
>> >> that is working fine today will (likely) start hanging when your patch
>> >> is merged.
>> >>
>> >> my suggestion differs since it is basically 2) and 3), which can be
>> >> seen as the best of both worlds
>> >>
>> >> makes sense ?
>> >>
>> >> as a side note, there were some discussions about automatically adding
>> >> the self btl,
>> >> and even offering a user friendly alternative to --mca btl xxx
>> >> (for example --networks shm,infiniband. today Open MPI does not
>> >> provide any alternative to btl/self. also infiniband can be used via
>> >> btl/openib, mtl/mxm or libfabric, which makes it painful to
>> >> blacklist). i cannot remember the outcome of the discussion (if any).
>> >>
>> >> Cheers,
>> >>
>> >> Gilles
>> >>
>> >> On Thu, Sep 22, 2016 at 4:57 AM, George Bosilca <bosi...@icl.utk.edu>
>> >> wrote:
>> >> > Gilles,
>> >> >
>> >> > I don't understand how your proposal is any different than what we have
>> >> > today. I quote "If [locality flag is set], then we could keep a hard coded
>> >> > test so 127.x.y.z address (and IPv6 equivalent) are never used (even if
>> >> > included or not excluded) for inter node communication". We already have a
>> >> > hardcoded test to prevent 127.x.y.z addresses from being used. In fact we
>> >> > have 2 tests, one because this address range is part of our default
>> >> > if_exclude, and then a second test (that only does something useful in case
>> >> > you manually added lo* to if_include) deep inside the IP matching logic.
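
The two tests described above can be modeled roughly as follows. This is a toy model under stated assumptions, not the Open MPI implementation: the function names, the dict-based interface list, and the hardcoded_check switch are all hypothetical, and the default exclude value is the "127.0.0.0/8" quoted elsewhere in this thread.

```python
import ipaddress

# Default exclude value as quoted in this thread (assumed here for the model).
DEFAULT_IF_EXCLUDE = ["127.0.0.0/8"]


def matches(name, addr, entries):
    """True if the interface name or its address matches any entry."""
    ip = ipaddress.ip_address(addr)
    for entry in entries:
        if entry == name:
            return True
        try:
            if ip in ipaddress.ip_network(entry, strict=False):
                return True
        except ValueError:
            pass  # entry is an interface name, not a subnet
    return False


def usable_interfaces(interfaces, if_include=None, if_exclude=None,
                      hardcoded_check=True):
    """Return the interface names that survive both tests, in input order."""
    selected = []
    for name, addr in interfaces.items():
        # Test 1: the user-visible include/exclude parameters.
        if if_include is not None:
            keep = matches(name, addr, if_include)
        else:
            excl = DEFAULT_IF_EXCLUDE if if_exclude is None else if_exclude
            keep = not matches(name, addr, excl)
        # Test 2: the hard-coded loopback test "deep inside" the matching logic.
        if keep and hardcoded_check and ipaddress.ip_address(addr).is_loopback:
            keep = False
        if keep:
            selected.append(name)
    return selected


if __name__ == "__main__":
    ifaces = {"lo": "127.0.0.1", "eth0": "192.168.1.10"}
    print(usable_interfaces(ifaces))                                  # defaults
    print(usable_interfaces(ifaces, if_include=["lo", "eth0"]))       # test 2 drops lo
    print(usable_interfaces(ifaces, if_include=["lo", "eth0"],
                            hardcoded_check=False))                   # lo usable
    print(usable_interfaces(ifaces, if_exclude=["ib0"],
                            hardcoded_check=False))                   # lo slips in
```

In this model, the last call corresponds to the case raised earlier in the thread: an exclude list that was overridden without mentioning lo, combined with the removal of the hard-coded test, leaves lo selected.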
>> >> >
>> >> >   George.
>> >> >
>> >> >
>> >> > On Wed, Sep 21, 2016 at 12:36 PM, Gilles Gouaillardet
>> >> > <gilles.gouaillar...@gmail.com> wrote:
>> >> >>
>> >> >> George,
>> >> >>
>> >> >> i got that, and i consider my suggestion as an improvement to your
>> >> >> proposal.
>> >> >>
>> >> >> if i want to exclude ib0, i might want to
>> >> >> mpirun --mca btl_tcp_if_exclude ib0 ...
>> >> >>
>> >> >> to me, this is an honest mistake, but with your proposal, i would be
>> >> >> screwed when
>> >> >> running on more than one node because i should have
>> >> >> mpirun --mca btl_tcp_if_exclude ib0,lo ...
>> >> >>
>> >> >> and if this parameter is set by the admin in the system-wide config,
>> >> >> then this configuration must be adapted by the admin, and that could
>> >> >> generate some confusion.
>> >> >>
>> >> >> my suggestion simply adds a "safety net" to your proposal
>> >> >>
>> >> >> for the sake of completeness, i do not really care whether there should
>> >> >> be a safety net or not if localhost is explicitly included via the
>> >> >> btl_tcp_if_include MCA parameter
>> >> >>
>> >> >> a different and safe/friendly proposal is to add a new
>> >> >> btl_tcp_if_exclude_localhost MCA param, which is true by default, so
>> >> >> you would simply force it to false if you want to MPI_Comm_spawn or
>> >> >> use the tcp btl on your disconnected laptop.
>> >> >>
>> >> >> as a side note, this reminds me that btl/openib is used by default
>> >> >> for intra node communication between two tasks from different jobs
>> >> >> (neither sm nor vader can be used yet, and btl/openib has a higher
>> >> >> exclusivity than btl/tcp). my first impression is that i am not so
>> >> >> comfortable with that, and we could add yet another MCA parameter so
>> >> >> btl/openib disqualifies itself for intra node communications.
>> >> >>
>> >> >>
>> >> >> Cheers,
>> >> >>
>> >> >> Gilles
>> >> >>
>> >> >> On Thu, Sep 22, 2016 at 12:56 AM, George Bosilca <bosi...@icl.utk.edu>
>> >> >> wrote:
>> >> >> > My proposal is not about adding new ways of deciding what is local and
>> >> >> > what is not. I proposed to use the corresponding MCA parameters to allow
>> >> >> > the user to decide. More specifically, I want to be able to change the
>> >> >> > exclude and include MCA parameters to enable TCP over local addresses.
>> >> >> >
>> >> >> > George
>> >> >> >
>> >> >> >
>> >> >> > On Sep 21, 2016 4:32 PM, "Gilles Gouaillardet"
>> >> >> > <gilles.gouaillar...@gmail.com> wrote:
>> >> >> >>
>> >> >> >> George,
>> >> >> >>
>> >> >> >> Is proc locality already set at that time ?
>> >> >> >>
>> >> >> >> If yes, then we could keep a hard coded test so 127.x.y.z address (and
>> >> >> >> IPv6 equivalent) are never used (even if included or not excluded) for
>> >> >> >> inter node communication
>> >> >> >>
>> >> >> >> Cheers,
>> >> >> >>
>> >> >> >> Gilles
>> >> >> >>
>> >> >> >> "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote:
>> >> >> >> >On Sep 21, 2016, at 10:56 AM, George Bosilca <bosi...@icl.utk.edu>
>> >> >> >> > wrote:
>> >> >> >> >>
>> >> >> >> >> No, because 127.x.x.x is by default part of the exclude, so it will
>> >> >> >> >> never get into the modex. The problem today is that even if you
>> >> >> >> >> manually remove it from the exclude and add it to the include, it
>> >> >> >> >> will not work, because of the hardcoded checks. Once we remove those
>> >> >> >> >> checks, things will work the way we expect: interfaces are removed
>> >> >> >> >> because they don't match the provided addresses.
>> >> >> >> >
>> >> >> >> >Gotcha.
>> >> >> >> >
>> >> >> >> >> I would have agreed with you if the current code was making a better
>> >> >> >> >> decision about what is local and what is not. But it is not, it simply
>> >> >> >> >> removes all 127.x.x.x interfaces (opal/util/net.c:222). Thus, the only
>> >> >> >> >> thing the current code does is prevent a power-user from using the
>> >> >> >> >> loopback (despite it being explicitly enabled via the corresponding
>> >> >> >> >> MCA parameters).
>> >> >> >> >
>> >> >> >> >Fair enough.
>> >> >> >> >
>> >> >> >> >Should we have a keyword that can be used in the
>> >> >> >> > btl_tcp_if_include/exclude (e.g., "local") that removes all local-only
>> >> >> >> > interfaces?  I.e., all 127.x.x.x/8 interfaces *and* all local-only
>> >> >> >> > interfaces (e.g., bridging interfaces to local VMs and the like)?
>> >> >> >> >
>> >> >> >> >We could then replace the default "127.0.0.0/8" value in
>> >> >> >> > btl_tcp_if_exclude with this token, and therefore actually exclude the
>> >> >> >> > VM-only interfaces (which have caused some users problems in the
>> >> >> >> > past).
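
One way to picture the proposed keyword is as a token that gets expanded before the exclude list is matched. The sketch below is purely illustrative: the function expand_local_token is hypothetical, and the list of local-only interfaces (for example a VM bridge such as virbr0) is passed in by the caller because the thread does not say how such interfaces would be detected.

```python
LOOPBACK_SUBNET = "127.0.0.0/8"


def expand_local_token(entries, local_only_ifaces):
    """Replace each "local" token with the loopback subnet plus local-only interfaces.

    entries:           list of interface names / subnets, possibly containing "local"
    local_only_ifaces: names the caller considers local-only (assumption: supplied
                       externally, detection is out of scope for this sketch)
    """
    expanded = []
    for entry in entries:
        if entry == "local":
            expanded.append(LOOPBACK_SUBNET)
            expanded.extend(local_only_ifaces)
        else:
            expanded.append(entry)
    return expanded


if __name__ == "__main__":
    print(expand_local_token(["local", "ib0"], ["virbr0"]))
```

An exclude list that keeps the token, such as "local,ib0", would then continue to cover loopback and VM-only interfaces even when a user adds entries of their own.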
>> >> >> >> >
>> >> >> >> >--
>> >> >> >> >Jeff Squyres
>> >> >> >> >jsquy...@cisco.com
>> >> >> >> >For corporate legal information go to:
>> >> >> >> > http://www.cisco.com/web/about/doing_business/legal/cri/
>> >> >> >> >
>> >> >> >> >_______________________________________________
>> >> >> >> >devel mailing list
>> >> >> >> >devel@lists.open-mpi.org
>> >> >> >> >https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
