ok, i was not clear

by "let's consider the case where "lo" is *not* excluded via the
btl_tcp_if_exclude MCA param" i really meant
"let's consider the case where the value of the btl_tcp_if_exclude MCA
param has been forced to a list of networks/interfaces that does not
contain any reference (e.g. name or subnet) to the loopback
interface"
/* in a previous example, i did mpirun --mca btl_tcp_if_exclude ^ib0 */
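
To make the case above concrete, here is a minimal sketch (not Open MPI code) of a check that tells whether a btl_tcp_if_exclude value still refers to the loopback interface by name or by subnet. The helper name excludes_loopback and the set of loopback interface names are assumptions made for illustration only; the 127.0.0.0/8 value is the default mentioned later in this thread.

```python
import ipaddress

# Assumed loopback interface names (Linux "lo", BSD/macOS "lo0").
LOOPBACK_NAMES = {"lo", "lo0"}


def excludes_loopback(if_exclude: str) -> bool:
    """Hypothetical helper: True if the comma-separated exclude list
    mentions the loopback interface by name or by subnet."""
    for entry in if_exclude.split(","):
        entry = entry.strip()
        if not entry:
            continue
        if entry in LOOPBACK_NAMES:
            return True
        try:
            # strict=False accepts values with host bits set, e.g. 127.0.0.1/8
            net = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            # Not a CIDR entry (an interface name such as ib0): no match.
            continue
        probe = "127.0.0.1" if net.version == 4 else "::1"
        if ipaddress.ip_address(probe) in net:
            return True
    return False


print(excludes_loopback("127.0.0.0/8"))  # default value: loopback is covered
print(excludes_loopback("ib0"))          # user override: loopback is not covered
print(excludes_loopback("ib0,lo"))       # user override that keeps lo excluded
```

The point of the sketch is only that a user-supplied list replaces the default one, so a value that names ib0 alone no longer mentions the loopback at all.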

my concern is that openmpi-mca-params.conf contains
btl_tcp_if_exclude = ^ib0

then hiccups will start when Open MPI is updated, and i expect some complaints.
of course we can reply that the doc should have been read and its advice
followed, so one cannot complain just because one has been lucky so
far.
or we can do things a bit differently so we do not run into this case
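
One way to read "a bit differently" is the safety net suggested in the quoted messages below: a loopback address may be used between tasks on the same node, but never between nodes, whatever the include/exclude lists say. The following is a minimal sketch of that rule under stated assumptions; the function usable_for_peer and its peer_is_on_same_node argument are hypothetical and do not correspond to existing Open MPI code.

```python
import ipaddress


def usable_for_peer(local_addr: str, peer_is_on_same_node: bool) -> bool:
    """Hypothetical safety net: a loopback address (127.x.y.z or ::1)
    is only usable when the peer runs on the same node."""
    addr = ipaddress.ip_address(local_addr)
    if addr.is_loopback:
        return peer_is_on_same_node
    # Non-loopback addresses are left to the usual include/exclude logic.
    return True


# Disconnected laptop: both tasks are local, so lo can be used.
print(usable_for_peer("127.0.0.1", peer_is_on_same_node=True))
# Cluster: the peer is on another node, so lo is skipped.
print(usable_for_peer("127.0.0.1", peer_is_on_same_node=False))
```

This assumes proc locality is already known when addresses are matched, which is the open question raised earlier in the thread.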

/* if btl/self is excluded, the app will not start and it is trivial
to append to the error message a note asking to ensure btl/self was
not excluded.
in this case, i do not think we have a mechanism to issue a warning
message (e.g. "ensure lo is excluded") when hiccups occur. */

Cheers,

Gilles

On Thu, Sep 22, 2016 at 9:54 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
> On Wednesday, September 21, 2016, Gilles Gouaillardet
> <gilles.gouaillar...@gmail.com> wrote:
>>
>> George,
>>
>> let's consider the case where "lo" is *not* excluded via the
>> btl_tcp_if_exclude MCA param
>> (if i understand correctly, the following is also true if "lo" is
>> included via the btl_tcp_if_include MCA param)
>>
>> currently, and because of/thanks to the test that is done "deep inside"
>> 1) on a disconnected laptop, mpirun --mca btl tcp,self ... fails with
>> 2 tasks or more because tasks cannot reach each other
>> 2) on a (connected) cluster, "lo" is never used and mpirun --mca btl
>> tcp,self ... does not hang when tasks are running on two nodes or more
>>
>> with your proposal :
>> 3) on a disconnected laptop, mpirun --mca btl tcp,self ... works with
>> any number of tasks, because "lo" is used by btl/tcp
>> 4) on a (connected) cluster, "lo" is used and mpirun --mca btl
>> tcp,self ... will very likely hang when tasks are running on two nodes
>> or more
>>
>> am i right so far ?
>
>
> No, you are missing the fact that thanks to our if_exclude (which contains
> by default 127.0.0.0/8) we will never use lo (not even with my patch).
> Thus, local interfaces will remain out of reach for most users, with the
> exception of those that manually force the inclusion of lo via if_include.
>
> On a cluster where a user explicitly enables lo, there will be some hiccups
> during startup. However, as Paul states, we explicitly discourage people from
> doing that in the README. Moreover, the connection over lo will eventually
> time out, lo will be dropped, and all pending communications will be
> redirected through another TCP interface.
>
> Cheers,
> George.
>
>
>>
>> my concern is 4)
>> as Paul pointed out, we can consider this is not an issue since this
>> is a user/admin mistake, and we do not care whether this is an honest
>> one or not. that being said, this is not very friendly since something
>> that is working fine today will (likely) start hanging when your patch
>> is merged.
>>
>> my suggestion differs since it is basically 2) and 3), which can be
>> seen as the best of both worlds
>>
>> makes sense ?
>>
>> as a side note, there were some discussions about automatically adding
>> the self btl,
>> and even offering a user friendly alternative to --mca btl xxx
>> (for example --networks shm,infiniband. today Open MPI does not
>> provide any alternative to btl/self. also infiniband can be used via
>> btl/openib, mtl/mxm or libfabric, which makes it painful to
>> blacklist). i cannot remember the outcome of the discussion (if any).
>>
>> Cheers,
>>
>> Gilles
>>
>> On Thu, Sep 22, 2016 at 4:57 AM, George Bosilca <bosi...@icl.utk.edu>
>> wrote:
>> > Gilles,
>> >
>> > I don't understand how your proposal is any different than what we have
>> > today. I quote "If [locality flag is set], then we could keep a hard coded
>> > test so 127.x.y.z address (and IPv6 equivalent) are never used (even if
>> > included or not excluded) for inter node communication". We already have a
>> > hardcoded test to prevent 127.x.y.z addresses from being used. In fact we
>> > have 2 tests, one because this address range is part of our default
>> > if_exclude, and then a second test (that only does something useful in case
>> > you manually added lo* to if_include) deep inside the IP matching logic.
>> >
>> >   George.
>> >
>> >
>> > On Wed, Sep 21, 2016 at 12:36 PM, Gilles Gouaillardet
>> > <gilles.gouaillar...@gmail.com> wrote:
>> >>
>> >> George,
>> >>
>> >> i got that, and i consider my suggestion as an improvement to your
>> >> proposal.
>> >>
>> >> if i want to exclude ib0, i might want to
>> >> mpirun --mca btl_tcp_if_exclude ib0 ...
>> >>
>> >> to me, this is an honest mistake, but with your proposal, i would be
>> >> screwed when
>> >> running on more than one node because i should have
>> >> mpirun --mca btl_tcp_if_exclude ib0,lo ...
>> >>
>> >> and if this parameter is set by the admin in the system-wide config,
>> >> then this configuration must be adapted by the admin, and that could
>> >> generate some confusion.
>> >>
>> >> my suggestion simply adds a "safety net" to your proposal
>> >>
>> >> for the sake of completeness, i do not really care whether there should
>> >> be a safety net or not if localhost is explicitly included via the
>> >> btl_tcp_if_include MCA parameter
>> >>
>> >> a different and safe/friendly proposal is to add a new
>> >> btl_tcp_if_exclude_localhost MCA param, which is true by default, so
>> >> you would simply force it to false if you want to MPI_Comm_spawn or
>> >> use the tcp btl on your disconnected laptop.
>> >>
>> >> as a side note, this reminds me that btl/openib is used by default
>> >> for intra node communication between two tasks from different jobs
>> >> (neither sm nor vader can be used yet, and btl/openib has a higher
>> >> exclusivity than btl/tcp). my first impression is that i am not so
>> >> comfortable with that, and we could add yet another MCA parameter so
>> >> btl/openib disqualifies itself for intra node communications.
>> >>
>> >>
>> >> Cheers,
>> >>
>> >> Gilles
>> >>
>> >> On Thu, Sep 22, 2016 at 12:56 AM, George Bosilca <bosi...@icl.utk.edu>
>> >> wrote:
>> >> > My proposal is not about adding new ways of deciding what is local and
>> >> > what is not. I proposed to use the corresponding MCA parameters to allow
>> >> > the user to decide. More specifically, I want to be able to change the
>> >> > exclude and include MCA parameters to enable TCP over local addresses.
>> >> >
>> >> > George
>> >> >
>> >> >
>> >> > On Sep 21, 2016 4:32 PM, "Gilles Gouaillardet"
>> >> > <gilles.gouaillar...@gmail.com> wrote:
>> >> >>
>> >> >> George,
>> >> >>
>> >> >> Is proc locality already set at that time ?
>> >> >>
>> >> >> If yes, then we could keep a hard coded test so 127.x.y.z address (and
>> >> >> IPv6 equivalent) are never used (even if included or not excluded) for
>> >> >> inter node communication
>> >> >>
>> >> >> Cheers,
>> >> >>
>> >> >> Gilles
>> >> >>
>> >> >> "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote:
>> >> >> >On Sep 21, 2016, at 10:56 AM, George Bosilca <bosi...@icl.utk.edu>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> No, because 127.x.x.x is by default part of the exclude, so it will
>> >> >> >> never get into the modex. The problem today, is that even if you
>> >> >> >> manually remove it from the exclude and add it to the include, it
>> >> >> >> will not work, because of the hardcoded checks. Once we remove those
>> >> >> >> checks, things will work the way we expect, interfaces are removed
>> >> >> >> because they don't match the provided addresses.
>> >> >> >
>> >> >> >Gotcha.
>> >> >> >
>> >> >> >> I would have agreed with you if the current code was making a better
>> >> >> >> decision of what is local and what is not. But it is not, it simply
>> >> >> >> removes all 127.x.x.x interfaces (opal/util/net.c:222). Thus, the only
>> >> >> >> thing the current code does is prevent a power-user from using the
>> >> >> >> loopback (despite being explicitly enabled via the corresponding MCA
>> >> >> >> parameters).
>> >> >> >
>> >> >> >Fair enough.
>> >> >> >
>> >> >> >Should we have a keyword that can be used in the
>> >> >> > btl_tcp_if_include/exclude (e.g., "local") that removes all
>> >> >> > local-only
>> >> >> > interfaces?  I.E., all 127.x.x.x/8 interfaces *and* all local-only
>> >> >> > interfaces (e.g., bridging interfaces to local VMs and the like)?
>> >> >> >
>> >> >> >We could then replace the default "127.0.0.0/8" value in
>> >> >> > btl_tcp_if_exclude with this token, and therefore actually exclude
>> >> >> > the
>> >> >> > VM-only interfaces (which have caused some users problems in the
>> >> >> > past).
>> >> >> >
>> >> >> >--
>> >> >> >Jeff Squyres
>> >> >> >jsquy...@cisco.com
>> >> >> >For corporate legal information go to:
>> >> >> > http://www.cisco.com/web/about/doing_business/legal/cri/
>> >> >> >
>> >> >> >_______________________________________________
>> >> >> >devel mailing list
>> >> >> >devel@lists.open-mpi.org
>> >> >> >https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
>> >
>> >
>> >
>
>