George,

OK then, I recommend we explicitly state in the README that the loopback interface can no longer be omitted from btl_tcp_if_exclude when running on multiple nodes.
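
To illustrate what such a README note would ask users to verify, here is a minimal sketch of a check on a btl_tcp_if_exclude value. It is not part of Open MPI: `mentions_loopback` is a hypothetical helper, the interface name "lo" is an assumption (loopback is named differently on some systems), and the value is treated as a plain comma-separated list of interface names and CIDR subnets.

```python
import ipaddress


def mentions_loopback(if_exclude, loopback_names=("lo",)):
    """Return True if a comma-separated exclude list refers to loopback,
    either by interface name or by a subnet made only of loopback addresses.

    Hypothetical helper for illustration; Open MPI does not provide it.
    """
    for entry in if_exclude.split(","):
        entry = entry.strip()
        if not entry:
            continue
        if entry in loopback_names:
            return True
        try:
            # strict=False accepts host bits, e.g. "127.0.0.1/8"
            net = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            # Not an address or subnet, so it is an interface name such as "ib0"
            continue
        if net.is_loopback:
            return True
    return False


if __name__ == "__main__":
    for value in ("ib0", "ib0,lo", "127.0.0.0/8,ib0"):
        print(value, "->", mentions_loopback(value))
```

Under these assumptions, "ib0" alone is reported as not covering loopback, while "ib0,lo" and "127.0.0.0/8,ib0" are reported as covering it.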
Cheers,

Gilles

On Thursday, September 22, 2016, George Bosilca <bosi...@icl.utk.edu> wrote:

> Thanks for clarifying, I now understand what your objection/suggestion
> was. We all misconfigured OMPI at least once, but that allowed us to learn
> how to do it right.
>
> Instead of adding extra protections for corner-cases, maybe we should fix
> our exclusivity flag so that the scenario you describe would not happen.
>
> George.
>
> PS: "btl_tcp_if_exclude = ^ib0" qualifies as an honest mistake. I wouldn't
> dare proposing a new MCA param to prevent this ...
>
>
> On Wed, Sep 21, 2016 at 10:54 PM, Gilles Gouaillardet
> <gilles.gouaillar...@gmail.com> wrote:
>
>> ok, i was not clear
>>
>> by "let's consider the case where "lo" is *not* excluded via the
>> btl_tcp_if_exclude MCA param" i really meant
>> "let's consider the case where the value of the btl_tcp_if_exclude MCA
>> param has been forced to a list of network/interfaces that do not
>> contain any reference (e.g. name or subnet) to the loopback
>> interface"
>> /* in a previous example, i did mpirun --mca btl_tcp_if_exclude ^ib0 */
>>
>> my concern is that openmpi-mca-params.conf contains
>> btl_tcp_if_exclude = ^ib0
>>
>> then hiccups will start when Open MPI is updated, and i expect some
>> complaints.
>> of course we can reply, doc should have been read and advice
>> followed, so one cannot complain just because he has been lucky so
>> far.
>> or we can do things a bit differently so we do not run into this case
>>
>> /* if btl/self is excluded, the app will not start and it is trivial
>> to append to the error message a note asking to ensure btl/self was
>> not excluded.
>> in this case, i do not think we have a mechanism to issue a warning
>> message (e.g. "ensure lo is excluded") when hiccups occur.
>> */
>>
>> Cheers,
>>
>> Gilles
>>
>> On Thu, Sep 22, 2016 at 9:54 AM, George Bosilca <bosi...@icl.utk.edu>
>> wrote:
>> > On Wednesday, September 21, 2016, Gilles Gouaillardet
>> > <gilles.gouaillar...@gmail.com> wrote:
>> >>
>> >> George,
>> >>
>> >> let's consider the case where "lo" is *not* excluded via the
>> >> btl_tcp_if_exclude MCA param
>> >> (if i understand correctly, the following is also true if "lo" is
>> >> included via the btl_tcp_if_include MCA param)
>> >>
>> >> currently, and because of/thanks to the test that is done "deep inside"
>> >> 1) on a disconnected laptop, mpirun --mca btl tcp,self ... fails with
>> >> 2 tasks or more because tasks cannot reach each other
>> >> 2) on a (connected) cluster, "lo" is never used and mpirun --mca btl
>> >> tcp,self ... does not hang when tasks are running on two nodes or more
>> >>
>> >> with your proposal :
>> >> 3) on a disconnected laptop, mpirun --mca btl tcp,self ... works with
>> >> any number of tasks, because "lo" is used by btl/tcp
>> >> 4) on a (connected) cluster, "lo" is used and mpirun --mca btl
>> >> tcp,self ... will very likely hang when tasks are running on two nodes
>> >> or more
>> >>
>> >> am i right so far ?
>> >
>> >
>> > No, you are missing the fact that thanks to our if_exclude (which
>> > contains by default 127.0.0.0/24) we will never use lo (not even with
>> > my patch). Thus, local interfaces will remain out of reach for most
>> > users, with the exception of those that manually force the inclusion
>> > of lo via if_include.
>> >
>> > On a cluster where a user explicitly enables lo, there will be some
>> > hiccups during startup. However, as Paul states we explicitly
>> > discourage people from doing that in the README.
>> > Second, the connection over lo will eventually time out, lo will be
>> > dropped, and all pending communications will be redirected through
>> > another TCP interface.
>> >
>> > Cheers,
>> > George.
>> >
>> >
>> >>
>> >> my concern is 4)
>> >> as Paul pointed out, we can consider this is not an issue since this
>> >> is a user/admin mistake, and we do not care whether this is an honest
>> >> one or not. that being said, this is not very friendly since something
>> >> that is working fine today will (likely) start hanging when your patch
>> >> is merged.
>> >>
>> >> my suggestion differs since it is basically 2) and 3), which can be
>> >> seen as the best of both worlds
>> >>
>> >> makes sense ?
>> >>
>> >> as a side note, there were some discussions about automatically adding
>> >> the self btl,
>> >> and even offering a user friendly alternative to --mca btl xxx
>> >> (for example --networks shm,infiniband. today Open MPI does not
>> >> provide any alternative to btl/self. also infiniband can be used via
>> >> btl/openib, mtl/mxm or libfabric, which makes it painful to
>> >> blacklist). i cannot remember the outcome of the discussion (if any).
>> >>
>> >> Cheers,
>> >>
>> >> Gilles
>> >>
>> >> On Thu, Sep 22, 2016 at 4:57 AM, George Bosilca <bosi...@icl.utk.edu>
>> >> wrote:
>> >> > Gilles,
>> >> >
>> >> > I don't understand how your proposal is any different than what we
>> >> > have today. I quote "If [locality flag is set], then we could keep a
>> >> > hard coded test so 127.x.y.z address (and IPv6 equivalent) are never
>> >> > used (even if included or not excluded) for inter node
>> >> > communication". We already have a hardcoded test to prevent
>> >> > 127.x.y.z addresses from being used.
>> >> > In fact we have 2 tests, one because this address range is part of
>> >> > our default if_exclude, and then a second test (that only does
>> >> > something useful in case you manually added lo* to if_include) deep
>> >> > inside the IP matching logic.
>> >> >
>> >> > George.
>> >> >
>> >> >
>> >> > On Wed, Sep 21, 2016 at 12:36 PM, Gilles Gouaillardet
>> >> > <gilles.gouaillar...@gmail.com> wrote:
>> >> >>
>> >> >> George,
>> >> >>
>> >> >> i got that, and i consider my suggestion as an improvement to your
>> >> >> proposal.
>> >> >>
>> >> >> if i want to exclude ib0, i might want to
>> >> >> mpirun --mca btl_tcp_if_exclude ib0 ...
>> >> >>
>> >> >> to me, this is an honest mistake, but with your proposal, i would be
>> >> >> screwed when
>> >> >> running on more than one node because i should have
>> >> >> mpirun --mca btl_tcp_if_exclude ib0,lo ...
>> >> >>
>> >> >> and if this parameter is set by the admin in the system-wide config,
>> >> >> then this configuration must be adapted by the admin, and that could
>> >> >> generate some confusion.
>> >> >>
>> >> >> my suggestion simply adds a "safety net" to your proposal
>> >> >>
>> >> >> for the sake of completeness, i do not really care whether there
>> >> >> should be a safety net or not if localhost is explicitly included
>> >> >> via the btl_tcp_if_include MCA parameter
>> >> >>
>> >> >> a different and safe/friendly proposal is to add a new
>> >> >> btl_tcp_if_exclude_localhost MCA param, which is true by default, so
>> >> >> you would simply force it to false if you want to MPI_Comm_spawn or
>> >> >> use the tcp btl on your disconnected laptop.
>> >> >>
>> >> >> as a side note, this reminds me that the openib/btl is used by
>> >> >> default for intra node communication between two tasks from
>> >> >> different jobs (neither sm nor vader can be used yet, and btl/openib
>> >> >> has a higher exclusivity than btl/tcp). my first impression is that
>> >> >> i am not so comfortable with that, and we could add yet another MCA
>> >> >> parameter so btl/openib disqualifies itself for intra node
>> >> >> communications.
>> >> >>
>> >> >>
>> >> >> Cheers,
>> >> >>
>> >> >> Gilles
>> >> >>
>> >> >> On Thu, Sep 22, 2016 at 12:56 AM, George Bosilca
>> >> >> <bosi...@icl.utk.edu> wrote:
>> >> >> > My proposal is not about adding new ways of deciding what is local
>> >> >> > and what not. I proposed to use the corresponding MCA parameters
>> >> >> > to allow the user to decide. More specifically, I want to be able
>> >> >> > to change the exclude and include MCA to enable TCP over local
>> >> >> > addresses.
>> >> >> >
>> >> >> > George
>> >> >> >
>> >> >> >
>> >> >> > On Sep 21, 2016 4:32 PM, "Gilles Gouaillardet"
>> >> >> > <gilles.gouaillar...@gmail.com> wrote:
>> >> >> >>
>> >> >> >> George,
>> >> >> >>
>> >> >> >> Is proc locality already set at that time ?
>> >> >> >>
>> >> >> >> If yes, then we could keep a hard coded test so 127.x.y.z address
>> >> >> >> (and IPv6 equivalent) are never used (even if included or not
>> >> >> >> excluded) for inter node communication
>> >> >> >>
>> >> >> >> Cheers,
>> >> >> >>
>> >> >> >> Gilles
>> >> >> >>
>> >> >> >> "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote:
>> >> >> >> >On Sep 21, 2016, at 10:56 AM, George Bosilca
>> >> >> >> > <bosi...@icl.utk.edu> wrote:
>> >> >> >> >>
>> >> >> >> >> No, because 127.x.x.x is by default part of the exclude, so it
>> >> >> >> >> will never get into the modex. The problem today is that even
>> >> >> >> >> if you manually remove it from the exclude and add it to the
>> >> >> >> >> include, it will not work, because of the hardcoded checks.
>> >> >> >> >> Once we remove those checks, things will work the way we
>> >> >> >> >> expect: interfaces are removed because they don't match the
>> >> >> >> >> provided addresses.
>> >> >> >> >
>> >> >> >> >Gotcha.
>> >> >> >> >
>> >> >> >> >> I would have agreed with you if the current code was making a
>> >> >> >> >> better decision of what is local and what not. But it is not,
>> >> >> >> >> it simply removes all 127.x.x.x interfaces
>> >> >> >> >> (opal/util/net.c:222). Thus, the only thing the current code
>> >> >> >> >> does is prevent a power-user from using the loopback (despite
>> >> >> >> >> being explicitly enabled via the corresponding MCA parameters).
>> >> >> >> >
>> >> >> >> >Fair enough.
>> >> >> >> >
>> >> >> >> >Should we have a keyword that can be used in the
>> >> >> >> > btl_tcp_if_include/exclude (e.g., "local") that removes all
>> >> >> >> > local-only interfaces? I.E., all 127.x.x.x/8 interfaces *and*
>> >> >> >> > all local-only interfaces (e.g., bridging interfaces to local
>> >> >> >> > VMs and the like)?
>> >> >> >> >
>> >> >> >> >We could then replace the default "127.0.0.0/8" value in
>> >> >> >> > btl_tcp_if_exclude with this token, and therefore actually
>> >> >> >> > exclude the VM-only interfaces (which have caused some users
>> >> >> >> > problems in the past).
>> >> >> >> >
>> >> >> >> >--
>> >> >> >> >Jeff Squyres
>> >> >> >> >jsquy...@cisco.com
>> >> >> >> >For corporate legal information go to:
>> >> >> >> > http://www.cisco.com/web/about/doing_business/legal/cri/
>> >> >> >> >
>> >> >> >> >_______________________________________________
>> >> >> >> >devel mailing list
>> >> >> >> >devel@lists.open-mpi.org
>> >> >> >> >https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
_______________________________________________ devel mailing list devel@lists.open-mpi.org https://rfd.newmexicoconsortium.org/mailman/listinfo/devel