I guess I can see that, but I have the opposite use case; I have a device on some nodes and not others that I want to ignore, so I set btl_tcp_if_exclude to include that device. It would be totally counter-intuitive to have a giant warning because of that.
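To make the asymmetry concrete, here is a minimal sketch of the selection semantics being argued over. It is not the TCP BTL code: `select_interfaces` and the interface names (`special0`, `eth0`) are made up for illustration. It assumes an include list is a strict request, so a missing interface is an error, and an exclude list is a filter, so a name that is absent on a node is trivially satisfied.

```python
def select_interfaces(available, include=None, exclude=None):
    """Return the interfaces one node would use.

    Hypothetical helper for illustration only; not Open MPI code.
    """
    if include:
        # Inclusion is a strong request: every named interface must exist.
        missing = [name for name in include if name not in available]
        if missing:
            raise LookupError(f"requested interface(s) not found: {missing}")
        return [name for name in available if name in include]
    # Exclusion only filters: a name that does not exist here is already unused.
    excluded = set(exclude or [])
    return [name for name in available if name not in excluded]


# Same exclude list on every node, although only some nodes have special0.
node_with_device = ["lo", "eth0", "special0"]
node_without_device = ["lo", "eth0"]
exclude = ["lo", "special0"]

print(select_interfaces(node_with_device, exclude=exclude))
print(select_interfaces(node_without_device, exclude=exclude))

# The failure mode from the quoted message: a mistyped exclude matches
# nothing, so special0 silently stays in the list.
print(select_interfaces(node_with_device, exclude=["bogus"]))
```

Under these assumptions both nodes end up with only `eth0`, and no warning is needed on the node that lacks the device. The last call shows the other side of the argument: a bogus exclude entry is accepted just as quietly.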
Brian

On 2/5/13 6:46 AM, "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote:

>I had a typo in my btl_tcp_if_exclude such that it was effectively
>
>    mpirun --mca btl_tco_if_exclude bogus ...
>
>instead of ignoring the actual interface I wanted to ignore. And since I
>wasn't ignoring the special loopback device that I have on some machines,
>every single MPI job hung because they tried to use those interfaces to
>communicate with processes on other nodes that that interface could not
>reach.
>
>
>On Feb 4, 2013, at 5:56 PM, "Barrett, Brian W" <bwba...@sandia.gov> wrote:
>
>> I'm confused; why is it disastrous to have an interface in if_exclude
>>that doesn't exist? I can see it being a problem if we don't exclude
>>something in the list, but the other way is (in my opinion) harmless but
>>with a useful use case...
>>
>> Brian
>>
>> Sent with Good (www.good.com)
>>
>>
>> -----Original Message-----
>> From: Jeff Squyres (jsquyres) [mailto:jsquy...@cisco.com]
>> Sent: Monday, February 04, 2013 06:47 PM Mountain Standard Time
>> To: Open MPI Developers
>> Subject: [EXTERNAL] Re: [OMPI devel] [OMPI svn] svn:open-mpi r28016 -
>>trunk/ompi/mca/btl/tcp
>>
>> On Feb 4, 2013, at 2:03 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>
>>> The two behaviors you describe for include and exclude do not look
>>>conflicting to me. Inclusion is a strong request, the user enforce the
>>>usage of a specific interface. If the interface is not available, then
>>>we have a problem. Exclude on the other side, must enforce that a
>>>specific interface is not in use, fact that can be quite simple if the
>>>interface is not available.
>>
>> I still maintain that it's equally disastrous if you don't exclude the
>>correct interfaces (I lost 2 nights of MTT because of this!).
>>
>>> I'm not a fan of the nowarn option. Seems like a lot of code with
>>>limited interest, especially if we only plan to support it in TCP.
>>
>> This is a good point -- I wonder what openib (and others?) do who
>>support *_if_include and *_if_exclude notation. Do they warn / error if
>>you specify an invalid interface?
>>
>>> If you need specialized arguments for some of your nodes here is what
>>>I do: rename the binaries to .orig, and use the original name to create
>>>a sh script that will change the value of mca_param_files to something
>>>based on the host name (if such a file exists) and then call the .orig
>>>executable. Works like a charm., even when a batch scheduler is used.
>>
>> That will still be quite difficult to do in MTT. Remember: all the
>>tests that are run in MTT are shared across all of us via the ompi-tests
>>SVN repo. Are you suggesting that I alias every test in the ompi-tests
>>SVN with a public script that you should run that should look for some
>>site-specific MCA override param file?
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>>http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
Brian W. Barrett
Scalable System Software Group
Sandia National Laboratories