Yeah, that's the quandary: I can see both use cases.

That's why I proposed the "nowarn:" syntax that George hated.  :-)

Got any other suggestion on how to handle both use cases?
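
To make the two use cases concrete, here is a minimal sketch of how a
"nowarn:" prefix might behave. It is only an illustration: the function
name check_excludes, the warning text, and the exact prefix semantics are
assumptions, not what the TCP BTL actually does.

```sh
#!/bin/sh
# Hypothetical sketch: check_excludes is NOT an Open MPI function, and the
# "nowarn:" semantics shown here are assumed for illustration only.
#
# $1: comma-separated exclude list, e.g. "lo,nowarn:ib0,eth9"
# $2: space-separated names of the interfaces present on this node
#
# Prints one warning line per listed interface that is absent on this node,
# unless that entry carries the "nowarn:" prefix.
check_excludes() {
    excludes=$1
    present=$2
    # Interface names contain no spaces, so turning commas into spaces
    # lets the shell split the list for us.
    for entry in $(printf '%s' "$excludes" | tr ',' ' '); do
        warn=yes
        case $entry in
            nowarn:*) warn=no; name=${entry#nowarn:} ;;
            *)        name=$entry ;;
        esac
        found=no
        for iface in $present; do
            [ "$iface" = "$name" ] && found=yes
        done
        if [ "$found" = no ] && [ "$warn" = yes ]; then
            echo "warning: excluded interface '$name' does not exist on this node"
        fi
    done
}
```

With something like this, a mistyped interface name still produces a
warning, while a device that exists on only some nodes can be listed as
"nowarn:<name>" and stays quiet where it is absent. On Linux the second
argument could plausibly be built from the contents of /sys/class/net.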



On Feb 5, 2013, at 7:25 AM, "Barrett, Brian W" <bwba...@sandia.gov> wrote:

> I guess I can see that, but I have the opposite use case; I have a device
> on some nodes and not others that I want to ignore, so I set
> btl_tcp_if_exclude to include that device.  It would be totally
> counter-intuitive to have a giant warning because of that.
> 
> Brian
> 
> On 2/5/13 6:46 AM, "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote:
> 
>> I had a typo in my btl_tcp_if_exclude such that it was effectively
>> 
>> mpirun --mca btl_tco_if_exclude bogus ...
>> 
>> instead of ignoring the actual interface I wanted to ignore.  And since I
>> wasn't ignoring the special loopback device that I have on some machines,
>> every single MPI job hung because the processes tried to use those
>> interfaces to communicate with processes on other nodes that those
>> interfaces could not reach.
>> 
>> 
>> 
>> On Feb 4, 2013, at 5:56 PM, "Barrett, Brian W" <bwba...@sandia.gov> wrote:
>> 
>>> I'm confused; why is it disastrous to have an interface in if_exclude
>>> that doesn't exist?  I can see it being a problem if we don't exclude
>>> something in the list, but the other way is (in my opinion) harmless but
>>> with a useful use case...
>>> 
>>> Brian
>>> 
>>> -----Original Message-----
>>> From:       Jeff Squyres (jsquyres) [mailto:jsquy...@cisco.com]
>>> Sent:       Monday, February 04, 2013 06:47 PM Mountain Standard Time
>>> To: Open MPI Developers
>>> Subject:    [EXTERNAL] Re: [OMPI devel] [OMPI svn] svn:open-mpi r28016 -
>>> trunk/ompi/mca/btl/tcp
>>> 
>>> On Feb 4, 2013, at 2:03 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>> 
>>>> The two behaviors you describe for include and exclude do not look
>>>> conflicting to me. Inclusion is a strong request: the user enforces the
>>>> use of a specific interface. If the interface is not available, then
>>>> we have a problem. Exclude, on the other hand, must ensure that a
>>>> specific interface is not in use, which is trivially satisfied if the
>>>> interface is not available.
>>> 
>>> I still maintain that it's equally disastrous if you don't exclude the
>>> correct interfaces (I lost 2 nights of MTT because of this!).
>>> 
>>>> I'm not a fan of the nowarn option. Seems like a lot of code with
>>>> limited interest, especially if we only plan to support it in TCP.
>>> 
>>> This is a good point -- I wonder what openib (and others that support
>>> the *_if_include and *_if_exclude notation) do.  Do they warn / error if
>>> you specify an invalid interface?
>>> 
>>>> If you need specialized arguments for some of your nodes, here is what
>>>> I do: rename the binaries to .orig, and use the original name to create
>>>> a sh script that will change the value of mca_param_files to something
>>>> based on the host name (if such a file exists) and then call the .orig
>>>> executable. Works like a charm, even when a batch scheduler is used.
>>> 
>>> That will still be quite difficult to do in MTT.  Remember: all the
>>> tests that are run in MTT are shared across all of us via the ompi-tests
>>> SVN repo.  Are you suggesting that I alias every test in the ompi-tests
>>> SVN with a public script that you should run that should look for some
>>> site-specific MCA override param file?
>>> 
>>> -- 
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>> 
>>> 
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> 
>> 
>> 
>> 
>> 
> 
> 
> --
>  Brian W. Barrett
>  Scalable System Software Group
>  Sandia National Laboratories
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

