I guess I can see that, but I have the opposite use case; I have a device
on some nodes and not others that I want to ignore, so I set
btl_tcp_if_exclude to include that device.  It would be totally
counter-intuitive to have a giant warning because of that.

Brian

On 2/5/13 6:46 AM, "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote:

>I had a typo in my btl_tcp_if_exclude such that it was effectively
>
>  mpirun --mca btl_tcp_if_exclude bogus ...
>
>instead of ignoring the actual interface I wanted to ignore.  And since I
>wasn't ignoring the special loopback device that I have on some machines,
>every single MPI job hung because the processes tried to use that device
>to communicate with processes on other nodes that it could not reach.
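
A minimal sketch of a pre-flight check for this kind of typo, assuming Linux (where interfaces appear as entries under /sys/class/net). The helper name `absent_ifaces` is hypothetical and this is not Open MPI code; it only lists the entries of a comma-separated exclude list that match no interface on the local host.

```shell
#!/bin/sh
# Hypothetical helper, not part of Open MPI.
# usage: absent_ifaces LIST [NETDIR]
#   LIST   comma-separated interface names (as given to an *_if_exclude param)
#   NETDIR directory with one entry per interface (default: /sys/class/net,
#          which is a Linux-specific assumption)
# The body runs in a subshell, so the IFS and "set -f" changes stay local.
absent_ifaces() (
    set -f                      # do not glob-expand list entries
    IFS=,
    for name in $1; do
        [ -n "$name" ] || continue          # skip empty entries such as "a,,b"
        [ -e "${2:-/sys/class/net}/$name" ] || printf '%s\n' "$name"
    done
)
```

On a Linux host with no interface named "bogus", `absent_ifaces lo,bogus` should print `bogus`. On nodes that legitimately lack an excluded device the entry is reported as well, so the output is better treated as information than as an error.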
>
>
>
>On Feb 4, 2013, at 5:56 PM, "Barrett, Brian W" <bwba...@sandia.gov> wrote:
>
>> I'm confused; why is it disastrous to have an interface in if_exclude
>>that doesn't exist?  I can see it being a problem if we don't exclude
>>something in the list, but the other way is (in my opinion) harmless but
>>with a useful use case...
>> 
>> Brian
>> 
>> 
>> 
>> -----Original Message-----
>> From:        Jeff Squyres (jsquyres) [mailto:jsquy...@cisco.com]
>> Sent:        Monday, February 04, 2013 06:47 PM Mountain Standard Time
>> To:  Open MPI Developers
>> Subject:     [EXTERNAL] Re: [OMPI devel] [OMPI svn] svn:open-mpi r28016 -
>>trunk/ompi/mca/btl/tcp
>> 
>> On Feb 4, 2013, at 2:03 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>> 
>>> The two behaviors you describe for include and exclude do not look
>>>conflicting to me. Inclusion is a strong request: the user enforces the
>>>usage of a specific interface. If the interface is not available, then
>>>we have a problem. Exclude, on the other hand, must enforce that a
>>>specific interface is not in use, which is trivially satisfied if the
>>>interface is not available.
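
The two policies described above can be modelled in a few lines of shell. This is a sketch under stated assumptions, not how the TCP BTL is implemented: `select_ifaces` is a hypothetical name, and the set of available interfaces is passed in as an argument rather than queried from the system.

```shell
#!/bin/sh
# Hypothetical model of the include/exclude policies; not Open MPI code.
# usage: select_ifaces include|exclude LIST AVAILABLE
#   LIST      comma-separated names requested by the user
#   AVAILABLE space-separated names present on this host
# Prints the interfaces that would be used, one per line.
select_ifaces() (
    mode=$1 list=$2 avail=" $3 "
    set -f
    case $mode in
    include)
        # Strict: a requested interface that is missing is an error.
        IFS=,
        for name in $list; do
            [ -n "$name" ] || continue
            case $avail in
            *" $name "*) printf '%s\n' "$name" ;;
            *) echo "error: included interface '$name' not found" >&2
               exit 1 ;;
            esac
        done ;;
    exclude)
        # Lenient: an excluded name that does not exist changes nothing.
        for name in $avail; do
            case ",$list," in
            *",$name,"*) ;;                    # excluded, skip it
            *) printf '%s\n' "$name" ;;
            esac
        done ;;
    *)
        echo "usage: select_ifaces include|exclude LIST AVAILABLE" >&2
        exit 2 ;;
    esac
)
```

For example, `select_ifaces exclude lo,bogus "lo eth0"` leaves only eth0 and succeeds, while `select_ifaces include eth0,bogus "lo eth0"` returns a non-zero status.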
>> 
>> I still maintain that it's equally disastrous if you don't exclude the
>>correct interfaces (I lost 2 nights of MTT because of this!).
>> 
>>> I'm not a fan of the nowarn option. Seems like a lot of code with
>>>limited interest, especially if we only plan to support it in TCP.
>> 
>> This is a good point -- I wonder what openib (and any other components
>>that support the *_if_include and *_if_exclude notation) do.  Do they
>>warn / error if you specify an invalid interface?
>> 
>>> If you need specialized arguments for some of your nodes here is what
>>>I do: rename the binaries to .orig, and use the original name to create
>>>a sh script that will change the value of mca_param_files to something
>>>based on the host name (if such a file exists) and then call the .orig
>>>executable. Works like a charm, even when a batch scheduler is used.
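
The wrapper described above could look roughly like the following. This is a guess at its shape, not George's actual script: the function name `wrap_exec`, the per-host directory, and the `<host>.conf` naming are all hypothetical, and it assumes the parameter can be set through Open MPI's OMPI_MCA_<param> environment variable convention.

```shell
#!/bin/sh
# Sketch of the wrapper idea; every path and name here is an assumption.
# Intended to be installed under the binary's original name, next to the
# renamed "<name>.orig" executable.

# usage: wrap_exec WRAPPER_PATH CONF_DIR HOST [ARGS...]
# If CONF_DIR/HOST.conf exists, point Open MPI at it through the environment,
# then replace the current process with WRAPPER_PATH.orig.
wrap_exec() {
    self=$1 conf_dir=$2 host=$3
    shift 3
    if [ -f "$conf_dir/$host.conf" ]; then
        # Environment form of the mca_param_files parameter (assumed).
        OMPI_MCA_mca_param_files=$conf_dir/$host.conf
        export OMPI_MCA_mca_param_files
    fi
    exec "$self.orig" "$@"
}

# Last line of a real wrapper (commented out so the sketch can be sourced);
# /etc/openmpi/hosts is a made-up location:
# wrap_exec "$0" /etc/openmpi/hosts "$(hostname)" "$@"
```

Setting the parameter this way presumably replaces the default list of parameter files rather than adding to it, so a real wrapper may need to append the default files as well.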
>> 
>> That will still be quite difficult to do in MTT.  Remember: all the
>>tests that are run in MTT are shared across all of us via the ompi-tests
>>SVN repo.  Are you suggesting that I alias every test in the ompi-tests
>>SVN repo with a public script that everyone runs and that looks for some
>>site-specific MCA override param file?
>> 
>> -- 
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>>http://www.cisco.com/web/about/doing_business/legal/cri/
>> 
>> 
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
>> 
>
>
>-- 
>Jeff Squyres
>jsquy...@cisco.com
>
>
>
>


--
  Brian W. Barrett
  Scalable System Software Group
  Sandia National Laboratories


