This is an open question to OMPI developers...

It looks like RHEL (and maybe others?) adds the "virbr0" IP interface when Xen 
is activated.  This IP interface is only used to communicate with the local Xen 
instance(s); it is not used to communicate over the real network.  

In a case that I saw, the interface is created, set to "up", and is given an IP 
address in the 192.168.1.x range.  This was done by default -- all the user had 
done was either say "yes, I want Xen enabled", or he didn't say he wanted it 
*disabled* (I'm not sure which).

This causes a problem if you have Xen enabled on multiple machines in an OMPI 
job.  OMPI will see the 192.168.1.x address and see that it's "up", so it'll 
add it to the eligible subnets that can be used.  When OMPI sees that its peer 
processes also have 192.168.1.x, it'll try to use that network for OOB/BTL 
traffic -- which will fail, because these are local-only interfaces.
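
To make the failure mode concrete, here is a minimal sketch in Python.  It is 
not OMPI's actual selection code; the host tables, the /24 prefix lengths, and 
the eligible_subnets() helper are all made up for illustration.  It only models 
the rule described above: every interface that is "up" and not excluded 
contributes its subnet, and peers look for a subnet they have in common.

```python
import ipaddress

def eligible_subnets(interfaces, exclude=("lo",)):
    """Return the subnets of all interfaces that are up and not excluded.

    `interfaces` maps an interface name to (address/prefix, is_up).
    This is a hypothetical model, not OMPI's real data structure.
    """
    subnets = set()
    for name, (cidr, is_up) in interfaces.items():
        if is_up and name not in exclude:
            subnets.add(ipaddress.ip_interface(cidr).network)
    return subnets

# Two hypothetical hosts; each has a real NIC plus a local-only virbr0.
# The /24 prefix on virbr0 is an assumption.
host_a = {
    "lo": ("127.0.0.1/8", True),
    "eth0": ("10.0.0.1/24", True),
    "virbr0": ("192.168.1.1/24", True),
}
host_b = {
    "lo": ("127.0.0.1/8", True),
    "eth0": ("10.0.0.2/24", True),
    "virbr0": ("192.168.1.1/24", True),
}

# Excluding only "lo": the virbr0 subnet looks like a shared network.
shared_default = eligible_subnets(host_a) & eligible_subnets(host_b)

# With virbr0 also excluded, only the real network remains.
exclude = ("lo", "virbr0")
shared_fixed = (eligible_subnets(host_a, exclude)
                & eligible_subnets(host_b, exclude))

print(sorted(str(n) for n in shared_default))
print(sorted(str(n) for n in shared_fixed))
```

In the first case 192.168.1.0/24 shows up as a subnet both peers appear to 
share, even though neither virbr0 can reach the other machine.  Note that the 
exclude list in the second case still names "lo": a user who sets 
[btl|oob]_tcp_if_exclude by hand replaces the default value rather than 
appending to it, so the loopback entry has to be repeated.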

Should we add "virbr0" to the default value for [btl|oob]_tcp_if_exclude?  

Or is there another way to detect that an interface is local-only and should 
not be used for OOB/BTL communication?

See this post on the users list:

    http://www.open-mpi.org/community/lists/users/2012/02/18432.php

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

