On Feb 12, 2007, at 12:54 PM, Matteo Guglielmi wrote:

This is the ifconfig output from the machine I use to submit the
parallel job:

It looks like both of your nodes share an IP address:

[root@lcbcpc02 ~]# ifconfig
eth1      Link encap:Ethernet  HWaddr 00:15:17:10:53:C9
inet addr:192.168.0.1 Bcast:192.168.0.255 Mask:255.255.255.0
[root@lcbcpc04 ~]# ifconfig
eth1      Link encap:Ethernet  HWaddr 00:15:17:10:53:75
inet addr:192.168.0.1 Bcast:192.168.0.255 Mask:255.255.255.0

This will be problematic for more than just Open MPI if these two interfaces are on the same network. The solution is to ensure that all of your nodes have unique IP addresses.
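For example, a quick way to give the second node a distinct address on the same subnet (assuming eth1 and the 192.168.0.0/24 network shown in your output; 192.168.0.2 is just a placeholder -- use any unused address) might look like:

```shell
# On lcbcpc04: reassign eth1 so it no longer collides with lcbcpc02.
# 192.168.0.2 is an example address -- pick any unused one on the subnet.
ifconfig eth1 192.168.0.2 netmask 255.255.255.0
```

Note that this change is not persistent across reboots; you'd also want to update your distro's network configuration files (e.g., ifcfg-eth1 on Red Hat-style systems) so the unique address sticks.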

If these NICs are on different networks, then it's a valid network configuration, but Open MPI (by default) will assume that they are routable to each other. You can tell Open MPI to not use eth1 in this case -- see these FAQ entries for details:

  http://www.open-mpi.org/faq/?category=tcp#tcp-multi-network
  http://www.open-mpi.org/faq/?category=tcp#tcp-selection
  http://www.open-mpi.org/faq/?category=tcp#tcp-routability
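Concretely, excluding eth1 can be done with the btl_tcp_if_exclude MCA parameter on the mpirun command line (the hostfile name and application name below are just placeholders):

```shell
# Tell Open MPI's TCP BTL to skip eth1.  Note: setting this parameter
# replaces the default exclusion list, so re-list "lo" explicitly or
# Open MPI will try to use the loopback interface as well.
mpirun --mca btl_tcp_if_exclude lo,eth1 -np 2 --hostfile myhosts ./my_mpi_app
```

Alternatively, btl_tcp_if_include lets you name only the interface(s) you *do* want Open MPI to use.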

--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
