Hi Olivier and list

I presume you are talking about Ethernet or GigE.
The basic information on how to launch jobs is on the OpenMPI FAQ pages:

http://www.open-mpi.org/faq/?category=tcp
http://www.open-mpi.org/faq/?category=tcp#tcp-selection

Here is what I did on our toy/test cluster made of salvaged computers.

1) I use ROCKS cluster, which makes some steps more automatic than described below.
However, ROCKS is not needed for this.

2) I actually have three private networks, but you may use, say, two,
if your motherboards have dual Ethernet (or GigE) ports.
Each node has three NICs, which Linux recognized and activated as eth0, eth1, and eth2.

Make sure you and Linux agree on which port is eth0, eth1, etc.
This may be a bit tricky, as the kernel has its own wisdom (and mood) when it assigns the port names.
Ping, lspci, ifconfig, ifup, ifdown, and ethtool are your friends here, and can help you
sort out the correct port-name map.
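
For instance, something like this helped me match physical ports to names
(a rough sketch; the device names and addresses are just examples and will differ on your machines):

[node1] $ lspci | grep -i ethernet    # list the NICs the kernel sees
[node1] $ ifconfig -a                 # show eth0/eth1/eth2 and their MAC addresses
[node1] $ ethtool -p eth1 10          # blink the LED of the port the kernel calls eth1 (if the driver supports it)
[node1] $ ping -I eth1 192.168.2.2    # once configured (step 4), ping a neighbor through a specific port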

3) For a modest number of nodes, say fewer than eight, you can buy inexpensive SOHO-type GigE switches,
one for each network, for about $50 apiece. (This is what I did.)
For more nodes you would need larger switches.
Use Cat5e or Cat6 Ethernet cables and connect the separate networks using the correct ports on the
nodes and switches.
Well, you may have done that already ...

4) On RHEL or Fedora the essential information is in /etc/sysconfig/network-scripts/ifcfg-eth[0,1,2]
on each of your cluster nodes.
Other Linux distributions may have equivalent files.
You need to edit these files to insert the correct IP address, netmask, and MAC address.

For instance, if you have at most 254 nodes, you can define private networks like this:
net1) 192.168.1.0 netmask 255.255.255.0  (using the eth0 port)
net2) 192.168.2.0 netmask 255.255.255.0 (using the eth1 port)
net3) 192.168.3.0 netmask 255.255.255.0 (using the eth2 port)
etc.
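
A side note: ROCKS keeps /etc/hosts in sync across the nodes for me.
If you configure things by hand and have no DNS on the private networks, each node should be able
to resolve the other nodes' names (ssh and mpiexec hostfiles need this).
A minimal /etc/hosts sketch, assuming hypothetical node names node1, node2, ... mapped to the eth0 addresses:

192.168.1.1   node1
192.168.1.2   node2
192.168.1.3   node3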

Here is an example of the ifcfg files on node1:

[node1] $ cat /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE=eth0
HWADDR=(put your eth0 port MAC address here)
IPADDR=192.168.1.1   ( ... 192.168.1.2 on node2, etc)
NETMASK=255.255.255.0
BOOTPROTO=none
ONBOOT=yes

[node1] $ cat /etc/sysconfig/network-scripts/ifcfg-eth1

DEVICE=eth1
HWADDR=(put your eth1 port MAC address here)
IPADDR=192.168.2.1 ( ... 192.168.2.2 on node2, etc)
NETMASK=255.255.255.0
BOOTPROTO=none
ONBOOT=yes
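
After editing the ifcfg files, restart the network service (or reboot) so the new addresses take effect,
and check that the nodes can reach each other on each subnet.
Roughly like this (RHEL/Fedora style; node2 assumed to be at 192.168.1.2 and 192.168.2.2):

[node1] $ /sbin/service network restart
[node1] $ ping -c 3 192.168.1.2    # node2 over the eth0 network
[node1] $ ping -c 3 192.168.2.2    # node2 over the eth1 network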


5) To launch the OpenMPI program "my_prog"
over the 192.168.2.0 (i.e. "eth1") network with, say, 8 processes, do:

mpiexec --mca btl_tcp_if_include eth1 -n 8 my_prog

(Good if your 192.168.1.0 (eth0) network is already used for I/O, control, etc.)

To be more aggressive and use both networks,
192.168.1.0 ("eth0") and 192.168.2.0 ("eth1"), do:

mpiexec --mca btl_tcp_if_include eth0,eth1 -n 8 my_prog
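
If you always want this behavior, rather than typing --mca on every run, you can (as far as I recall)
put the equivalent MCA parameter in a file that OpenMPI reads at startup, for instance in your home directory:

[node1] $ cat ~/.openmpi/mca-params.conf
btl_tcp_if_include = eth0,eth1

or set the environment variable OMPI_MCA_btl_tcp_if_include=eth0,eth1 before calling mpiexec.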

***

Works for me.
I hope it helps!

Gus Correa
PS - More answers below.

--
---------------------------------------------------------------------
Gustavo J. Ponce Correa, PhD - Email: g...@ldeo.columbia.edu
Lamont-Doherty Earth Observatory - Columbia University
P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------


Olivier Marsden wrote:

> Hello,
> I am configuring a cluster with multiple nics for use with open mpi.
> I have not found very much information on the best way of setting up
> my network for open mpi. At the moment I have a pretty standard setup
> with a single hostname and single ip address for each node.
> Could someone advise me on the following points?
> - for each node, should I have the second ip on the same subnet as the first, or not?

No, use separate subnets.


> - does openmpi need separate hostnames for each ip?

No, same hostname, but different subnets and different IPs for each port on a given host.


> If there is a webpage describing how to configure such a network for the best, that
> would be great.

Yes, to some extent.
Look at the OpenMPI FAQ:
http://www.open-mpi.org/faq/?category=tcp
http://www.open-mpi.org/faq/?category=tcp#tcp-selection

> Many thanks,
>
> Olivier Marsden