Re: [OMPI devel] [GE users] OpenMPI 1.2 integration and dedicated MPI networks

Orion Poplawski Thu, 2 Nov 2006 15:38:42 -0500

Pak Lui wrote:

Orion Poplawski wrote:
In our setup (which I don't believe is very unique) the nodes are connected by two networks: an "admin" network which allows for connections from outside the cluster and an "MPI" network that is a private GigE network connecting the nodes for MPI traffic:
       +---------admin net (192.168.0.X)--------+
       |                           |            |
+-----------+                 +--------+    +--------+
| SGE Master|                 | coop00 |    | coop01 |
|           |                 | coop00x|    | coop01x|
+-----------+                 +--------+    +--------+
                                   |            |
                                   +------------+

                                    MPI net (192.168.1.X)

So the "x" suffix names are the addresses on the MPI network.

Currently (loose integration), we create machines files like:

coop00x.cora.nwra.com cpu=2
coop01x.cora.nwra.com cpu=2
which makes the MPI traffic travel over the MPI network. I'm trying to duplicate this under "tight" integration.

Well, this is what we did with LAM and I naively assumed that since OpenMPI used that same machines file format, it worked the same there. But once I finally read the FAQ (specifically: <http://www.open-mpi.org/faq/?category=tcp#tcp-selection>) I see that it works totally differently.


So, default setup with gridengine integration works, and I just have:

btl_tcp_if_include = eth1

in my /etc/openmpi-mca-params.conf file.
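The same MCA parameter can also be set per job instead of system-wide. A minimal sketch, assuming eth1 is the interface on the MPI network as above; the process count and the program name ./my_mpi_app are placeholders, not part of the original setup:

```shell
# Open MPI reads MCA parameters from OMPI_MCA_<name> environment variables,
# so this has the same effect as the line in openmpi-mca-params.conf,
# but only for jobs launched from this shell.
export OMPI_MCA_btl_tcp_if_include=eth1

# Alternatively, pass the parameter on the mpirun command line
# (left commented out; -np 4 and ./my_mpi_app are placeholders):
#   mpirun --mca btl_tcp_if_include eth1 -np 4 ./my_mpi_app
```

Either form restricts the TCP BTL to eth1, so the machines file no longer needs the "x" hostnames to steer MPI traffic.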

Sorry for all the confusion.

--
Orion Poplawski
System Administrator                  303-415-9701 x222
NWRA/CoRA Division                    FAX: 303-415-9702
3380 Mitchell Lane                  or...@cora.nwra.com
Boulder, CO 80301              http://www.cora.nwra.com
