Hi,

> Open MPI should just "figure it out" and do the Right Thing at run-
> time -- is that not happening?
You are right, it should.
But I want to keep other traffic (NFS, traffic from other jobs, and so on)
out of the Open MPI communications, and use only a dedicated ethernet
interface for this purpose.

I have Open MPI 1.3.3 installed in the same directory on all compute nodes
and on the head node.
The OS is the same on the whole cluster: Debian 5.0.
On the compute nodes I have two interfaces: eth0 for NFS and so on,
and eth1 for Open MPI.
On the head node I have 5 interfaces: eth0 for NFS, eth4 for Open MPI.
The networks are:
1) Head node eth0 + nodes eth0    : 192.168.0.0/24
2) Head node eth4 + nodes eth1    : 192.168.1.0/24

So how can I configure Open MPI to use only network 2) for this purpose?
That is the first question.
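
Would something like the following work? This is just a guess on my side:
since the interface name differs between the head node (eth4) and the
compute nodes (eth1), I would put different btl_tcp_if_include values into
each machine's openmpi-mca-params.conf (the path assumes a default
installation prefix, and I am not sure whether the oob_tcp_if_include line
is also needed for the daemon traffic):

# on the compute nodes, in <prefix>/etc/openmpi-mca-params.conf
btl = self,sm,tcp
btl_tcp_if_include = eth1
oob_tcp_if_include = eth1

# on the head node, in <prefix>/etc/openmpi-mca-params.conf
btl = self,sm,tcp
btl_tcp_if_include = eth4
oob_tcp_if_include = eth4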

The other problem is this:
I tried to run some examples, but unfortunately they do not work.
Maybe the network is not configured correctly.

I can only run jobs on a single host, submitted from that same host.
When I submit from the head node to other nodes, for example, it hangs
without any messages.
On the node where the job is supposed to run I can see that an orted
daemon has been started.
(I use the default config files.)

Below are some examples:
mpirun -v --mca btl self,sm,tcp --mca btl_base_verbose 30 --mca 
btl_tcp_if_include eth0 -np 2 -host n10,n11 cpi
no output, no calculations, only an orted daemon on the nodes

mpirun -v --mca btl self,sm,tcp --mca btl_base_verbose 30 -np 2 -host n10,n11 
cpi
the same as above

mpirun -v --mca btl self,sm,tcp --mca btl_base_verbose 30 -np 2 -host n00,n00 
cpi
n00 is the head node - this one works and produces output.
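
If it helps to see where the launch hangs, I can also rerun one of the
failing cases with more launcher verbosity, e.g. (the verbosity level is
just a guess):

mpirun -v --mca plm_base_verbose 10 --mca btl self,sm,tcp --mca 
btl_base_verbose 30 -np 2 -host n10,n11 cpi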

on nodes:
route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.0.0      0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.168.1.0      0.0.0.0         255.255.255.0   U     0      0        0 eth1
0.0.0.0         192.168.0.100    0.0.0.0         UG    0      0        0 eth0

on head node:
192.168.0.0      0.0.0.0         255.255.255.0   U     0      0        0 eth0
...
192.168.1.0      0.0.0.0         255.255.255.0   U     0      0        0 eth4
0.0.0.0         192.168.100.1    0.0.0.0         UG    0      0        0 eth1

The compute nodes are named n01-n99; the head node is n00.

The hosts file looks like this and is the same on all nodes:

127.0.0.1       localhost

192.168.0.1     n01.local   n01
192.168.0.2     n02.local   n02
...
192.168.0.99   n99.local   n99

192.168.1.1     n01e.local   n01e
192.168.1.2     n02e.local   n02e
...
192.168.1.99   n99e.local   n99e

/etc/host.conf:
multi on
order hosts,bind

/etc/resolv.conf:
search local
nameserver 127.0.0.1

DNS is not installed.

/etc/nsswitch.conf:
...
hosts:          files dns
networks:       files


Thanks for the help.

> I want to use for openmpi communication the additional ethernet
> interfaces on node and head node.
> its is eth1 on nodes and eth4 on head node.
> So how can I configure openmpi?
>
> If I add in config file
> btl_base_include=tcp,sm,self.
> btl_tcp_if_include=eth1
>
> will it work or not?
>
> And how is it working with torque batch system (daemons listen eth0
> on all nodes).
