Re: [OMPI users] forcing MPI to bind all sockets to 127.0.0.1

2007-05-30 Thread Bill Saphir


George,

This is one of the things I tried, and setting the oob interface did
not work, with the error message below.

Also, per this thread:
http://www.open-mpi.org/community/lists/users/2007/05/3319.php
I believe it is oob_tcp_include, not oob_tcp_if_include. The latter is
silently ignored in 1.2, as far as I can tell.

Interestingly, telling the MPI layer to use lo0 (or to not use tcp at
all) works fine. But when I try to do the same for the OOB layer, it
complains. The full error is:


[mymac.local:07001] [0,0,0] mca_oob_tcp_init: invalid address '' returned for selected oob interfaces.
[mymac.local:07001] [0,0,0] ORTE_ERROR_LOG: Error in file oob_tcp.c at line 1196


mpirun actually hangs at this point and no processes are spawned. I
have to ^C to stop it.

I see this behavior on both Mac OS and on Linux with 1.2.2.
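
A minimal sketch of the two command lines discussed in this thread, built
in Python so that the only difference between them, the name of the OOB
parameter, is explicit. The helper names mca_args and loopback_command and
the program name ./a.out are hypothetical; the parameter names and lo0 are
taken from the messages here (on Linux the loopback interface is normally
called lo). Which OOB parameter name a given Open MPI release accepts is
not settled by this thread, so treat both as candidates rather than as a
fix.

```python
def mca_args(params):
    """Turn {name: value} into a flat ['--mca', name, value, ...] list."""
    argv = []
    for name, value in params.items():
        argv += ["--mca", name, value]
    return argv


def loopback_command(program, nprocs, iface="lo0",
                     oob_param="oob_tcp_if_include"):
    """Build (but do not run) an mpirun command restricted to one interface.

    oob_param is the OOB-layer parameter name; the thread mentions both
    'oob_tcp_if_include' and 'oob_tcp_include'.
    """
    params = {oob_param: iface, "btl_tcp_if_include": iface}
    return ["mpirun", "-np", str(nprocs)] + mca_args(params) + [program]


if __name__ == "__main__":
    # './a.out' is a placeholder program name, not something from the thread.
    for oob in ("oob_tcp_if_include", "oob_tcp_include"):
        print(" ".join(loopback_command("./a.out", 2, oob_param=oob)))
```

The script only prints the candidate command lines; running them is left
to the reader, since the OOB setting is reported above to fail on 1.2.2.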

Bill


George Bosilca wrote:

There are 2 sets of sockets: one for the oob layer and one for the
MPI layer (at least if TCP support is enabled). Therefore, in order
to achieve what you're looking for you should add to the command line
"--mca oob_tcp_if_include lo0 --mca btl_tcp_if_include lo0".

On May 29, 2007, at 3:58 PM, Bill Saphir wrote:




[OMPI users] forcing MPI to bind all sockets to 127.0.0.1

2007-05-29 Thread Bill Saphir


We have run into the following problem:

- start up Open MPI application on a laptop
- disconnect from network
- application hangs

I believe that the problem is that all sockets created by Open MPI are
bound to the external network interface. For example, when I start up a
2-process MPI job on my Mac (no hosts specified), I get the following
tcp connections. 192.168.5.2 is an address on my LAN.

tcp4   0  0  192.168.5.2.49459  192.168.5.2.49463  ESTABLISHED
tcp4   0  0  192.168.5.2.49463  192.168.5.2.49459  ESTABLISHED
tcp4   0  0  192.168.5.2.49456  192.168.5.2.49462  ESTABLISHED
tcp4   0  0  192.168.5.2.49462  192.168.5.2.49456  ESTABLISHED
tcp4   0  0  192.168.5.2.49456  192.168.5.2.49460  ESTABLISHED
tcp4   0  0  192.168.5.2.49460  192.168.5.2.49456  ESTABLISHED
tcp4   0  0  192.168.5.2.49456  192.168.5.2.49458  ESTABLISHED
tcp4   0  0  192.168.5.2.49458  192.168.5.2.49456  ESTABLISHED
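
Because both ends of each connection are on the same host, every TCP
connection appears twice in the listing above, once from each side. A
small Python sketch (the helper names port_of and summarize are made up
for illustration) pairs the lines up to count distinct connections and to
see which ports are shared:

```python
from collections import Counter

# The netstat lines quoted in this message, one connection end per line.
NETSTAT = """\
tcp4 0 0 192.168.5.2.49459 192.168.5.2.49463 ESTABLISHED
tcp4 0 0 192.168.5.2.49463 192.168.5.2.49459 ESTABLISHED
tcp4 0 0 192.168.5.2.49456 192.168.5.2.49462 ESTABLISHED
tcp4 0 0 192.168.5.2.49462 192.168.5.2.49456 ESTABLISHED
tcp4 0 0 192.168.5.2.49456 192.168.5.2.49460 ESTABLISHED
tcp4 0 0 192.168.5.2.49460 192.168.5.2.49456 ESTABLISHED
tcp4 0 0 192.168.5.2.49456 192.168.5.2.49458 ESTABLISHED
tcp4 0 0 192.168.5.2.49458 192.168.5.2.49456 ESTABLISHED
"""


def port_of(endpoint):
    """BSD-style netstat prints 'address.port'; the port follows the last dot."""
    return int(endpoint.rsplit(".", 1)[1])


def summarize(text):
    """Return (set of distinct connections, Counter of port -> connections)."""
    connections = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[5] != "ESTABLISHED":
            continue
        # An unordered pair, so both directions collapse to one connection.
        connections.add(frozenset((fields[3], fields[4])))
    usage = Counter(port_of(ep) for conn in connections for ep in conn)
    return connections, usage


connections, usage = summarize(NETSTAT)
print(len(connections), "distinct connections")
print("busiest port:", usage.most_common(1))
```

For the listing above this reports 4 distinct connections, with port
49456 taking part in three of them. That would be consistent with one
process acting as a common endpoint for the others, but nothing in the
listing says which process owns that port.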


Since this application is confined to a single machine, I would like it
to use 127.0.0.1, which will remain available as the laptop moves
around. I am unable to force it to bind sockets to this address,
however.

Some of the things I've tried are:
- explicitly setting the hostname to 127.0.0.1 (--host 127.0.0.1)
- turning off the tcp btl (--mca btl ^tcp) and other variations
  (--mca btl self,sm)
- using --mca oob_tcp_include lo0

The first two have no effect. The last one results in an error message
of:
[myhost.locall:05830] [0,0,0] mca_oob_tcp_init: invalid address '' returned for selected oob interfaces.


Is there any way to force Open MPI to bind all sockets to 127.0.0.1?

As a side question -- I'm curious what all of these tcp connections are
used for. As I increase the number of processes, it looks like there
are 4 sockets created per MPI process, without using the tcp btl.

Perhaps stdin/out/err + control?

Bill




[OMPI users] mpirun exit status for non-existent executable

2007-03-20 Thread Bill Saphir


If you ask mpirun to launch an executable that does not exist, it
fails, but returns an exit status of 0. This makes it difficult to
write scripts that invoke mpirun and need to check for errors. I'm
wondering a) whether this is considered a bug and b) whether it might
be fixed in a near-term release.


Example:

> orterun -np 2 asdflkj
--
Failed to find the following executable:

Host:       build-linux64
Executable: asdflkj

Cannot continue.
--
> echo $?
0


I see this behavior for both 1.2 and 1.1.x.

Thanks for your help.

Bill
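
One possible workaround for scripts, sketched in Python under the
assumption that the job runs on the local machine only: check that the
executable can be found before handing it to mpirun, instead of relying
on the exit status. The function names resolve_executable and
launch_checked are hypothetical. shutil.which searches the local PATH, so
this says nothing about whether the executable exists on remote nodes.

```python
import shutil
import subprocess


def resolve_executable(name):
    """Return the path of `name` if it is an executable file, else None.

    Bare names are looked up on the local PATH; names containing a
    directory part are checked as given.
    """
    return shutil.which(name)


def launch_checked(launcher_argv, executable, args=()):
    """Run `launcher_argv + [executable] + args` and return the exit status.

    Raises FileNotFoundError before launching anything if the executable
    cannot be found locally, so the caller gets an error even when the
    launcher itself would report status 0.
    """
    path = resolve_executable(executable)
    if path is None:
        raise FileNotFoundError(f"executable not found: {executable}")
    completed = subprocess.run([*launcher_argv, path, *args])
    return completed.returncode
```

With this, launch_checked(["mpirun", "-np", "2"], "asdflkj") raises
FileNotFoundError without calling mpirun, assuming no executable named
asdflkj is on the PATH.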