Hello

OMPI doesn't use ssh by default to launch a daemon local to mpirun - instead, we locally fork/exec the orted.

The problem here is that OMPI doesn't realize that you are launching on the local machine. This is usually caused by a mismatch between the IP address that the hostname returned by gethostname resolves to and the IP addresses actually configured on your machine.

Take a look at ifconfig and see what addresses are on your machine. Do any of them match the IP address OMPI is trying to launch to?
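
If you would rather check this from a script, here is a minimal Python sketch of the same comparison. It is only a diagnostic aid, not what OMPI does internally: it resolves the name returned by gethostname and then tries to bind a socket to each resulting address, which should only succeed for addresses assigned to an interface on this machine. The helper names resolve_hostname and is_local_address are made up for this example.

```python
import socket


def resolve_hostname(name):
    """Return the IPv4 addresses the resolver reports for name.

    Returns an empty list if the name cannot be resolved.
    """
    try:
        return socket.gethostbyname_ex(name)[2]
    except OSError:  # covers socket.gaierror and socket.herror
        return []


def is_local_address(addr):
    """Return True if addr can be bound on this machine.

    Binding normally fails for addresses that are not assigned
    to any local interface.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        try:
            s.bind((addr, 0))  # port 0: let the OS pick any free port
            return True
        except OSError:
            return False


if __name__ == "__main__":
    name = socket.gethostname()
    addrs = resolve_hostname(name)
    if not addrs:
        print(f"{name}: does not resolve to any IPv4 address")
    for addr in addrs:
        status = "local" if is_local_address(addr) else "NOT local"
        print(f"{name} -> {addr} ({status})")
```

If an address is reported as not local, the resolver entry for your hostname (for example in /etc/hosts or DNS) probably disagrees with what ifconfig shows.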

Ralph


On Nov 14, 2008, at 5:27 AM, Sun, Yongqi (E F ES EN 72) wrote:

Hello,

I have two questions about ssh and details follow.

Questions:

Is there any way to prevent the usage of ssh on my local desktop and
launch locally by default? (The FAQ page says "Also note that if using
a launcher that uses a hostfile and no hostfile is specified, all
processes are launched on the local host." Unfortunately, this is not
the case for me.)

If ssh/rsh has to be used, can I redirect the host to the local machine?
(I have tried adding "192.168.160.1" to /etc/hosts, but nothing changed.)
I want to use OpenMPI in Eclipse, where the "--hostfile" option cannot be
added to mpirun.

Details:

I'm using OpenMPI 1.2.8 on my Linux desktop (two quad-core AMD Opteron
2354). Although I always launch mpirun only on the local machine, ssh is
used by default. For example,
   shell% cd [openmpi-1.2.8]/examples

The code can be compiled (so IMHO the PATH and LD_LIBRARY_PATH are
correct)
   shell% mpicc -o hello_c hello_c.c

But when launched
   shell% mpirun -np 2 hello_c

There are runtime errors:

ssh: connect to host 192.168.160.1 port 22: No route to host
[W71c-140644:14261] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
[W71c-140644:14261] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1158
[W71c-140644:14261] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
[W71c-140644:14261] ERROR: A daemon on node 192.168.160.1 failed to start as expected.
[W71c-140644:14261] ERROR: There may be more information available from
[W71c-140644:14261] ERROR: the remote shell (see above).
[W71c-140644:14261] ERROR: The daemon exited unexpectedly with status 255.
[W71c-140644:14261] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
[W71c-140644:14261] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1190
--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons for this job.
Returned value Timeout instead of ORTE_SUCCESS.
--------------------------------------------------------------------------
<<ompi-output.tar.gz>>

However, I'm launching on my local desktop, where no "192.168.160.1"
exists. I have to specify a hostfile to make it work as expected:
   shell% mpirun -np 2 --hostfile myhostfile hello_c

where "myhostfile" contains the name of my local machine, "W71C-140644".

Best wishes

Sun, Yongqi
