You have two options:

1. Ensure that your PATH and LD_LIBRARY_PATH are exactly what you
think they are on the remote nodes. A common problem is that people
set up their PATH/LD_LIBRARY_PATH in the "interactive" portion of
their .bashrc, meaning that the variables are only set for
interactive logins (and therefore not set for non-interactive logins,
which is how orted gets launched over ssh). Try the following:

    ssh othernode 'echo $PATH'

Note the single quotes; they are necessary to ensure that "echo
$PATH" is evaluated on the *remote* node. Do the same with
$LD_LIBRARY_PATH and ensure that both are really set to the values
that you think they are. Check out the following FAQ entry:

http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path

2. Use the --prefix functionality in mpirun to automatically set the
PATH / LD_LIBRARY_PATH values for the remote node. Check out this
FAQ entry:

http://www.open-mpi.org/faq/?category=running#mpirun-prefix

Note that a synonym for the --prefix functionality that is not [yet]
mentioned in that FAQ entry is to invoke mpirun by its absolute
pathname. For example:

    /path/to/mpirun ...

Or you can pass OMPI 1.2's --enable-mpirun-prefix-by-default option
to OMPI's configure, which tells mpirun to always assume that it
needs --prefix-like behavior (without you needing to specify it on
the mpirun command line).
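
As a rough sketch of the check in option 1, something like the
following could be used. NODE and OMPI_PREFIX are hypothetical
placeholders (not values from this thread), and the script assumes
the usual bin/ and lib/ layout under the install prefix. The
comparison is an exact string match, so a trailing slash in the
remote PATH entry would count as a miss.

```shell
#!/bin/sh
# Sketch only: NODE and OMPI_PREFIX are hypothetical placeholders;
# substitute your own node name and Open MPI install prefix.
NODE=${NODE:-othernode}
OMPI_PREFIX=${OMPI_PREFIX:-/opt/openmpi}

# Return 0 if directory $2 is one of the colon-separated entries in $1.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Query the environment of a non-interactive ssh session on $NODE and
# report whether the assumed Open MPI bin/ and lib/ directories appear.
check_remote_env() {
    # Single quotes: the variables are expanded on the remote node.
    remote_path=$(ssh "$NODE" 'echo $PATH') || return 2
    remote_ld=$(ssh "$NODE" 'echo $LD_LIBRARY_PATH') || return 2

    status=0
    path_contains "$remote_path" "$OMPI_PREFIX/bin" ||
        { echo "PATH on $NODE lacks $OMPI_PREFIX/bin"; status=1; }
    path_contains "$remote_ld" "$OMPI_PREFIX/lib" ||
        { echo "LD_LIBRARY_PATH on $NODE lacks $OMPI_PREFIX/lib"; status=1; }
    return $status
}
```

Nothing runs until you call check_remote_env yourself, e.g.
"NODE=ps3-2 OMPI_PREFIX=/your/prefix" followed by "check_remote_env".
If it reports a missing directory, either fix the shell startup files
on that node or fall back to option 2.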
Hope that helps.
On Jun 12, 2007, at 11:58 PM, lichanjua...@lzu.cn wrote:
On Wed, 2007-06-13 at 11:47 +0800, lichanjua...@lzu.cn wrote:
Hi all,

I am a first-time user of Open MPI; I have used MPICH before. I found
that there are many differences between them, so I am confused.

I built Open MPI on a PS3 using the default options, that is:

    $ ./configure --prefix=
    $ make all install

I modified my .bash_profile file and added the Open MPI library and
executable directories to LD_LIBRARY_PATH and PATH.

I use an NFS file system between the server and the node; I only
installed Open MPI on the server.

I checked the mailing list and FAQ and learned that the default
launcher is ssh, but I still added "pls_rsh_agent = ssh" to
openmpi-mca-params.conf.

I tested the hello_c.c example. When I run:

    $ mpiexec -host ps3-2 -n 4 ./hello

it runs correctly (ps3-2 is the hostname of the server). I tried it
on each node. But when I run:

    $ mpiexec -hostfile host.txt -n 4 ./hello

with the following content in host.txt:

    ps3-1
    ps3-2

there is an error message:

bash: orted: command not found
[ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
[ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1164
[ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
[ps3-1:25154] ERROR: A daemon on node ps3-2 failed to start as expected.
[ps3-1:25154] ERROR: There may be more information available from
[ps3-1:25154] ERROR: the remote shell (see above).
[ps3-1:25154] ERROR: The daemon exited unexpectedly with status 127.
[ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
[ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1196
--------------------------------------------------------------------------
mpiexec was unable to cleanly terminate the daemons for this job.
Returned value Timeout instead of ORTE_SUCCESS.
--------------------------------------------------------------------------

I searched for the same problem in the mailing list and FAQ, which
say that PATH and LD_LIBRARY_PATH are not set correctly, but I have
made sure that they are set in my environment.

This is my first time using Open MPI, so I hope somebody can help me.
Thanks a lot!
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
Sorry, I forgot some information. I use Open MPI 1.2. I also tried to
run the command against a remote host, for example on ps3-1:

    $ mpiexec -host ps3-2 -n 2 ./a.out

The same error message appears. I think there is something wrong with
rsh/ssh, but I don't know what to modify or which file I missed.
If someone has met the same problem, please tell me the solution. I
will be grateful. Thanks very much!

Li chanjuan
--
Li, Chanjuan Lanzhou University
Distributed & Embedded System Lab http://dslab.lzu.edu.cn
School of Information Science and Engineering
lichanjua...@lzu.cn
Tianshui South Road 222. Lanzhou
730000 .P.R.China
Tel:+86-931-8912025 Fax:+86-931-8912022
--
Jeff Squyres
Cisco Systems