Hi all,

I am a first-time user of Open MPI; I have used MPICH before. I found that there are many differences between them, so I am confused.

I built Open MPI on a PS3 using the default options, that is:

    $ ./configure --prefix=
    $ make all install

I modified my .bash_profile to add the Open MPI library and executable directories to LD_LIBRARY_PATH and PATH. I use NFS between the server and the nodes, and I installed Open MPI only on the server.

I checked the mailing list and the FAQ and learned that the default launcher is ssh, but I still added "pls_rsh_agent = ssh" to openmpi-mca-params.conf.

I tested the hello_c.c example. When I run:

    $ mpiexec -host ps3-2 -n 4 ./hello

it runs correctly (ps3-2 is the hostname of the server). I tried it on each node.

But when I run:

    $ mpiexec -hostfile host.txt -n 4 ./hello

with this content in host.txt:

    ps3-1
    ps3-2

I get the following error message:

    bash: orted: command not found
    [ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
    [ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1164
    [ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
    [ps3-1:25154] ERROR: A daemon on node ps3-2 failed to start as expected.
    [ps3-1:25154] ERROR: There may be more information available from
    [ps3-1:25154] ERROR: the remote shell (see above).
    [ps3-1:25154] ERROR: The daemon exited unexpectedly with status 127.
    [ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
    [ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1196
    --------------------------------------------------------------------------
    mpiexec was unable to cleanly terminate the daemons for this job.
    Returned value Timeout instead of ORTE_SUCCESS.
    --------------------------------------------------------------------------

I searched the mailing list and the FAQ for the same problem. They say it happens when PATH and LD_LIBRARY_PATH are not set correctly, but I have made sure that both include the Open MPI directories.

This is my first time using Open MPI, so I hope somebody can help me. Thanks a lot!
--
Li, Chanjuan
Lanzhou University
Distributed & Embedded System Lab
http://dslab.lzu.edu.cn
School of Information Science and Engineering
lichanjua...@lzu.cn
Tianshui South Road 222, Lanzhou 730000, P.R. China
Tel: +86-931-8912025
Fax: +86-931-8912022