Hi all,
    I am a first-time user of Open MPI; I have used MPICH before. I found
there are many differences between them, so I am confused.
        I built Open MPI on a PS3 using the default options, that is:
          $ ./configure --prefix=
          $ make all install
        I modified my .bash_profile file to add the Open MPI library and
        executable directories to LD_LIBRARY_PATH and PATH.
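        For illustration, .bash_profile lines of this kind would look
        roughly like the sketch below. The prefix /opt/openmpi is a
        placeholder assumed for the example, not the path actually passed
        to configure, and OMPI_PREFIX is just a convenience variable.

```shell
# Hypothetical install prefix (assumption); substitute the value that was
# passed to ./configure --prefix=
OMPI_PREFIX=/opt/openmpi

# Put the Open MPI executables (mpiexec, orted, ...) first on PATH
PATH="$OMPI_PREFIX/bin:$PATH"

# Let the dynamic loader find the Open MPI shared libraries.
# ${VAR:+...} avoids a stray trailing colon when LD_LIBRARY_PATH was unset.
LD_LIBRARY_PATH="$OMPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

export PATH LD_LIBRARY_PATH
```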
        I use an NFS file system shared between the server and the nodes,
        so I installed Open MPI only on the server.
        I checked the mailing list and the FAQ and know that the default
        launcher is ssh, but I still added "pls_rsh_agent = ssh" to
        openmpi-mca-params.conf.
        
        I tested the hello_c.c example. When I run:
                $ mpiexec -host ps3-2 -n 4 ./hello
        it runs correctly (ps3-2 is the hostname of the server). I tried
        this on each node.
        But when I run:
                $ mpiexec -hostfile host.txt -n 4 ./hello
        
        Contents of host.txt:
        ps3-1
        ps3-2
        
        I get this error message:
        
        bash: orted: command not found
        [ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
        [ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1164
        [ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
        [ps3-1:25154] ERROR: A daemon on node ps3-2 failed to start as expected.
        [ps3-1:25154] ERROR: There may be more information available from
        [ps3-1:25154] ERROR: the remote shell (see above).
        [ps3-1:25154] ERROR: The daemon exited unexpectedly with status 127.
        [ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
        [ps3-1:25154] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1196
        --------------------------------------------------------------------------
        mpiexec was unable to cleanly terminate the daemons for this job.
        Returned value Timeout instead of ORTE_SUCCESS.
        --------------------------------------------------------------------------
        I searched the mailing list and the FAQ for the same problem. They
        say it means PATH and LD_LIBRARY_PATH are not set correctly, but I
        have made sure that both include the Open MPI directories.
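        One way to check whether the launcher really sees those settings
        is to look up orted through the same kind of non-interactive ssh
        command that mpiexec uses. A login shell reads .bash_profile, but
        a command run as "ssh host command" typically does not, so the
        two environments can differ. The sketch below is only a
        diagnostic under that assumption; check_orted and REMOTE_SHELL
        are names made up for this example.

```shell
# Hypothetical helper (not part of Open MPI): look up orted on a remote
# host through a non-interactive remote shell, as the rsh/ssh launcher does.
# REMOTE_SHELL is an assumed override; it defaults to ssh.
check_orted() {
    host=$1
    # The single quotes make $PATH expand on the remote side, not locally.
    ${REMOTE_SHELL:-ssh} "$host" \
        'command -v orted || echo "orted not found; PATH=$PATH"'
}
```

        Calling "check_orted ps3-2" from ps3-1 prints either the full
        path of orted on ps3-2 or the PATH that the non-interactive shell
        actually received there.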
        This is my first time using Open MPI, so I hope somebody can help
        me. Thanks a lot!
-- 
Li, Chanjuan                                        Lanzhou University
Distributed & Embedded System Lab              http://dslab.lzu.edu.cn
School of Information Science and Engineering        lichanjua...@lzu.cn
Tianshui South Road 222. Lanzhou 730000                      .P.R.China
Tel:+86-931-8912025                                Fax:+86-931-8912022
