Hi All,

I want to submit an Open MPI job to Grid Engine 6.1u3.
I compiled Open MPI 1.4.3 and configured it with the --with-sge option.
If my job uses a single node, it finishes without any error message.
If my job uses 2 nodes, I get the following error message:
error: executing task of job 118325 failed:
--------------------------------------------------------------------------
A daemon (pid 32414) died unexpectedly with status 1 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
mpirun: clean termination accomplished

hpca04028.pcf.sinica.edu.tw: rm -rf /usrtmp/118325.1.q0-em64t-ge
hpca04028.pcf.sinica.edu.tw: rm -rf /usrtmp/118325.1.q0-em64t-ge/qrsh_client_cache

My script is as follows:
#$ -S /bin/sh
#$ -q q0-em64t-ge
#$ -pe mpich 2
#$ -v OMPI_MCA_pls_gridengine_verbose=1,OMPI_MCA_plm_rsh_agent=rsh,LD_LIBRARY_PATH=/usr1/wzlu/openmpi/1.4.3/lib:/lib64:/lib:/usr/lib64:/usr/lib
#$ -cwd
source ~/openmpi/openmpi.sh
mpirun -np $NPROCS cpi
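
Since the error text points at LD_LIBRARY_PATH, one thing that could be checked on each node is whether the directories listed in that variable actually exist there. The following is only a rough sketch of such a check; the function name check_ld_path is made up for illustration and is not part of Open MPI or Grid Engine.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI or Grid Engine):
# report which directories of a colon-separated library path
# exist on the host where this script runs.
check_ld_path() {
    old_ifs=$IFS
    IFS=:
    # With IFS set to ":", the unquoted $1 is split into one word per directory.
    for dir in $1; do
        if [ -d "$dir" ]; then
            echo "ok      $dir"
        else
            echo "missing $dir"
        fi
    done
    IFS=$old_ifs
}

# Check the path that the current environment would use.
check_ld_path "${LD_LIBRARY_PATH:-}"
```

Running this on both the node where mpirun starts and the second node should show whether a directory such as the Open MPI lib directory is visible on one host but not the other.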

Do you have any suggestions? Thanks.

Best Regards,
Lu