Add -mca plm_base_verbose 5 --leave-session-attached to the mpirun command 
line. That will show the ssh command being used to start each orted.
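
As a rough sketch (not tested on your system), the helper below only prints 
your mpirun command with the two options added, so you can inspect it before 
running it. The mpirun path and the -x options are copied from your message; 
the function name debug_cmd and its argument order are made up for 
illustration.

```shell
#!/bin/sh
# Sketch: print the mpirun command with the two debug options added.
# MPIRUN and the -x options are copied from the quoted message;
# debug_cmd is a hypothetical helper name, not part of Open MPI.
MPIRUN=/release/cfd/openmpi-intel/bin/mpirun
DEBUG_OPTS="-mca plm_base_verbose 5 --leave-session-attached"

debug_cmd() {
    # $1 = machinefile, $2 = process count, rest = application and its args
    machinefile=$1; np=$2; shift 2
    echo "$MPIRUN $DEBUG_OPTS --machinefile $machinefile -np $np" \
         "-x LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 $*"
}
```

Called as debug_cmd /var/spool/PBS/aux/20761.maruhpc4-mgt 160 
/tmp/fv420761.maruhpc4-mgt/falconv4_openmpi_jsgl -v ..., it echoes the full 
command line; paste that into the job script once it looks right.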

On Dec 14, 2012, at 12:17 PM, "Blosch, Edwin L" <edwin.l.blo...@lmco.com> wrote:

> I am having a weird problem launching cases with OpenMPI 1.4.3.  It is most 
> likely a problem with a particular node of our cluster, as the jobs run 
> fine on some submissions but not on others.  It seems to depend on 
> the node list.  I am having trouble diagnosing which node is at fault and 
> what the nature of its problem is.
>  
> One or more of the orted daemons report that they cannot find an Intel 
> math library.  The error is:
> /release/cfd/openmpi-intel/bin/orted: error while loading shared libraries: 
> libimf.so: cannot open shared object file: No such file or directory
>  
> I’ve checked the environment just before launching mpirun, and 
> LD_LIBRARY_PATH includes the necessary component to point to where the Intel 
> shared libraries are located.  Furthermore, my mpirun command line says to 
> export the LD_LIBRARY_PATH variable:
> Executing ['/release/cfd/openmpi-intel/bin/mpirun', '--machinefile 
> /var/spool/PBS/aux/20761.maruhpc4-mgt', '-np 160', '-x LD_LIBRARY_PATH', '-x 
> MPI_ENVIRONMENT=1', '/tmp/fv420761.maruhpc4-mgt/falconv4_openmpi_jsgl', '-v', 
> '-cycles', '10000', '-ri', 'restart.1', '-ro', 
> '/tmp/fv420761.maruhpc4-mgt/restart.1']
>  
> My shell-initialization script (.bashrc) does not overwrite LD_LIBRARY_PATH.  
> OpenMPI is built explicitly --without-torque and should be using ssh to 
> launch the orted.
>  
> What options can I add to get more debugging output for problems launching 
> orted?
>  
> Thanks,
>  
> Ed
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
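
To narrow down which node is at fault, one option is to run ldd on orted 
over ssh on every host in the machinefile. This is only a sketch under a few 
assumptions: passwordless ssh works to every host, the machinefile has one 
hostname per line (repeats are fine), and the orted path is the one from the 
error message. The names check_nodes, ORTED and RSH are made up for 
illustration. As far as I know, -x applies to the launched application 
processes, not to orted itself, which starts in whatever environment a 
non-interactive ssh login provides, so that is the environment this check 
uses.

```shell
#!/bin/sh
# Sketch: list hosts where orted has unresolved shared libraries when
# started from a non-interactive ssh session.
# ORTED is taken from the error message; RSH can be overridden for a dry run.
ORTED=${ORTED:-/release/cfd/openmpi-intel/bin/orted}
RSH=${RSH:-ssh}

check_nodes() {
    # $1 = machinefile; sort -u collapses the per-slot repeats
    sort -u "$1" | while read -r host _; do
        [ -n "$host" ] || continue
        # ldd prints "libfoo.so => not found" for each unresolved library.
        # </dev/null keeps ssh from swallowing the rest of the host list.
        missing=$($RSH "$host" ldd "$ORTED" 2>&1 </dev/null | grep 'not found' || true)
        if [ -n "$missing" ]; then
            echo "$host: $missing"
        fi
    done
}
```

Running check_nodes /var/spool/PBS/aux/20761.maruhpc4-mgt inside the job 
should print a line for each host that cannot resolve libimf.so, and nothing 
for healthy hosts.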
