Under the hood, mpirun starts the daemon on each remote node by running

    <remote_exec> orted

and your remote_exec does not propagate LD_LIBRARY_PATH. One option is to configure your remote_exec to do so, but I would rather suggest you re-configure Open MPI with --enable-orterun-prefix-by-default.

If your remote_exec is ssh (that is, if you are not running under a supported batch manager), then

    ssh node188 ldd $path_to_openmpi_bin/orted

should show zero unresolved libraries.
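A minimal sketch of how that ldd check could be scripted, assuming ldd on the node reports a missing library in the usual `name => not found` form. The helper name unresolved_libs is hypothetical and not part of Open MPI; node188 and $path_to_openmpi_bin are the placeholders already used in this thread.

```shell
#!/bin/sh
# Hypothetical helper: reads ldd output on stdin and prints the name of
# every library reported as "not found". Prints nothing when all
# libraries resolve.
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}

# Intended use (not executed here; node name and path come from the thread):
#   ssh node188 ldd "$path_to_openmpi_bin/orted" | unresolved_libs
```

If this prints libimf.so, the non-interactive ssh environment on that node does not have the directory containing libimf.so in its library search path, which matches the orted error quoted below.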
Cheers,

Gilles

On Sunday, April 9, 2017, Ilchenko Evgeniy <ilchenk...@gmail.com> wrote:

> Hi!
>
> Problem with random segfault for java-programs solved by adding mca
> options:
>
> $path_to_openmpi_bin/mpirun -np 1 -mca btl self,sm,openib $path_to_java_bin/java randomTest
>
> Thanks to Eshsou Hashba and Michael Kalugin!
>
>
> But i get other problems!
>
> If I start mpirun from manager-node (without ssh-login to calculation node)
>
> $path_to_openmpi_bin/mpirun -np 2 -host node188,node189 -mca btl self,sm,openib $path_to_java_bin/java randomTest
>
> I get next error:
>
>
> $openmpi1.10_folder/bin/orted: error while loading shared libraries:
> libimf.so: cannot open shared object file: No such file or directory
> --------------------------------------------------------------------------
> ORTE was unable to reliably start one or more daemons.
> This usually is caused by:
>
> * not finding the required libraries and/or binaries on
>   one or more nodes. Please check your PATH and LD_LIBRARY_PATH
>   settings, or configure OMPI with --enable-orterun-prefix-by-default
>
> * lack of authority to execute on one or more specified nodes.
>   Please verify your allocation and authorities.
>
> * the inability to write startup files into /tmp
>   (--tmpdir/orte_tmpdir_base).
>   Please check with your sys admin to determine the correct location to
>   use.
>
> * compilation of the orted with dynamic libraries when static are required
>   (e.g., on Cray). Please check your configure cmd line and consider using
>   one of the contrib/platform definitions for your system type.
>
> * an inability to create a connection back to mpirun due to a
>   lack of common network interfaces and/or no route found between
>   them. Please check network connectivity (including firewalls
>   and network routing requirements).
> --------------------------------------------------------------------------
>
> If I throw LD_LIBRARY_PATH (that contain path to libimf.so) via -x option
> to mpirun:
>
> $path_to_openmpi_bin/mpirun -x LD_LIBRARY_PATH -np 2 -host node188,node189 -mca btl self,sm,openib $path_to_java_bin/java randomTest
>
> then I get same error (orted: error while loading shared libraries:
> libimf.so: cannot open shared object file: No such file or directory).
>
> How I can throw lib path for spawned mpi processes and orted?
> I don't have root-privileges on this cluster.