Sorry for jumping in late... The /lib vs. /lib64 handling in --prefix was
definitely broken until recently; it has been fixed in the 1.1 series.
Specifically, OMPI now takes the prefix that you provided and appends the
basename of the local $libdir. So if you configured OMPI with something
like:
    shell$ ./configure --libdir=/some/path/lib64 ...

and then you run:

    shell$ mpirun --prefix /some/path ...

then OMPI will add /some/path/lib64 to the remote LD_LIBRARY_PATH. The
previous behavior always added "/lib" to the remote LD_LIBRARY_PATH,
regardless of what the local $libdir was (i.e., it ignored the basename
of your $libdir).

If you have a situation more complicated than this (e.g., your $libdir
differs from your prefix by more than just the basename), then --prefix
is not the solution for you. Instead, you'll need to set your $PATH and
$LD_LIBRARY_PATH properly on all nodes (e.g., in your shell startup
files). --prefix is meant to be an easy workaround for the common
configuration where $libdir is a subdirectory under $prefix.

Another random note: invoking mpirun with an absolute path (e.g.,
/path/to/bin/mpirun) is exactly the same as specifying --prefix
/path/to -- so you don't have to do both.

> -----Original Message-----
> From: users-boun...@open-mpi.org
> [mailto:users-boun...@open-mpi.org] On Behalf Of Eric Thibodeau
> Sent: Friday, June 16, 2006 11:47 AM
> To: pak....@sun.com; Open MPI Users
> Subject: Re: [OMPI users] pls:rsh: execv failed with errno=2
>
> Thanks for pointing out LD_LIBRARY_PATH64... that explains much. As
> for the original error, I am still "a duck out of water". I will try
> the 1.1rxxx trunk though (creating an ebuild for it as we speak).
>
> Eric
>
> On Friday, June 16, 2006 at 11:44, Pak Lui wrote:
> > Hi Eric,
> >
> > I started to see what you are saying. You tried to point out that
> > you are using lib64 as the libdir instead of just lib, and somehow
> > it doesn't get picked up.
> >
> > I personally have not tried this option, so I don't think I can
> > help you much here. But I saw that there are changes in the rsh pls
> > module for the trunk and 1.1 versions (r9930, 9931, 10207, 10214)
> > that may solve your lib64 issue.
> > If you do ldd on a.out, it'd show the libraries it linked to.
> > Other than that, setting LD_LIBRARY_PATH64 shouldn't make a
> > difference either.
> >
> > I am not sure if others can help you on this.
> >
> > Eric Thibodeau wrote:
> > > Hello,
> > >
> > > I don't want to get too much off topic in this reply, but you're
> > > bringing out a point here. I am unable to run MPI apps on the
> > > AMD64 platform with the regular exporting of $LD_LIBRARY_PATH
> > > and $PATH; this is why I have no choice but to revert to using
> > > the --prefix approach. Here are a few execution examples to
> > > demonstrate my point:
> > >
> > > kyron@headless ~ $ /usr/lib64/openmpi/1.0.2-gcc-4.1/bin/mpirun
> > > --prefix /usr/lib64/openmpi/1.0.2-gcc-4.1/ -np 2 ./a.out
> > > ./a.out: error while loading shared libraries: libmpi.so.0:
> > > cannot open shared object file: No such file or directory
> > >
> > > kyron@headless ~ $ /usr/lib64/openmpi/1.0.2-gcc-4.1/bin/mpirun
> > > --prefix /usr/lib64/openmpi/1.0.2-gcc-4.1/lib64/ -np 2 ./a.out
> > > [headless:10827] pls:rsh: execv failed with errno=2
> > > [headless:10827] ERROR: A daemon on node localhost failed to
> > > start as expected.
> > > [headless:10827] ERROR: There may be more information available
> > > from
> > > [headless:10827] ERROR: the remote shell (see above).
> > > [headless:10827] ERROR: The daemon exited unexpectedly with
> > > status 255.
> > >
> > > kyron@headless ~ $ cat opmpi64.sh
> > > #!/bin/bash
> > > MPI_BASE='/usr/lib64/openmpi/1.0.2-gcc-4.1'
> > > export PATH=$PATH:${MPI_BASE}/bin
> > > LD_LIBRARY_PATH=${MPI_BASE}/lib64
> > >
> > > kyron@headless ~ $ . opmpi64.sh
> > > kyron@headless ~ $ mpirun -np 2 ./a.out
> > > ./a.out: error while loading shared libraries: libmpi.so.0:
> > > cannot open shared object file: No such file or directory
> > >
> > > kyron@headless ~ $
> > >
> > > Eric
> > >
> > > On Friday, June 16, 2006 at 10:31, Pak Lui wrote:
> > > > Hi, I noticed your prefix set to the lib dir, can you try
> > > > without the lib64 part and rerun?
> > > >
> > > > Eric Thibodeau wrote:
> > > > > Hello everyone,
> > > > >
> > > > > Well, first off, I hope this problem I am reporting is of
> > > > > some validity. I tried finding similar situations off Google
> > > > > and the mailing list but came up with only one reference
> > > > > [1], which seems invalid in my case since all executions are
> > > > > local (naïve assumption that it makes a difference on the
> > > > > calling stack). I am trying to run a simple HelloWorld using
> > > > > OpenMPI 1.0.2 on an AMD64 machine and a Sun Enterprise (12
> > > > > procs) machine.
> > > > > In both cases I get the following error:
> > > > >
> > > > > pls:rsh: execv failed with errno=2
> > > > >
> > > > > Here is the mpirun -d trace when running my HelloWorld (on
> > > > > AMD64):
> > > > >
> > > > > kyron@headless ~ $ mpirun -d --prefix
> > > > > /usr/lib64/openmpi/1.0.2-gcc-4.1/lib64/ -np 4 ./hello
> > > > > [headless:10461] procdir: (null)
> > > > > [headless:10461] jobdir: (null)
> > > > > [headless:10461] unidir:
> > > > > /tmp/openmpi-sessions-kyron@headless_0/default-universe
> > > > > [headless:10461] top: openmpi-sessions-kyron@headless_0
> > > > > [headless:10461] tmp: /tmp
> > > > > [headless:10461] [0,0,0] setting up session dir with
> > > > > [headless:10461] tmpdir /tmp
> > > > > [headless:10461] universe default-universe-10461
> > > > > [headless:10461] user kyron
> > > > > [headless:10461] host headless
> > > > > [headless:10461] jobid 0
> > > > > [headless:10461] procid 0
> > > > > [headless:10461] procdir:
> > > > > /tmp/openmpi-sessions-kyron@headless_0/default-universe-10461/0/0
> > > > > [headless:10461] jobdir:
> > > > > /tmp/openmpi-sessions-kyron@headless_0/default-universe-10461/0
> > > > > [headless:10461] unidir:
> > > > > /tmp/openmpi-sessions-kyron@headless_0/default-universe-10461
> > > > > [headless:10461] top: openmpi-sessions-kyron@headless_0
> > > > > [headless:10461] tmp: /tmp
> > > > > [headless:10461] [0,0,0] contact_file
> > > > > /tmp/openmpi-sessions-kyron@headless_0/default-universe-10461/universe-setup.txt
> > > > > [headless:10461] [0,0,0] wrote setup file
> > > > > [headless:10461] spawn: in job_state_callback(jobid = 1, state = 0x1)
> > > > > [headless:10461] pls:rsh: local csh: 0, local bash: 1
> > > > > [headless:10461] pls:rsh: assuming same remote shell as local shell
> > > > > [headless:10461] pls:rsh: remote csh: 0, remote bash: 1
> > > > > [headless:10461] pls:rsh: final template argv:
> > > > > [headless:10461] pls:rsh: /usr/bin/ssh <template> orted --debug
> > > > > --bootproxy 1 --name <template> --num_procs 2 --vpid_start 0
> > > > > --nodename <template> --universe
> > > > > kyron@headless:default-universe-10461 --nsreplica
> > > > > "0.0.0;tcp://142.137.135.124:37657;tcp://192.168.1.1:37657"
> > > > > --gprreplica
> > > > > "0.0.0;tcp://142.137.135.124:37657;tcp://192.168.1.1:37657"
> > > > > --mpi-call-yield 0
> > > > > [headless:10461] pls:rsh: launching on node localhost
> > > > > [headless:10461] pls:rsh: oversubscribed -- setting
> > > > > mpi_yield_when_idle to 1 (1 4)
> > > > > [headless:10461] pls:rsh: localhost is a LOCAL node
> > > > > [headless:10461] pls:rsh: reset PATH:
> > > > > /usr/lib64/openmpi/1.0.2-gcc-4.1/lib64/bin:/usr/local/bin:/usr/bin:/bin:/usr/x86_64-pc-linux-gnu/gcc-bin/4.1.1:/opt/c3-4:/usr/qt/3/bin:/usr/lib64/openmpi/1.0.2-gcc-4.1/bin
> > > > > [headless:10461] pls:rsh: reset LD_LIBRARY_PATH:
> > > > > /usr/lib64/openmpi/1.0.2-gcc-4.1/lib64/lib
> > > > > [headless:10461] pls:rsh: changing to directory /home/kyron
> > > > > [headless:10461] pls:rsh: executing: orted --debug --bootproxy 1
> > > > > --name 0.0.1 --num_procs 2 --vpid_start 0 --nodename localhost
> > > > > --universe kyron@headless:default-universe-10461 --nsreplica
> > > > > "0.0.0;tcp://142.137.135.124:37657;tcp://192.168.1.1:37657"
> > > > > --gprreplica
> > > > > "0.0.0;tcp://142.137.135.124:37657;tcp://192.168.1.1:37657"
> > > > > --mpi-call-yield 1
> > > > > [headless:10461] pls:rsh: execv failed with errno=2
> > > > > [headless:10461] sess_dir_finalize: proc session dir not empty - leaving
> > > > > [headless:10461] spawn: in job_state_callback(jobid = 1, state = 0xa)
> > > > > [headless:10461] sess_dir_finalize: proc session dir not empty - leaving
> > > > > [headless:10461] sess_dir_finalize: proc session dir not empty - leaving
> > > > > [headless:10461] sess_dir_finalize: proc session dir not empty - leaving
> > > > > [headless:10461] spawn: in job_state_callback(jobid = 1, state = 0x9)
> > > > > [headless:10461] ERROR: A daemon on node localhost failed to
> > > > > start as expected.
> > > > > [headless:10461] ERROR: There may be more information available from
> > > > > [headless:10461] ERROR: the remote shell (see above).
> > > > > [headless:10461] ERROR: The daemon exited unexpectedly with status 255.
> > > > > [headless:10461] sess_dir_finalize: found proc session dir empty - deleting
> > > > > [headless:10461] sess_dir_finalize: found job session dir empty - deleting
> > > > > [headless:10461] sess_dir_finalize: found univ session dir empty - deleting
> > > > > [headless:10461] sess_dir_finalize: top session dir not empty - leaving
> > > > >
> > > > > The two platforms are very different, one is AMD64 (dual
> > > > > Opteron) with GCC 4.1.1 (Gentoo), the other is SUN OS 5.8
> > > > > with GCC 3.4.2.
> > > > > OpenMPI was compiled with the following options (extracted
> > > > > from the config.status):
> > > > >
> > > > > AMD64:
> > > > >
> > > > > Open MPI config.status 1.0.2
> > > > > configured by ./configure, generated by GNU Autoconf 2.59,
> > > > > with options \"'--prefix=/usr' '--host=x86_64-pc-linux-gnu'
> > > > > '--mandir=/usr/share/man' '--infodir=/usr/share/info'
> > > > > '--datadir=/usr/share' '--sysconfdir=/etc'
> > > > > '--localstatedir=/var/lib'
> > > > > '--prefix=/usr/lib64/openmpi/1.0.2-gcc-4.1'
> > > > > '--datadir=/usr/share/openmpi/1.0.2-gcc-4.1'
> > > > > '--program-suffix=-1.0.2-gcc-4.1'
> > > > > '--sysconfdir=/etc/openmpi/1.0.2-gcc-4.1'
> > > > > '--enable-pretty-print-stacktrace'
> > > > > '--libdir=/usr/lib64/openmpi/1.0.2-gcc-4.1/lib64'
> > > > > '--build=x86_64-pc-linux-gnu' '--cache-file' 'config.cache'
> > > > > 'CFLAGS=-march=opteron -O2 -pipe -ftracer -fprefetch-loop-arrays
> > > > > -mfpmath=sse -ffast-math -ftree-vectorize -floop-optimize2'
> > > > > 'CXXFLAGS=-march=opteron -O2 -pipe -ftracer -fprefetch-loop-arrays
> > > > > -mfpmath=sse -ffast-math -ftree-vectorize -floop-optimize2'
> > > > > 'LDFLAGS= -Wl,-z,-noexecstack' 'build_alias=x86_64-pc-linux-gnu'
> > > > > 'host_alias=x86_64-pc-linux-gnu' --enable-ltdl-convenience\"
> > > > >
> > > > > SUN 5.8:
> > > > >
> > > > > Open MPI config.status 1.0.2
> > > > > configured by ./configure, generated by GNU Autoconf 2.59,
> > > > > with options
> > > > > \"'--prefix=/export/lca/home/lca0/etudiants/ac38820/openmpi'
> > > > > '--enable-pretty-print-stacktrace' 'CFLAGS=-mv8plus'
> > > > > 'CXXFLAGS=-mv8plus' --enable-ltdl-convenience\"
> > > > >
> > > > > x86 (as a working reference, configure options should be
> > > > > close to identical to the AMD64):
> > > > >
> > > > > Open MPI config.status 1.0.2
> > > > > configured by ./configure, generated by GNU Autoconf 2.59,
> > > > > with options \"'--prefix=/usr' '--host=i686-pc-linux-gnu'
> > > > > '--mandir=/usr/share/man' '--infodir=/usr/share/info'
> > > > > '--datadir=/usr/share' '--sysconfdir=/etc'
> > > > > '--localstatedir=/var/lib'
> > > > > '--prefix=/usr/lib/openmpi/1.0.2-gcc-4.1'
> > > > > '--datadir=/usr/share/openmpi/1.0.2-gcc-4.1'
> > > > > '--program-suffix=-1.0.2-gcc-4.1'
> > > > > '--sysconfdir=/etc/openmpi/1.0.2-gcc-4.1'
> > > > > '--enable-pretty-print-stacktrace' '--build=i686-pc-linux-gnu'
> > > > > '--cache-file' 'config.cache' 'CFLAGS=-march=nocona -O2 -pipe
> > > > > -fomit-frame-pointer' 'CXXFLAGS=-march=nocona -O2 -pipe
> > > > > -fomit-frame-pointer' 'LDFLAGS= -Wl,-z,-noexecstack'
> > > > > 'build_alias=i686-pc-linux-gnu' 'host_alias=i686-pc-linux-gnu'
> > > > > --enable-ltdl-convenience\"
> > > > >
> > > > > Any help would be greatly appreciated.
> > > > >
> > > > > Thanks.
> > > > >
> > > > > [1] http://gridengine.sunsource.net/servlets/ReadMsg?list=users&msgNo=15775
> > > > >
> > > > > --
> > > > > Eric Thibodeau
> > > > > Neural Bucket Solutions Inc.
> > > > > T. (514) 736-1436
> > > > > C. (514) 710-0517
> > > > >
> > > > > ------------------------------------------------------------------------
> > > > >
> > > > > _______________________________________________
> > > > > users mailing list
> > > > > us...@open-mpi.org
> > > > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> > >
> > > --
> > > Eric Thibodeau
> > > Neural Bucket Solutions Inc.
> > > T. (514) 736-1436
> > > C. (514) 710-0517
> > >
> > > ------------------------------------------------------------------------
> > >
> > > _______________________________________________
> > > users mailing list
> > > us...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> --
> Eric Thibodeau
> Neural Bucket Solutions Inc.
> T. (514) 736-1436
> C. (514) 710-0517
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
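For what it's worth, the path computation described at the top of this
thread can be sketched in a few lines of shell. This is purely an
illustrative sketch, not OMPI's actual implementation; the variable
names and example paths are invented for the demonstration:

```shell
#!/bin/sh
# Sketch of the 1.1-series --prefix fix: the remote LD_LIBRARY_PATH
# entry is the --prefix value plus the basename of the *local* $libdir,
# so a lib64 layout survives on the remote side.
local_libdir="/some/path/lib64"   # the --libdir recorded at configure time
prefix="/some/path"               # the value passed to mpirun --prefix

# Fixed behavior: keep only the basename of the local $libdir.
remote_libdir="${prefix}/$(basename "${local_libdir}")"
echo "${remote_libdir}"           # prints /some/path/lib64

# Old (broken) behavior hard-coded "lib", losing the lib64 suffix:
echo "${prefix}/lib"              # prints /some/path/lib
```

Note that the basename step is exactly why --prefix only helps when
$libdir is a direct subdirectory of $prefix, as stated above.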