Thanks for pointing out LD_LIBRARY_PATH64; that explains a lot. As for the original error, I am still "a duck out of water". I will try the 1.1rxxx trunk though (I am creating an ebuild for it as we speak).
Eric

On Friday, 16 June 2006 at 11:44, Pak Lui wrote:
> Hi Eric,
>
> I am starting to see what you are saying. You are pointing out that you
> set libdir to lib64 instead of just lib, and somehow it doesn't get
> picked up.
>
> I personally have not tried this option, so I don't think I can help
> you much here. But I saw that there are changes in the rsh pls module
> for the trunk and 1.1 versions (r9930, 9931, 10207, 10214) that may
> solve your lib64 issue. If you run ldd on a.out, it will show the
> libraries it is linked to. Other than that, setting LD_LIBRARY_PATH64
> shouldn't make a difference either.
>
> I am not sure if others can help you on this.
>
> Eric Thibodeau wrote:
> > Hello,
> >
> > I don't want to get too far off topic in this reply, but you're bringing
> > up a point here. I am unable to run MPI apps on the AMD64 platform with
> > the regular exporting of $LD_LIBRARY_PATH and $PATH, which is why I have
> > no choice but to revert to using the --prefix approach. Here are a few
> > execution examples to demonstrate my point:
> >
> > kyron@headless ~ $ /usr/lib64/openmpi/1.0.2-gcc-4.1/bin/mpirun --prefix /usr/lib64/openmpi/1.0.2-gcc-4.1/ -np 2 ./a.out
> > ./a.out: error while loading shared libraries: libmpi.so.0: cannot open shared object file: No such file or directory
> >
> > kyron@headless ~ $ /usr/lib64/openmpi/1.0.2-gcc-4.1/bin/mpirun --prefix /usr/lib64/openmpi/1.0.2-gcc-4.1/lib64/ -np 2 ./a.out
> > [headless:10827] pls:rsh: execv failed with errno=2
> > [headless:10827] ERROR: A daemon on node localhost failed to start as expected.
> > [headless:10827] ERROR: There may be more information available from
> > [headless:10827] ERROR: the remote shell (see above).
> > [headless:10827] ERROR: The daemon exited unexpectedly with status 255.
> >
> > kyron@headless ~ $ cat opmpi64.sh
> > #!/bin/bash
> > MPI_BASE='/usr/lib64/openmpi/1.0.2-gcc-4.1'
> > export PATH=$PATH:${MPI_BASE}/bin
> > LD_LIBRARY_PATH=${MPI_BASE}/lib64
> >
> > kyron@headless ~ $ . opmpi64.sh
> > kyron@headless ~ $ mpirun -np 2 ./a.out
> > ./a.out: error while loading shared libraries: libmpi.so.0: cannot open shared object file: No such file or directory
> > kyron@headless ~ $
> >
> > Eric
> >
> > On Friday, 16 June 2006 at 10:31, Pak Lui wrote:
> > > Hi, I noticed your prefix is set to the lib dir. Can you try without
> > > the lib64 part and rerun?
> > >
> > > Eric Thibodeau wrote:
> > > > Hello everyone,
> > > >
> > > > Well, first off, I hope the problem I am reporting is of some
> > > > validity. I tried finding similar situations on Google and the
> > > > mailing list but came up with only one reference [1], which seems
> > > > invalid in my case since all executions are local (a naïve
> > > > assumption that it makes a difference on the calling stack). I am
> > > > trying to run a simple HelloWorld using OpenMPI 1.0.2 on an AMD64
> > > > machine and a Sun Enterprise (12 procs) machine. In both cases I
> > > > get the following error:
> > > >
> > > > pls:rsh: execv failed with errno=2
> > > >
> > > > Here is the mpirun -d trace when running my HelloWorld (on AMD64):
> > > >
> > > > kyron@headless ~ $ mpirun -d --prefix /usr/lib64/openmpi/1.0.2-gcc-4.1/lib64/ -np 4 ./hello
> > > > [headless:10461] procdir: (null)
> > > > [headless:10461] jobdir: (null)
> > > > [headless:10461] unidir: /tmp/openmpi-sessions-kyron@headless_0/default-universe
> > > > [headless:10461] top: openmpi-sessions-kyron@headless_0
> > > > [headless:10461] tmp: /tmp
> > > > [headless:10461] [0,0,0] setting up session dir with
> > > > [headless:10461] tmpdir /tmp
> > > > [headless:10461] universe default-universe-10461
> > > > [headless:10461] user kyron
> > > > [headless:10461] host headless
> > > > [headless:10461] jobid 0
> > > > [headless:10461] procid 0
> > > > [headless:10461] procdir: /tmp/openmpi-sessions-kyron@headless_0/default-universe-10461/0/0
> > > > [headless:10461] jobdir: /tmp/openmpi-sessions-kyron@headless_0/default-universe-10461/0
> > > > [headless:10461] unidir: /tmp/openmpi-sessions-kyron@headless_0/default-universe-10461
> > > > [headless:10461] top: openmpi-sessions-kyron@headless_0
> > > > [headless:10461] tmp: /tmp
> > > > [headless:10461] [0,0,0] contact_file /tmp/openmpi-sessions-kyron@headless_0/default-universe-10461/universe-setup.txt
> > > > [headless:10461] [0,0,0] wrote setup file
> > > > [headless:10461] spawn: in job_state_callback(jobid = 1, state = 0x1)
> > > > [headless:10461] pls:rsh: local csh: 0, local bash: 1
> > > > [headless:10461] pls:rsh: assuming same remote shell as local shell
> > > > [headless:10461] pls:rsh: remote csh: 0, remote bash: 1
> > > > [headless:10461] pls:rsh: final template argv:
> > > > [headless:10461] pls:rsh: /usr/bin/ssh <template> orted --debug --bootproxy 1 --name <template> --num_procs 2 --vpid_start 0 --nodename <template> --universe kyron@headless:default-universe-10461 --nsreplica "0.0.0;tcp://142.137.135.124:37657;tcp://192.168.1.1:37657" --gprreplica "0.0.0;tcp://142.137.135.124:37657;tcp://192.168.1.1:37657" --mpi-call-yield 0
> > > > [headless:10461] pls:rsh: launching on node localhost
> > > > [headless:10461] pls:rsh: oversubscribed -- setting mpi_yield_when_idle to 1 (1 4)
> > > > [headless:10461] pls:rsh: localhost is a LOCAL node
> > > > [headless:10461] pls:rsh: reset PATH: /usr/lib64/openmpi/1.0.2-gcc-4.1/lib64/bin:/usr/local/bin:/usr/bin:/bin:/usr/x86_64-pc-linux-gnu/gcc-bin/4.1.1:/opt/c3-4:/usr/qt/3/bin:/usr/lib64/openmpi/1.0.2-gcc-4.1/bin
> > > > [headless:10461] pls:rsh: reset LD_LIBRARY_PATH: /usr/lib64/openmpi/1.0.2-gcc-4.1/lib64/lib
> > > > [headless:10461] pls:rsh: changing to directory /home/kyron
> > > > [headless:10461] pls:rsh: executing: orted --debug --bootproxy 1 --name 0.0.1 --num_procs 2 --vpid_start 0 --nodename localhost --universe kyron@headless:default-universe-10461 --nsreplica "0.0.0;tcp://142.137.135.124:37657;tcp://192.168.1.1:37657" --gprreplica "0.0.0;tcp://142.137.135.124:37657;tcp://192.168.1.1:37657" --mpi-call-yield 1
> > > > [headless:10461] pls:rsh: execv failed with errno=2
> > > > [headless:10461] sess_dir_finalize: proc session dir not empty - leaving
> > > > [headless:10461] spawn: in job_state_callback(jobid = 1, state = 0xa)
> > > > [headless:10461] sess_dir_finalize: proc session dir not empty - leaving
> > > > [headless:10461] sess_dir_finalize: proc session dir not empty - leaving
> > > > [headless:10461] sess_dir_finalize: proc session dir not empty - leaving
> > > > [headless:10461] spawn: in job_state_callback(jobid = 1, state = 0x9)
> > > > [headless:10461] ERROR: A daemon on node localhost failed to start as expected.
> > > > [headless:10461] ERROR: There may be more information available from
> > > > [headless:10461] ERROR: the remote shell (see above).
> > > > [headless:10461] ERROR: The daemon exited unexpectedly with status 255.
> > > > [headless:10461] sess_dir_finalize: found proc session dir empty - deleting
> > > > [headless:10461] sess_dir_finalize: found job session dir empty - deleting
> > > > [headless:10461] sess_dir_finalize: found univ session dir empty - deleting
> > > > [headless:10461] sess_dir_finalize: top session dir not empty - leaving
> > > >
> > > > The two platforms are very different: one is AMD64 (dual Opteron)
> > > > with GCC 4.1.1 (Gentoo), the other is SUN OS 5.8 with GCC 3.4.2.
> > > > OpenMPI was compiled with the following options (extracted from
> > > > config.status):
> > > >
> > > > AMD64:
> > > >
> > > > Open MPI config.status 1.0.2
> > > > configured by ./configure, generated by GNU Autoconf 2.59,
> > > > with options \"'--prefix=/usr' '--host=x86_64-pc-linux-gnu'
> > > > '--mandir=/usr/share/man' '--infodir=/usr/share/info'
> > > > '--datadir=/usr/share' '--sysconfdir=/etc' '--localstatedir=/var/lib'
> > > > '--prefix=/usr/lib64/openmpi/1.0.2-gcc-4.1'
> > > > '--datadir=/usr/share/openmpi/1.0.2-gcc-4.1'
> > > > '--program-suffix=-1.0.2-gcc-4.1'
> > > > '--sysconfdir=/etc/openmpi/1.0.2-gcc-4.1'
> > > > '--enable-pretty-print-stacktrace'
> > > > '--libdir=/usr/lib64/openmpi/1.0.2-gcc-4.1/lib64'
> > > > '--build=x86_64-pc-linux-gnu' '--cache-file' 'config.cache'
> > > > 'CFLAGS=-march=opteron -O2 -pipe -ftracer -fprefetch-loop-arrays
> > > > -mfpmath=sse -ffast-math -ftree-vectorize -floop-optimize2'
> > > > 'CXXFLAGS=-march=opteron -O2 -pipe -ftracer -fprefetch-loop-arrays
> > > > -mfpmath=sse -ffast-math -ftree-vectorize -floop-optimize2' 'LDFLAGS=
> > > > -Wl,-z,-noexecstack' 'build_alias=x86_64-pc-linux-gnu'
> > > > 'host_alias=x86_64-pc-linux-gnu' --enable-ltdl-convenience\"
> > > >
> > > > SUN 5.8:
> > > >
> > > > Open MPI config.status 1.0.2
> > > > configured by ./configure, generated by GNU Autoconf 2.59,
> > > > with options
> > > > \"'--prefix=/export/lca/home/lca0/etudiants/ac38820/openmpi'
> > > > '--enable-pretty-print-stacktrace' 'CFLAGS=-mv8plus' 'CXXFLAGS=-mv8plus'
> > > > --enable-ltdl-convenience\"
> > > >
> > > > x86 (as a working reference; the configure options should be close
> > > > to identical to those for the AMD64):
> > > >
> > > > Open MPI config.status 1.0.2
> > > > configured by ./configure, generated by GNU Autoconf 2.59,
> > > > with options \"'--prefix=/usr' '--host=i686-pc-linux-gnu'
> > > > '--mandir=/usr/share/man' '--infodir=/usr/share/info'
> > > > '--datadir=/usr/share' '--sysconfdir=/etc' '--localstatedir=/var/lib'
> > > > '--prefix=/usr/lib/openmpi/1.0.2-gcc-4.1'
> > > > '--datadir=/usr/share/openmpi/1.0.2-gcc-4.1'
> > > > '--program-suffix=-1.0.2-gcc-4.1'
> > > > '--sysconfdir=/etc/openmpi/1.0.2-gcc-4.1'
> > > > '--enable-pretty-print-stacktrace' '--build=i686-pc-linux-gnu'
> > > > '--cache-file' 'config.cache' 'CFLAGS=-march=nocona -O2 -pipe
> > > > -fomit-frame-pointer' 'CXXFLAGS=-march=nocona -O2 -pipe
> > > > -fomit-frame-pointer' 'LDFLAGS= -Wl,-z,-noexecstack'
> > > > 'build_alias=i686-pc-linux-gnu' 'host_alias=i686-pc-linux-gnu'
> > > > --enable-ltdl-convenience\"
> > > >
> > > > Any help would be greatly appreciated.
> > > >
> > > > Thanks.
> > > >
> > > > [1] http://gridengine.sunsource.net/servlets/ReadMsg?list=users&msgNo=15775
> > > >
> > > > --
> > > > Eric Thibodeau
> > > > Neural Bucket Solutions Inc.
> > > > T. (514) 736-1436
> > > > C. (514) 710-0517
> > > >
> > > > ------------------------------------------------------------------------
> > > > _______________________________________________
> > > > users mailing list
> > > > us...@open-mpi.org
> > > > http://www.open-mpi.org/mailman/listinfo.cgi/users
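
A note on the two failures quoted above, offered as a sketch and not a confirmed diagnosis. The "reset PATH" and "reset LD_LIBRARY_PATH" lines in the -d trace suggest that the value given to --prefix simply gets /bin and /lib appended to it. The snippet below reproduces that derivation for the two prefixes tried in this thread. The helper names derive_bin_dir and derive_lib_dir are hypothetical, and the rule itself is inferred from the trace only, not taken from the Open MPI source.

```shell
#!/bin/bash
# Sketch only: mimic how the -d trace appears to build search paths from
# --prefix. Assumption (inferred from the trace, not from Open MPI code):
# the launcher appends /bin and /lib to whatever prefix it is given.

derive_bin_dir() {
    # Drop one trailing slash, if present, then append /bin.
    local prefix="${1%/}"
    printf '%s/bin\n' "$prefix"
}

derive_lib_dir() {
    # Drop one trailing slash, if present, then append /lib.
    local prefix="${1%/}"
    printf '%s/lib\n' "$prefix"
}

# Install prefix reported in the AMD64 config.status quoted in this thread.
MPI_BASE='/usr/lib64/openmpi/1.0.2-gcc-4.1'

# The two --prefix values tried in the thread.
for prefix in "$MPI_BASE/" "$MPI_BASE/lib64/"; do
    echo "--prefix $prefix"
    echo "  PATH entry:            $(derive_bin_dir "$prefix")"
    echo "  LD_LIBRARY_PATH entry: $(derive_lib_dir "$prefix")"
done
```

If that reading is right, neither value matches this build: config.status shows the libraries installed under the prefix's lib64 directory (via --libdir), while the derived LD_LIBRARY_PATH entry ends in lib, which would be consistent with the libmpi.so.0 error for the first prefix. For the second prefix the derived bin directory sits under lib64, and errno=2 is ENOENT, so the launcher did not find the orted it tried to execute; the --program-suffix option is another candidate explanation worth checking. Separately, opmpi64.sh assigns LD_LIBRARY_PATH without export, so unless the variable was already exported in the calling shell, mpirun and a.out would not inherit it.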