2009/1/3 Maciej Kazulak <kazul...@gmail.com>
> Hi,
>
> I have a weird problem. After a fresh install mpirun refuses to work:
>
> box% ./hello
> Process 0 on box out of 1
> box% mpirun -np 1 ./hello
> # hangs here, no output, nothing at all; on another terminal:
> box% ps axl | egrep 'mpirun|orted'
> 0 1000 24162  7687 20 0 86704 2744 -      Sl+ pts/2 0:00 mpirun -np 1 ./hello
> 1 1000 24165     1 20 0 76016 2088 -      Ss  ?     0:00 orted --bootproxy 1 --name 0.0.1 --num_procs 2 --vpid_start 0 --nodename box --universe ncl@box:default-universe-24162 --nsreplica "0.0.0;tcp://192.168.1.8:21500" --gprreplica "0.0.0;tcp://192.168.1.8:21500" --set-sid
> 0 1000 24171 23924 20 0  6020  732 pipe_w S+  pts/3 0:00 egrep mpirun|orted
>
> Is there some post-install configuration I forgot to do? I couldn't find
> anything useful in the FAQ nor the docs that come with the package.
> Following the advice in this thread
> http://www.open-mpi.org/community/lists/users/2007/08/3845.php I tried
> --debug-daemons, but no output whatsoever, as above.
> Also tried MTT:
>
> box% cat developer.ini trivial.ini | ../client/mtt -
> alreadyinstalled_dir=/usr
> ompi:version:full:1.2.8
> *** WARNING: Test: cxx_hello, np=2, variant=1: TIMED OUT (failed)
> *** WARNING: Test: c_ring, np=2, variant=1: TIMED OUT (failed)
> *** WARNING: Test: cxx_ring, np=2, variant=1: TIMED OUT (failed)
> *** WARNING: Test: c_hello, np=2, variant=1: TIMED OUT (failed)
> *** WARNING: Test: c_ring, np=2, variant=1: TIMED OUT (failed)
> *** WARNING: Test: c_hello, np=2, variant=1: TIMED OUT (failed)
>
> MTT Results Summary
> hostname: box
> uname: Linux box 2.6.28-gentoo #2 SMP Thu Jan 1 15:27:59 CET 2009 x86_64 Intel(R) Core(TM)2 Duo CPU E6550 @ 2.33GHz GenuineIntel GNU/Linux
> who am i:
> +-------------+-----------------+------+------+----------+------+
> | Phase       | Section         | Pass | Fail | Time out | Skip |
> +-------------+-----------------+------+------+----------+------+
> | MPI install | my installation |    1 |    0 |        0 |    0 |
> | MPI install | my installation |    1 |    0 |        0 |    0 |
> | Test Build  | trivial         |    1 |    0 |        0 |    0 |
> | Test Build  | trivial         |    1 |    0 |        0 |    0 |
> | Test Run    | trivial         |    0 |    0 |        4 |    0 |
> | Test Run    | trivial         |    0 |    0 |        2 |    0 |
> +-------------+-----------------+------+------+----------+------+
>
> box% ompi_info
> Open MPI: 1.2.8
> Open MPI SVN revision: r19718
> Open RTE: 1.2.8
> Open RTE SVN revision: r19718
> OPAL: 1.2.8
> OPAL SVN revision: r19718
> Prefix: /usr
> Configured architecture: x86_64-pc-linux-gnu
> Configured by: root
> Configured on: Sat Jan 3 01:03:53 CET 2009
> Configure host: box
> Built by: root
> Built on: sob, 3 sty 2009, 01:06:54 CET
> Built host: box
> C bindings: yes
> C++ bindings: yes
> Fortran77 bindings: no
> Fortran90 bindings: no
> Fortran90 bindings size: na
> C compiler: x86_64-pc-linux-gnu-gcc
> C compiler absolute: /usr/bin/x86_64-pc-linux-gnu-gcc
> C++ compiler: x86_64-pc-linux-gnu-g++
> C++ compiler absolute: /usr/bin/x86_64-pc-linux-gnu-g++
> Fortran77 compiler: x86_64-pc-linux-gnu-gfortran
> Fortran77 compiler abs: /usr/bin/x86_64-pc-linux-gnu-gfortran
> Fortran90 compiler: none
> Fortran90 compiler abs: none
> C profiling: yes
> C++ profiling: yes
> Fortran77 profiling: no
> Fortran90 profiling: no
> C++ exceptions: no
> Thread support: posix (mpi: no, progress: no)
> Internal debug support: no
> MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
> libltdl support: yes
> Heterogeneous support: yes
> mpirun default --prefix: yes
> MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.2.8)
> MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2.8)
> MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2.8)
> MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2.8)
> MCA timer: linux (MCA v1.0, API v1.0, Component v1.2.8)
> MCA installdirs: env (MCA v1.0, API v1.0, Component v1.2.8)
> MCA installdirs: config (MCA v1.0, API v1.0, Component v1.2.8)
> MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
> MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
> MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.8)
> MCA coll: self (MCA v1.0, API v1.0, Component v1.2.8)
> MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.8)
> MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.8)
> MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.2.8)
> MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.8)
> MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.8)
> MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.8)
> MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.8)
> MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.8)
> MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2.8)
> MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.8)
> MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
> MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.8)
> MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.8)
> MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.8)
> MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.8)
> MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.8)
> MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.8)
> MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.8)
> MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2.8)
> MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.8)
> MCA iof: svc (MCA v1.0, API v1.0, Component v1.2.8)
> MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.8)
> MCA ns: replica (MCA v1.0, API v2.0, Component v1.2.8)
> MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
> MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2.8)
> MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.8)
> MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2.8)
> MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2.8)
> MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.8)
> MCA rds: resfile (MCA v1.0, API v1.3, Component v1.2.8)
> MCA rmaps: round_robin (MCA v1.0, API v1.3, Component v1.2.8)
> MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2.8)
> MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2.8)
> MCA rml: oob (MCA v1.0, API v1.0, Component v1.2.8)
> MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2.8)
> MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2.8)
> MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2.8)
> MCA sds: env (MCA v1.0, API v1.0, Component v1.2.8)
> MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2.8)
> MCA sds: seed (MCA v1.0, API v1.0, Component v1.2.8)
> MCA sds: singleton (MCA v1.0, API v1.0, Component v1.2.8)
>
> I tried 1.2.6-r1 earlier with the same results, which only leads me to the
> assumption that I must be doing something wrong, but I'm out of ideas for
> now. Anyone?
Never mind. Interesting, though. I thought that in such a simple scenario shared memory (or whatever transport is fastest) would be used for IPC. But no: even with a single process it still wants to use TCP/IP to communicate between mpirun and orted. What's even more surprising to me is that it won't use loopback for that. Hence my perhaps slightly over-restrictive iptables rules were the problem: I allowed traffic from 127.0.0.1 to 127.0.0.1 on lo, but not from <eth0_addr> to <eth0_addr> on eth0, and both processes ended up waiting on I/O. Can I somehow configure it to use something other than TCP/IP here? Or at least switch it to loopback?
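
For anyone else who hits this: something along these lines should be enough to let the connection through. It is just a sketch, assuming eth0 carries 192.168.1.8 as in the ps output above; adjust for your own address and chain setup.

  # allow the host to talk to its own eth0 address; that is the path the
  # mpirun <-> orted TCP connection uses (note the tcp://192.168.1.8:21500
  # URI in the orted command line above)
  iptables -A INPUT  -s 192.168.1.8 -d 192.168.1.8 -j ACCEPT
  iptables -A OUTPUT -s 192.168.1.8 -d 192.168.1.8 -j ACCEPT

I deliberately left out any -i/-o interface match, so the rules still apply if the kernel actually carries host-to-own-address traffic over lo rather than eth0.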
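As for my own question about avoiding TCP: a sketch of what I intend to try next. The btl selection only affects MPI point-to-point traffic, and the oob_tcp_include parameter name is an assumption that needs to be checked against ompi_info first.

  # list the OOB TCP parameters this build actually supports
  ompi_info --param oob tcp

  # keep MPI traffic on shared memory only; the mpirun <-> orted control
  # channel still goes over the TCP oob, which is the only oob component
  # listed by ompi_info above
  mpirun --mca btl sm,self -np 1 ./hello

  # if ompi_info lists an interface include parameter for oob tcp
  # (oob_tcp_include here is a guess), this may pin the control channel
  # to loopback
  mpirun --mca oob_tcp_include lo -np 1 ./hello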