2009/1/3 Maciej Kazulak <kazul...@gmail.com>

> Hi,
>
> I have a weird problem. After a fresh install mpirun refuses to work:
>
> box% ./hello
> Process 0 on box out of 1
> box% mpirun -np 1 ./hello
> # hangs here, no output, nothing at all; on another terminal:
> box% ps axl | egrep 'mpirun|orted'
> 0  1000 24162  7687  20   0  86704  2744 -      Sl+  pts/2      0:00 mpirun
> -np 1 ./hello
> 1  1000 24165     1  20   0  76016  2088 -      Ss   ?          0:00 orted
> --bootproxy 1 --name 0.0.1 --num_procs 2 --vpid_start 0 --nodename box
> --universe ncl@box:default-universe-24162 --nsreplica "0.0.0;tcp://
> 192.168.1.8:21500" --gprreplica "0.0.0;tcp://192.168.1.8:21500" --set-sid
> 0  1000 24171 23924  20   0   6020   732 pipe_w S+   pts/3      0:00 egrep
> mpirun|orted
>
> Is there some post-install configuration I forgot to do? I couldn't find
> anything useful in the FAQ or the docs that come with the package.
> Following the advice in this thread
> http://www.open-mpi.org/community/lists/users/2007/08/3845.php I tried
> --debug-daemons, but there was no output whatsoever, as above.
> Also tried MTT:
>
> box% cat developer.ini trivial.ini| ../client/mtt -
> alreadyinstalled_dir=/usr
> ompi:version:full:1.2.8
> *** WARNING: Test: cxx_hello, np=2, variant=1: TIMED OUT (failed)
> *** WARNING: Test: c_ring, np=2, variant=1: TIMED OUT (failed)
> *** WARNING: Test: cxx_ring, np=2, variant=1: TIMED OUT (failed)
> *** WARNING: Test: c_hello, np=2, variant=1: TIMED OUT (failed)
> *** WARNING: Test: c_ring, np=2, variant=1: TIMED OUT (failed)
> *** WARNING: Test: c_hello, np=2, variant=1: TIMED OUT (failed)
>
> MTT Results Summary
> hostname: box
> uname: Linux box 2.6.28-gentoo #2 SMP Thu Jan 1 15:27:59 CET 2009 x86_64
> Intel(R) Core(TM)2 Duo CPU E6550 @ 2.33GHz GenuineIntel GNU/Linux
> who am i:
> +-------------+-----------------+------+------+----------+------+
> | Phase       | Section         | Pass | Fail | Time out | Skip |
> +-------------+-----------------+------+------+----------+------+
> | MPI install | my installation | 1    | 0    | 0        | 0    |
> | MPI install | my installation | 1    | 0    | 0        | 0    |
> | Test Build  | trivial         | 1    | 0    | 0        | 0    |
> | Test Build  | trivial         | 1    | 0    | 0        | 0    |
> | Test Run    | trivial         | 0    | 0    | 4        | 0    |
> | Test Run    | trivial         | 0    | 0    | 2        | 0    |
> +-------------+-----------------+------+------+----------+------+
>
> box% ompi_info
>                 Open MPI: 1.2.8
>    Open MPI SVN revision: r19718
>                 Open RTE: 1.2.8
>    Open RTE SVN revision: r19718
>                     OPAL: 1.2.8
>        OPAL SVN revision: r19718
>                   Prefix: /usr
>  Configured architecture: x86_64-pc-linux-gnu
>            Configured by: root
>            Configured on: Sat Jan  3 01:03:53 CET 2009
>           Configure host: box
>                 Built by: root
>                 Built on: Sat, 3 Jan 2009, 01:06:54 CET
>               Built host: box
>               C bindings: yes
>             C++ bindings: yes
>       Fortran77 bindings: no
>       Fortran90 bindings: no
>  Fortran90 bindings size: na
>               C compiler: x86_64-pc-linux-gnu-gcc
>      C compiler absolute: /usr/bin/x86_64-pc-linux-gnu-gcc
>             C++ compiler: x86_64-pc-linux-gnu-g++
>    C++ compiler absolute: /usr/bin/x86_64-pc-linux-gnu-g++
>       Fortran77 compiler: x86_64-pc-linux-gnu-gfortran
>   Fortran77 compiler abs: /usr/bin/x86_64-pc-linux-gnu-gfortran
>       Fortran90 compiler: none
>   Fortran90 compiler abs: none
>              C profiling: yes
>            C++ profiling: yes
>      Fortran77 profiling: no
>      Fortran90 profiling: no
>           C++ exceptions: no
>           Thread support: posix (mpi: no, progress: no)
>   Internal debug support: no
>      MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
>          libltdl support: yes
>    Heterogeneous support: yes
>  mpirun default --prefix: yes
>            MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.2.8)
>               MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2.8)
>            MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2.8)
>            MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2.8)
>                MCA timer: linux (MCA v1.0, API v1.0, Component v1.2.8)
>          MCA installdirs: env (MCA v1.0, API v1.0, Component v1.2.8)
>          MCA installdirs: config (MCA v1.0, API v1.0, Component v1.2.8)
>            MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
>            MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
>                 MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.8)
>                 MCA coll: self (MCA v1.0, API v1.0, Component v1.2.8)
>                 MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.8)
>                 MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.8)
>                MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.2.8)
>                MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.8)
>                  MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.8)
>                  MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.8)
>                  MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.8)
>               MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.8)
>                  MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2.8)
>               MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.8)
>               MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.8)
>               MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.8)
>                  MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.8)
>                  MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.8)
>                  MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2.8)
>                  MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.8)
>                  MCA iof: svc (MCA v1.0, API v1.0, Component v1.2.8)
>                   MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.8)
>                   MCA ns: replica (MCA v1.0, API v2.0, Component v1.2.8)
>                  MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
>                  MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2.8)
>                  MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.8)
>                  MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2.8)
>                  MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2.8)
>                  MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.8)
>                  MCA rds: resfile (MCA v1.0, API v1.3, Component v1.2.8)
>                MCA rmaps: round_robin (MCA v1.0, API v1.3, Component v1.2.8)
>                 MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2.8)
>                 MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2.8)
>                  MCA rml: oob (MCA v1.0, API v1.0, Component v1.2.8)
>                  MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2.8)
>                  MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2.8)
>                  MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2.8)
>                  MCA sds: env (MCA v1.0, API v1.0, Component v1.2.8)
>                  MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2.8)
>                  MCA sds: seed (MCA v1.0, API v1.0, Component v1.2.8)
>                  MCA sds: singleton (MCA v1.0, API v1.0, Component v1.2.8)
>                  MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.8)
>                  MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
>                 MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.8)
>                  MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.8)
>
> I tried 1.2.6-r1 earlier with the same results, which only leads me to the
> assumption that I must be doing something wrong, but I'm out of ideas for now.
> Anyone?
>

Never mind.

Interesting, though. I thought that in such a simple scenario shared memory
(or whatever's fastest) would be used for IPC. But no: even with a single
process, it still uses TCP/IP to communicate between mpirun and orted. What's
even more surprising to me is that it won't use loopback for that. Hence my
perhaps over-restrictive iptables rules were the problem: I allowed traffic
from 127.0.0.1 to 127.0.0.1 on lo, but not from <eth0_addr> to <eth0_addr> on
eth0, so both processes ended up waiting for I/O.
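
For the record: since locally-destined packets addressed to the machine's own
eth0 address still arrive on the lo interface, either accepting everything on
lo (regardless of addresses) or explicitly allowing the box to talk to its own
eth0 address should avoid the hang. Roughly something like this, using the
192.168.1.8 address that shows up in the orted command line above:

  # accept all loopback traffic, whatever the source/destination addresses
  iptables -A INPUT -i lo -j ACCEPT
  # or, more narrowly, let the box talk to its own eth0 address
  iptables -A INPUT -s 192.168.1.8 -d 192.168.1.8 -j ACCEPT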

Can I somehow configure it to use something other than TCP/IP here? Or at
least switch it to loopback?
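
(I'm guessing something along the lines of

  mpirun --mca oob_tcp_include lo --mca btl_tcp_if_include lo -np 1 ./hello

might restrict both the runtime and the MPI traffic to the loopback interface,
but I haven't verified those parameter names; "ompi_info --param oob tcp" and
"ompi_info --param btl tcp" should list what's actually available in 1.2.8.)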
