Weird - it works fine for me:

sjc-vpn5-109:mpi rhc$ mpirun -n 3 ./abort
Hello, World, I am 1 of 3
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD
with errorcode 2.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun has exited due to process rank 1 with PID 22980 on
node sjc-vpn5-109.cisco.com exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
Hello, World, I am 0 of 3
Hello, World, I am 2 of 3
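
For reference, the test I ran is essentially the following - just a minimal
sketch reconstructed from the output above, not necessarily the exact source:
each rank prints a hello line and rank 1 then calls MPI_Abort() with
errorcode 2.

#include <mpi.h>
#include <stdio.h>

/* abort.c - minimal MPI_Abort test (sketch; assumed layout) */
int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    printf("Hello, World, I am %d of %d\n", rank, size);

    if (1 == rank) {
        MPI_Abort(MPI_COMM_WORLD, 2);   /* errorcode 2, as shown above */
    }

    MPI_Finalize();
    return 0;
}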

I built it with gcc 4.2.1, though. We have a known problem with the shared
memory (sm) transport hanging when Open MPI is built with gcc 4.4.x, so I
wonder whether the issue here is your use of gcc 4.5.

Can you try running this again with -mca btl ^sm?
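
That is, something along the lines of (substituting your actual process
count and executable name):

  mpiexec -mca btl ^sm -n 2 ./your_program

The "^sm" excludes the shared memory BTL, so on-node traffic falls back to
the other available transports (tcp/self); if the hang goes away, that
points at the sm/gcc interaction.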


On Wed, Jun 2, 2010 at 3:49 AM, Yves Caniou <yves.can...@ens-lyon.fr> wrote:

> Dear All,
>
> As already mentioned on this mailing list, I found that a simple Hello_world
> program does not necessarily end: the program just hangs after
> MPI_Finalize(). Printing MPI_FINALIZED confirms that the MPI part of the
> code has finished, but the exit() or return() never completes.
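>
> (The check is essentially this - a minimal sketch, not my exact code:
>
>   int flag = 0;
>   MPI_Finalized(&flag);   /* true once MPI_Finalize() has completed */
>   printf("MPI_FINALIZED = %d\n", flag);
>
> so MPI itself reports that it has finished.)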
>
> So I tried using MPI_Abort() and observed two different behaviors
> (a description of the architecture is given below): either the run ends
> with a segfault, or the application never returns to the shell, even
> though the string "MPI_ABORT was [...] here)." appears on screen (the
> program just hangs, as with MPI_Finalize()).
>
> This is annoying since I need several executions in a single batch script,
> because separate submissions cost a lot of time in the queues. So if you
> have any tips to bypass the hanging of the application, I will take them
> (even if it means recompiling Open MPI with specific options, of course).
>
> Thank you!
>
> .Yves.
>
> Here is an example of the output produced on screen. Note that errorcode is
> the rank of the process which called MPI_Abort().
>
> ############################################
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> with errorcode 0.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpiexec has exited due to process rank 0 with PID 18062 on
> node ha8000-1 exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpiexec (as reported here).
> --------------------------------------------------------------------------
> [ha8000-1:18060] *** Process received signal ***
> [ha8000-1:18060] Signal: Segmentation fault (11)
> [ha8000-1:18060] Signal code: Address not mapped (1)
> [ha8000-1:18060] Failing at address: 0x2aaaac1bd940
> Segmentation fault
> ############################################
>
> The architecture is a Quad-Core AMD Opteron(tm) Processor 8356 with a
> MYRICOM Inc. Myri-10G Dual-Protocol NIC (10G-PCIE-8A) Ethernet controller;
> the version of OMPI is 1.4.2 and it has been compiled with GCC-4.5.
> $>ompi_info
>                 Package: Open MPI p10015@ha8000-1 Distribution
>                Open MPI: 1.4.2
>   Open MPI SVN revision: r23093
>   Open MPI release date: May 04, 2010
>                Open RTE: 1.4.2
>   Open RTE SVN revision: r23093
>   Open RTE release date: May 04, 2010
>                    OPAL: 1.4.2
>       OPAL SVN revision: r23093
>       OPAL release date: May 04, 2010
>            Ident string: 1.4.2
>                  Prefix: /home/p10015/openmpi
>  Configured architecture: x86_64-unknown-linux-gnu
>          Configure host: ha8000-1
>           Configured by: p10015
>           Configured on: Wed May 19 19:01:19 JST 2010
>          Configure host: ha8000-1
>                Built by: p10015
>                Built on: Wed May 19 21:03:33 JST 2010
>              Built host: ha8000-1
>              C bindings: yes
>            C++ bindings: yes
>      Fortran77 bindings: yes (all)
>      Fortran90 bindings: yes
>  Fortran90 bindings size: small
>              C compiler: /home/p10015/gcc/bin/x86_64-unknown-linux-gnu-gcc-4.5.0
>     C compiler absolute:
>            C++ compiler: /home/p10015/gcc/bin/x86_64-unknown-linux-gnu-g++
>   C++ compiler absolute:
>      Fortran77 compiler: gfortran
>  Fortran77 compiler abs: /usr/bin/gfortran
>      Fortran90 compiler: gfortran
>  Fortran90 compiler abs: /usr/bin/gfortran
>             C profiling: yes
>           C++ profiling: yes
>     Fortran77 profiling: yes
>     Fortran90 profiling: yes
>          C++ exceptions: no
>          Thread support: posix (mpi: yes, progress: yes)
>           Sparse Groups: yes
>  Internal debug support: no
>     MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
>         libltdl support: yes
>   Heterogeneous support: yes
>  mpirun default --prefix: yes
>         MPI I/O support: yes
>       MPI_WTIME support: gettimeofday
> Symbol visibility support: yes
>   FT Checkpoint support: no  (checkpoint thread: no)
>           MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.4.2)
>              MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.4.2)
>           MCA paffinity: linux (MCA v2.0, API v2.0, Component v1.4.2)
>               MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.4.2)
>               MCA carto: file (MCA v2.0, API v2.0, Component v1.4.2)
>           MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.4.2)
>           MCA maffinity: libnuma (MCA v2.0, API v2.0, Component v1.4.2)
>               MCA timer: linux (MCA v2.0, API v2.0, Component v1.4.2)
>         MCA installdirs: env (MCA v2.0, API v2.0, Component v1.4.2)
>         MCA installdirs: config (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA dpm: orte (MCA v2.0, API v2.0, Component v1.4.2)
>              MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.4.2)
>           MCA allocator: basic (MCA v2.0, API v2.0, Component v1.4.2)
>           MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.4.2)
>                MCA coll: basic (MCA v2.0, API v2.0, Component v1.4.2)
>                MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.4.2)
>                MCA coll: inter (MCA v2.0, API v2.0, Component v1.4.2)
>                MCA coll: self (MCA v2.0, API v2.0, Component v1.4.2)
>                MCA coll: sm (MCA v2.0, API v2.0, Component v1.4.2)
>                MCA coll: sync (MCA v2.0, API v2.0, Component v1.4.2)
>                MCA coll: tuned (MCA v2.0, API v2.0, Component v1.4.2)
>                  MCA io: romio (MCA v2.0, API v2.0, Component v1.4.2)
>               MCA mpool: fake (MCA v2.0, API v2.0, Component v1.4.2)
>               MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.4.2)
>               MCA mpool: sm (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA pml: cm (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA pml: csum (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA pml: v (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA bml: r2 (MCA v2.0, API v2.0, Component v1.4.2)
>              MCA rcache: vma (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA btl: self (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA btl: sm (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA btl: tcp (MCA v2.0, API v2.0, Component v1.4.2)
>                MCA topo: unity (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA osc: rdma (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA iof: hnp (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA iof: orted (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA iof: tool (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA oob: tcp (MCA v2.0, API v2.0, Component v1.4.2)
>                MCA odls: default (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA ras: slurm (MCA v2.0, API v2.0, Component v1.4.2)
>               MCA rmaps: load_balance (MCA v2.0, API v2.0, Component v1.4.2)
>               MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.4.2)
>               MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.4.2)
>               MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA rml: oob (MCA v2.0, API v2.0, Component v1.4.2)
>              MCA routed: binomial (MCA v2.0, API v2.0, Component v1.4.2)
>              MCA routed: direct (MCA v2.0, API v2.0, Component v1.4.2)
>              MCA routed: linear (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA plm: rsh (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA plm: slurm (MCA v2.0, API v2.0, Component v1.4.2)
>               MCA filem: rsh (MCA v2.0, API v2.0, Component v1.4.2)
>              MCA errmgr: default (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA ess: env (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA ess: hnp (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA ess: singleton (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA ess: slurm (MCA v2.0, API v2.0, Component v1.4.2)
>                 MCA ess: tool (MCA v2.0, API v2.0, Component v1.4.2)
>             MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.4.2)
>             MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.4.2)
>
> --
> Yves Caniou
> Associate Professor at Université Lyon 1,
> Member of the team project INRIA GRAAL in the LIP ENS-Lyon,
> Délégation CNRS in Japan French Laboratory of Informatics (JFLI),
>  * in Information Technology Center, The University of Tokyo,
>    2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-8658, Japan
>    tel: +81-3-5841-0540
>  * in National Institute of Informatics
>    2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan
>    tel: +81-3-4212-2412
> http://graal.ens-lyon.fr/~ycaniou/
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
