Sachin,

I can't replicate your issue with either the latest 1.8 or the trunk.
I tried using a single host, forcing SM and then TCP, to no avail.

Can you try restricting the collective modules in use by adding
--mca coll tuned,basic to your mpirun command?
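
For reference, here is a rough sketch of the runs I have in mind: one with the collective restriction, then one each forcing the sm and tcp BTLs. It assumes the binary is ./a.out with 2 processes, as in your message. The helper names (mpirun_cmd, run_variant), the RUN switch and the 60-second limit are only illustrative, and timeout assumes GNU coreutils is installed. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: NP and BIN mirror the command line from the report;
# the helper names and the RUN switch are hypothetical.
NP=2
BIN=./a.out

# Build an mpirun command line with the given extra MCA arguments.
mpirun_cmd() {
    echo "mpirun -np $NP $* $BIN"
}

# Print one variant; execute it only when RUN=1 is set in the environment.
run_variant() {
    label=$1
    shift
    cmd=$(mpirun_cmd "$@")
    echo "[$label] $cmd"
    if [ "${RUN:-0}" = "1" ]; then
        # The reported symptom is a hang, so bound each run (GNU timeout).
        timeout 60 $cmd || echo "[$label] failed or timed out"
    fi
}

run_variant "coll tuned,basic" --mca coll tuned,basic
run_variant "btl sm,self"      --mca btl sm,self
run_variant "btl tcp,self"     --mca btl tcp,self
```

Running it with RUN=1 executes each variant in turn. If only the sm run hangs, that would point toward shared memory. If all of them hang, the transport is probably not the culprit.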

  George.


On Fri, Feb 20, 2015 at 9:31 PM, Sachin Krishnan <sachk...@gmail.com> wrote:

> Josh,
>
> Thanks for the help.
> I'm running on a single host. How do I confirm that it is an issue with
> the shared memory?
>
> Sachin
>
> On Fri, Feb 20, 2015 at 11:58 PM, Joshua Ladd <jladd.m...@gmail.com>
> wrote:
>
>> Sachin,
>>
>> Are you running this on a single host or across multiple hosts (i.e., are
>> you communicating between processes over the network)? If it's on a single
>> host, then it might be an issue with shared memory.
>>
>> Josh
>>
>> On Fri, Feb 20, 2015 at 1:51 AM, Sachin Krishnan <sachk...@gmail.com>
>> wrote:
>>
>>> Hello Josh,
>>>
>>> The command I use to compile the code is:
>>>
>>> mpicc bcast_loop.c
>>>
>>>
>>> To run the code I use:
>>>
>>> mpirun -np 2 ./a.out
>>>
>>> Output is unpredictable. It gets stuck at different places.
>>>
>>> I'm attaching lstopo and ompi_info outputs. Do you need any other info?
>>>
>>>
>>> lstopo-no-graphics output:
>>>
>>> Machine (3433MB)
>>>   Socket L#0 + L3 L#0 (8192KB)
>>>     L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
>>>       PU L#0 (P#0)
>>>       PU L#1 (P#4)
>>>     L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
>>>       PU L#2 (P#1)
>>>       PU L#3 (P#5)
>>>     L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
>>>       PU L#4 (P#2)
>>>       PU L#5 (P#6)
>>>     L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
>>>       PU L#6 (P#3)
>>>       PU L#7 (P#7)
>>>   HostBridge L#0
>>>     PCI 8086:0162
>>>       GPU L#0 "card0"
>>>       GPU L#1 "renderD128"
>>>       GPU L#2 "controlD64"
>>>     PCI 8086:1502
>>>       Net L#3 "eth0"
>>>     PCI 8086:1e02
>>>       Block L#4 "sda"
>>>       Block L#5 "sr0"
>>>
>>>
>>> ompi_info output:
>>>
>>>
>>>                  Package: Open MPI builduser@anatol Distribution
>>>
>>>                 Open MPI: 1.8.4
>>>
>>>   Open MPI repo revision: v1.8.3-330-g0344f04
>>>
>>>    Open MPI release date: Dec 19, 2014
>>>
>>>                 Open RTE: 1.8.4
>>>
>>>   Open RTE repo revision: v1.8.3-330-g0344f04
>>>
>>>    Open RTE release date: Dec 19, 2014
>>>
>>>                     OPAL: 1.8.4
>>>
>>>       OPAL repo revision: v1.8.3-330-g0344f04
>>>
>>>        OPAL release date: Dec 19, 2014
>>>
>>>                  MPI API: 3.0
>>>
>>>             Ident string: 1.8.4
>>>
>>>                   Prefix: /usr
>>>
>>>  Configured architecture: i686-pc-linux-gnu
>>>
>>>           Configure host: anatol
>>>
>>>            Configured by: builduser
>>>
>>>            Configured on: Sat Dec 20 17:00:34 PST 2014
>>>
>>>           Configure host: anatol
>>>
>>>                 Built by: builduser
>>>
>>>                 Built on: Sat Dec 20 17:12:16 PST 2014
>>>
>>>               Built host: anatol
>>>
>>>               C bindings: yes
>>>
>>>             C++ bindings: yes
>>>
>>>              Fort mpif.h: yes (all)
>>>
>>>             Fort use mpi: yes (full: ignore TKR)
>>>
>>>        Fort use mpi size: deprecated-ompi-info-value
>>>
>>>         Fort use mpi_f08: yes
>>>
>>>  Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
>>>                           limitations in the /usr/bin/gfortran compiler, does
>>>                           not support the following: array subsections,
>>>                           direct passthru (where possible) to underlying Open
>>>                           MPI's C functionality
>>>
>>>   Fort mpi_f08 subarrays: no
>>>
>>>            Java bindings: no
>>>
>>>   Wrapper compiler rpath: runpath
>>>
>>>               C compiler: gcc
>>>
>>>      C compiler absolute: /usr/bin/gcc
>>>
>>>   C compiler family name: GNU
>>>
>>>       C compiler version: 4.9.2
>>>
>>>             C++ compiler: g++
>>>
>>>    C++ compiler absolute: /usr/bin/g++
>>>
>>>            Fort compiler: /usr/bin/gfortran
>>>
>>>        Fort compiler abs:
>>>
>>>          Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
>>>
>>>    Fort 08 assumed shape: yes
>>>
>>>       Fort optional args: yes
>>>
>>>           Fort INTERFACE: yes
>>>
>>>     Fort ISO_FORTRAN_ENV: yes
>>>
>>>        Fort STORAGE_SIZE: yes
>>>
>>>       Fort BIND(C) (all): yes
>>>
>>>       Fort ISO_C_BINDING: yes
>>>
>>>  Fort SUBROUTINE BIND(C): yes
>>>
>>>        Fort TYPE,BIND(C): yes
>>>
>>>  Fort T,BIND(C,name="a"): yes
>>>
>>>             Fort PRIVATE: yes
>>>
>>>           Fort PROTECTED: yes
>>>
>>>            Fort ABSTRACT: yes
>>>
>>>        Fort ASYNCHRONOUS: yes
>>>
>>>           Fort PROCEDURE: yes
>>>
>>>            Fort C_FUNLOC: yes
>>>
>>>  Fort f08 using wrappers: yes
>>>
>>>          Fort MPI_SIZEOF: yes
>>>
>>>              C profiling: yes
>>>
>>>            C++ profiling: yes
>>>
>>>    Fort mpif.h profiling: yes
>>>
>>>   Fort use mpi profiling: yes
>>>
>>>    Fort use mpi_f08 prof: yes
>>>
>>>           C++ exceptions: no
>>>
>>>           Thread support: posix (MPI_THREAD_MULTIPLE: no, OPAL support: yes,
>>>                           OMPI progress: no, ORTE progress: yes, Event lib:
>>>                           yes)
>>>
>>>            Sparse Groups: no
>>>
>>>   Internal debug support: no
>>>
>>>   MPI interface warnings: yes
>>>
>>>      MPI parameter check: runtime
>>>
>>> Memory profiling support: no
>>>
>>> Memory debugging support: no
>>>
>>>          libltdl support: yes
>>>
>>>    Heterogeneous support: no
>>>
>>>  mpirun default --prefix: no
>>>
>>>          MPI I/O support: yes
>>>
>>>        MPI_WTIME support: gettimeofday
>>>
>>>      Symbol vis. support: yes
>>>
>>>    Host topology support: yes
>>>
>>>           MPI extensions:
>>>
>>>    FT Checkpoint support: no (checkpoint thread: no)
>>>
>>>    C/R Enabled Debugging: no
>>>
>>>      VampirTrace support: yes
>>>
>>>   MPI_MAX_PROCESSOR_NAME: 256
>>>
>>>     MPI_MAX_ERROR_STRING: 256
>>>
>>>      MPI_MAX_OBJECT_NAME: 64
>>>
>>>         MPI_MAX_INFO_KEY: 36
>>>
>>>         MPI_MAX_INFO_VAL: 256
>>>
>>>        MPI_MAX_PORT_NAME: 1024
>>>
>>>   MPI_MAX_DATAREP_STRING: 128
>>>
>>>            MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>             MCA compress: bzip (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>             MCA compress: gzip (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA crs: none (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                   MCA db: hash (MCA v2.0, API v1.0, Component v1.8.4)
>>>
>>>                   MCA db: print (MCA v2.0, API v1.0, Component v1.8.4)
>>>
>>>                MCA event: libevent2021 (MCA v2.0, API v2.0, Component
>>> v1.8.4)
>>>
>>>                MCA hwloc: external (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                   MCA if: posix_ipv4 (MCA v2.0, API v2.0, Component
>>> v1.8.4)
>>>
>>>                   MCA if: linux_ipv6 (MCA v2.0, API v2.0, Component
>>> v1.8.4)
>>>
>>>          MCA installdirs: env (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>          MCA installdirs: config (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>           MCA memchecker: valgrind (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>               MCA memory: linux (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                MCA pstat: linux (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA sec: basic (MCA v2.0, API v1.0, Component v1.8.4)
>>>
>>>                MCA shmem: mmap (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                MCA shmem: posix (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                MCA shmem: sysv (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                MCA timer: linux (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA dfs: app (MCA v2.0, API v1.0, Component v1.8.4)
>>>
>>>                  MCA dfs: orted (MCA v2.0, API v1.0, Component v1.8.4)
>>>
>>>                  MCA dfs: test (MCA v2.0, API v1.0, Component v1.8.4)
>>>
>>>               MCA errmgr: default_app (MCA v2.0, API v3.0, Component
>>> v1.8.4)
>>>
>>>               MCA errmgr: default_hnp (MCA v2.0, API v3.0, Component
>>> v1.8.4)
>>>
>>>               MCA errmgr: default_orted (MCA v2.0, API v3.0, Component
>>>
>>>                           v1.8.4)
>>>
>>>               MCA errmgr: default_tool (MCA v2.0, API v3.0, Component
>>> v1.8.4)
>>>
>>>                  MCA ess: env (MCA v2.0, API v3.0, Component v1.8.4)
>>>
>>>                  MCA ess: hnp (MCA v2.0, API v3.0, Component v1.8.4)
>>>
>>>                  MCA ess: singleton (MCA v2.0, API v3.0, Component
>>> v1.8.4)
>>>
>>>                  MCA ess: tool (MCA v2.0, API v3.0, Component v1.8.4)
>>>
>>>                MCA filem: raw (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>              MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA iof: hnp (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA iof: mr_hnp (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA iof: mr_orted (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA iof: orted (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA iof: tool (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                 MCA odls: default (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA oob: tcp (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA plm: isolated (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA plm: rsh (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA ras: loadleveler (MCA v2.0, API v2.0, Component
>>> v1.8.4)
>>>
>>>                  MCA ras: simulator (MCA v2.0, API v2.0, Component
>>> v1.8.4)
>>>
>>>                MCA rmaps: lama (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                MCA rmaps: mindist (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                MCA rmaps: ppr (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                MCA rmaps: rank_file (MCA v2.0, API v2.0, Component
>>> v1.8.4)
>>>
>>>                MCA rmaps: resilient (MCA v2.0, API v2.0, Component
>>> v1.8.4)
>>>
>>>                MCA rmaps: round_robin (MCA v2.0, API v2.0, Component
>>> v1.8.4)
>>>
>>>                MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                MCA rmaps: staged (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA rml: oob (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>               MCA routed: binomial (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>               MCA routed: debruijn (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>               MCA routed: direct (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>               MCA routed: radix (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                MCA state: app (MCA v2.0, API v1.0, Component v1.8.4)
>>>
>>>                MCA state: hnp (MCA v2.0, API v1.0, Component v1.8.4)
>>>
>>>                MCA state: novm (MCA v2.0, API v1.0, Component v1.8.4)
>>>
>>>                MCA state: orted (MCA v2.0, API v1.0, Component v1.8.4)
>>>
>>>                MCA state: staged_hnp (MCA v2.0, API v1.0, Component
>>> v1.8.4)
>>>
>>>                MCA state: staged_orted (MCA v2.0, API v1.0, Component
>>> v1.8.4)
>>>
>>>                MCA state: tool (MCA v2.0, API v1.0, Component v1.8.4)
>>>
>>>            MCA allocator: basic (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>            MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                 MCA bcol: basesmuma (MCA v2.0, API v2.0, Component
>>> v1.8.4)
>>>
>>>                 MCA bcol: ptpcoll (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA bml: r2 (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA btl: self (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA btl: sm (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA btl: tcp (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA btl: vader (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                 MCA coll: basic (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                 MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                 MCA coll: inter (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                 MCA coll: libnbc (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                 MCA coll: ml (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                 MCA coll: self (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                 MCA coll: sm (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                 MCA coll: tuned (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA dpm: orte (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                 MCA fbtl: posix (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                MCA fcoll: dynamic (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                MCA fcoll: individual (MCA v2.0, API v2.0, Component
>>> v1.8.4)
>>>
>>>                MCA fcoll: static (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                MCA fcoll: two_phase (MCA v2.0, API v2.0, Component
>>> v1.8.4)
>>>
>>>                MCA fcoll: ylib (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                   MCA fs: ufs (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                   MCA io: ompio (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                   MCA io: romio (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                MCA mpool: grdma (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                MCA mpool: sm (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA osc: rdma (MCA v2.0, API v3.0, Component v1.8.4)
>>>
>>>                  MCA osc: sm (MCA v2.0, API v3.0, Component v1.8.4)
>>>
>>>                  MCA pml: v (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA pml: bfo (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA pml: cm (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>               MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>               MCA rcache: vma (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                  MCA rte: orte (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                 MCA sbgp: basesmsocket (MCA v2.0, API v2.0, Component
>>> v1.8.4)
>>>
>>>                 MCA sbgp: basesmuma (MCA v2.0, API v2.0, Component
>>> v1.8.4)
>>>
>>>                 MCA sbgp: p2p (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>             MCA sharedfp: individual (MCA v2.0, API v2.0, Component
>>> v1.8.4)
>>>
>>>             MCA sharedfp: lockedfile (MCA v2.0, API v2.0, Component
>>> v1.8.4)
>>>
>>>             MCA sharedfp: sm (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>>                 MCA topo: basic (MCA v2.0, API v2.1, Component v1.8.4)
>>>
>>>            MCA vprotocol: pessimist (MCA v2.0, API v2.0, Component
>>> v1.8.4)
>>>
>>> Sachin
>>>
>>> >Sachin,
>>>
>>> >Can you please provide the command line? Additional information about
>>> >your system would also be helpful.
>>>
>>> >Josh
>>>
>>> >>On Wed, Feb 18, 2015 at 3:43 AM, Sachin Krishnan <sachkris_at_[hidden]> wrote:
>>>
>>> >> Hello,
>>> >>
>>> >> I am new to MPI and also to this list.
>>> >> I wrote an MPI code with several MPI_Bcast calls in a loop. My code was
>>> >> getting stuck at random points, i.e. it was not systematic. After a few
>>> >> hours of debugging and googling, I found that the issue may be with the
>>> >> several MPI_Bcast calls in a loop.
>>> >>
>>> >> I stumbled on this test code which can reproduce the issue:
>>> >>
>>> >> https://github.com/fintler/ompi/blob/master/orte/test/mpi/bcast_loop.c
>>> >>
>>> >> I'm using Open MPI v1.8.4 installed from the official Arch Linux repo.
>>> >>
>>> >> Is it a known issue with Open MPI?
>>> >> Is it some problem with the way Open MPI is configured on my system?
>>> >>
>>> >> Thanks in advance.
>>> >>
>>> >> Sachin
>>> >>
>>> >>
>>> >>
>>> >> _______________________________________________
>>> >> users mailing list
>>> >> users_at_[hidden]
>>> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> >> Link to this post:
>>> >> http://www.open-mpi.org/community/lists/users/2015/02/26338.php
>>> >>
>>>
>>>
>>
>>
>>
>
>
>
