Sachin,

I can't replicate your issue with either the latest 1.8 or the trunk. I tried using a single host while forcing the SM and then the TCP BTL, to no avail.

Can you try restricting the collective modules in use (adding --mca coll tuned,basic to your mpirun command)?
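For example (a sketch only; ./a.out is taken from your earlier command line):

    mpirun --mca coll tuned,basic -np 2 ./a.out

Since shared memory is the suspect on a single host, forcing one BTL at a time may also help narrow it down, e.g.:

    mpirun --mca btl self,sm -np 2 ./a.out
    mpirun --mca btl self,tcp -np 2 ./a.out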
George.

On Fri, Feb 20, 2015 at 9:31 PM, Sachin Krishnan <sachk...@gmail.com> wrote:

> Josh,
>
> Thanks for the help.
> I'm running on a single host. How do I confirm that it is an issue with
> the shared memory?
>
> Sachin
>
> On Fri, Feb 20, 2015 at 11:58 PM, Joshua Ladd <jladd.m...@gmail.com> wrote:
>
>> Sachin,
>>
>> Are you running this on a single host or across multiple hosts (i.e.,
>> are you communicating between processes via networking)? If it's on a
>> single host, then it might be an issue with shared memory.
>>
>> Josh
>>
>> On Fri, Feb 20, 2015 at 1:51 AM, Sachin Krishnan <sachk...@gmail.com> wrote:
>>
>>> Hello Josh,
>>>
>>> The command I use to compile the code is:
>>>
>>> mpicc bcast_loop.c
>>>
>>> To run the code I use:
>>>
>>> mpirun -np 2 ./a.out
>>>
>>> Output is unpredictable. It gets stuck at different places.
>>>
>>> I'm attaching the lstopo and ompi_info outputs. Do you need any other info?
>>>
>>> lstopo-no-graphics output:
>>>
>>> Machine (3433MB)
>>>   Socket L#0 + L3 L#0 (8192KB)
>>>     L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
>>>       PU L#0 (P#0)
>>>       PU L#1 (P#4)
>>>     L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
>>>       PU L#2 (P#1)
>>>       PU L#3 (P#5)
>>>     L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
>>>       PU L#4 (P#2)
>>>       PU L#5 (P#6)
>>>     L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
>>>       PU L#6 (P#3)
>>>       PU L#7 (P#7)
>>>   HostBridge L#0
>>>     PCI 8086:0162
>>>       GPU L#0 "card0"
>>>       GPU L#1 "renderD128"
>>>       GPU L#2 "controlD64"
>>>     PCI 8086:1502
>>>       Net L#3 "eth0"
>>>     PCI 8086:1e02
>>>       Block L#4 "sda"
>>>       Block L#5 "sr0"
>>>
>>> ompi_info output:
>>>
>>> Package: Open MPI builduser@anatol Distribution
>>> Open MPI: 1.8.4
>>> Open MPI repo revision: v1.8.3-330-g0344f04
>>> Open MPI release date: Dec 19, 2014
>>> Open RTE: 1.8.4
>>> Open RTE repo revision: v1.8.3-330-g0344f04
>>> Open RTE release date: Dec 19, 2014
>>> OPAL: 1.8.4
>>> OPAL repo revision: v1.8.3-330-g0344f04
>>> OPAL release date: Dec 19, 2014
>>> MPI API: 3.0
>>> Ident string: 1.8.4
>>> Prefix: /usr
>>> Configured architecture: i686-pc-linux-gnu
>>> Configure host: anatol
>>> Configured by: builduser
>>> Configured on: Sat Dec 20 17:00:34 PST 2014
>>> Configure host: anatol
>>> Built by: builduser
>>> Built on: Sat Dec 20 17:12:16 PST 2014
>>> Built host: anatol
>>> C bindings: yes
>>> C++ bindings: yes
>>> Fort mpif.h: yes (all)
>>> Fort use mpi: yes (full: ignore TKR)
>>> Fort use mpi size: deprecated-ompi-info-value
>>> Fort use mpi_f08: yes
>>> Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
>>>   limitations in the /usr/bin/gfortran compiler, does not support the
>>>   following: array subsections, direct passthru (where possible) to
>>>   underlying Open MPI's C functionality
>>> Fort mpi_f08 subarrays: no
>>> Java bindings: no
>>> Wrapper compiler rpath: runpath
>>> C compiler: gcc
>>> C compiler absolute: /usr/bin/gcc
>>> C compiler family name: GNU
>>> C compiler version: 4.9.2
>>> C++ compiler: g++
>>> C++ compiler absolute: /usr/bin/g++
>>> Fort compiler: /usr/bin/gfortran
>>> Fort compiler abs:
>>> Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
>>> Fort 08 assumed shape: yes
>>> Fort optional args: yes
>>> Fort INTERFACE: yes
>>> Fort ISO_FORTRAN_ENV: yes
>>> Fort STORAGE_SIZE: yes
>>> Fort BIND(C) (all): yes
>>> Fort ISO_C_BINDING: yes
>>> Fort SUBROUTINE BIND(C): yes
>>> Fort TYPE,BIND(C): yes
>>> Fort T,BIND(C,name="a"): yes
>>> Fort PRIVATE: yes
>>> Fort PROTECTED: yes
>>> Fort ABSTRACT: yes
>>> Fort ASYNCHRONOUS: yes
>>> Fort PROCEDURE: yes
>>> Fort C_FUNLOC: yes
>>> Fort f08 using wrappers: yes
>>> Fort MPI_SIZEOF: yes
>>> C profiling: yes
>>> C++ profiling: yes
>>> Fort mpif.h profiling: yes
>>> Fort use mpi profiling: yes
>>> Fort use mpi_f08 prof: yes
>>> C++ exceptions: no
>>> Thread support: posix (MPI_THREAD_MULTIPLE: no, OPAL support: yes,
>>>   OMPI progress: no, ORTE progress: yes, Event lib: yes)
>>> Sparse Groups: no
>>> Internal debug support: no
>>> MPI interface warnings: yes
>>> MPI parameter check: runtime
>>> Memory profiling support: no
>>> Memory debugging support: no
>>> libltdl support: yes
>>> Heterogeneous support: no
>>> mpirun default --prefix: no
>>> MPI I/O support: yes
>>> MPI_WTIME support: gettimeofday
>>> Symbol vis. support: yes
>>> Host topology support: yes
>>> MPI extensions:
>>> FT Checkpoint support: no (checkpoint thread: no)
>>> C/R Enabled Debugging: no
>>> VampirTrace support: yes
>>> MPI_MAX_PROCESSOR_NAME: 256
>>> MPI_MAX_ERROR_STRING: 256
>>> MPI_MAX_OBJECT_NAME: 64
>>> MPI_MAX_INFO_KEY: 36
>>> MPI_MAX_INFO_VAL: 256
>>> MPI_MAX_PORT_NAME: 1024
>>> MPI_MAX_DATAREP_STRING: 128
>>> MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA compress: bzip (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA compress: gzip (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA crs: none (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA db: hash (MCA v2.0, API v1.0, Component v1.8.4)
>>> MCA db: print (MCA v2.0, API v1.0, Component v1.8.4)
>>> MCA event: libevent2021 (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA hwloc: external (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA if: posix_ipv4 (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA if: linux_ipv6 (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA installdirs: env (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA installdirs: config (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA memchecker: valgrind (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA memory: linux (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA pstat: linux (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA sec: basic (MCA v2.0, API v1.0, Component v1.8.4)
>>> MCA shmem: mmap (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA shmem: posix (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA shmem: sysv (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA timer: linux (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA dfs: app (MCA v2.0, API v1.0, Component v1.8.4)
>>> MCA dfs: orted (MCA v2.0, API v1.0, Component v1.8.4)
>>> MCA dfs: test (MCA v2.0, API v1.0, Component v1.8.4)
>>> MCA errmgr: default_app (MCA v2.0, API v3.0, Component v1.8.4)
>>> MCA errmgr: default_hnp (MCA v2.0, API v3.0, Component v1.8.4)
>>> MCA errmgr: default_orted (MCA v2.0, API v3.0, Component v1.8.4)
>>> MCA errmgr: default_tool (MCA v2.0, API v3.0, Component v1.8.4)
>>> MCA ess: env (MCA v2.0, API v3.0, Component v1.8.4)
>>> MCA ess: hnp (MCA v2.0, API v3.0, Component v1.8.4)
>>> MCA ess: singleton (MCA v2.0, API v3.0, Component v1.8.4)
>>> MCA ess: tool (MCA v2.0, API v3.0, Component v1.8.4)
>>> MCA filem: raw (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA iof: hnp (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA iof: mr_hnp (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA iof: mr_orted (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA iof: orted (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA iof: tool (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA odls: default (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA oob: tcp (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA plm: isolated (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA plm: rsh (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA ras: loadleveler (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA ras: simulator (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA rmaps: lama (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA rmaps: mindist (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA rmaps: ppr (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA rmaps: resilient (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA rmaps: staged (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA rml: oob (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA routed: binomial (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA routed: debruijn (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA routed: direct (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA routed: radix (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA state: app (MCA v2.0, API v1.0, Component v1.8.4)
>>> MCA state: hnp (MCA v2.0, API v1.0, Component v1.8.4)
>>> MCA state: novm (MCA v2.0, API v1.0, Component v1.8.4)
>>> MCA state: orted (MCA v2.0, API v1.0, Component v1.8.4)
>>> MCA state: staged_hnp (MCA v2.0, API v1.0, Component v1.8.4)
>>> MCA state: staged_orted (MCA v2.0, API v1.0, Component v1.8.4)
>>> MCA state: tool (MCA v2.0, API v1.0, Component v1.8.4)
>>> MCA allocator: basic (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA bcol: basesmuma (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA bcol: ptpcoll (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA bml: r2 (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA btl: self (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA btl: sm (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA btl: tcp (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA btl: vader (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA coll: basic (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA coll: inter (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA coll: libnbc (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA coll: ml (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA coll: self (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA coll: sm (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA coll: tuned (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA dpm: orte (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA fbtl: posix (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA fcoll: dynamic (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA fcoll: individual (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA fcoll: static (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA fcoll: two_phase (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA fcoll: ylib (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA fs: ufs (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA io: ompio (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA io: romio (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA mpool: grdma (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA mpool: sm (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA osc: rdma (MCA v2.0, API v3.0, Component v1.8.4)
>>> MCA osc: sm (MCA v2.0, API v3.0, Component v1.8.4)
>>> MCA pml: v (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA pml: bfo (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA pml: cm (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA rcache: vma (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA rte: orte (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA sbgp: basesmsocket (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA sbgp: basesmuma (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA sbgp: p2p (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA sharedfp: individual (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA sharedfp: lockedfile (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA sharedfp: sm (MCA v2.0, API v2.0, Component v1.8.4)
>>> MCA topo: basic (MCA v2.0, API v2.1, Component v1.8.4)
>>> MCA vprotocol: pessimist (MCA v2.0, API v2.0, Component v1.8.4)
>>>
>>> Sachin
>>>
>>> >Sachin,
>>> >
>>> >Can you please provide a command line? Additional information about
>>> >your system could also be helpful.
>>> >
>>> >Josh
>>>
>>> >>On Wed, Feb 18, 2015 at 3:43 AM, Sachin Krishnan
>>> >><sachkris_at_[hidden]> wrote:
>>>
>>> >> Hello,
>>> >>
>>> >> I am new to MPI and also to this list.
>>> >> I wrote an MPI code with several MPI_Bcast calls in a loop. My code
>>> >> was getting stuck at random points, i.e., it was not systematic.
>>> >> After a few hours of debugging and googling, I found that the issue
>>> >> may be with the several MPI_Bcast calls in a loop.
>>> >>
>>> >> I stumbled on this test code, which can reproduce the issue:
>>> >> https://github.com/fintler/ompi/blob/master/orte/test/mpi/bcast_loop.c
>>> >>
>>> >> I'm using Open MPI v1.8.4 installed from the official Arch Linux repo.
>>> >>
>>> >> Is this a known issue with Open MPI?
>>> >> Is it some problem with the way Open MPI is configured on my system?
>>> >>
>>> >> Thanks in advance.
>>> >>
>>> >> Sachin
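For readers without the linked test handy, here is a minimal sketch of the pattern described above: many MPI_Bcast calls issued back to back in a loop. This is not the linked bcast_loop.c verbatim; the payload size and iteration count are arbitrary choices for illustration.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, i;
        int buf[1024];                      /* arbitrary payload size */

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Many broadcasts back to back; the reported hang strikes at a
         * random iteration rather than a fixed one. */
        for (i = 0; i < 100000; i++) {
            if (rank == 0)
                buf[0] = i;                 /* root refreshes the payload */
            MPI_Bcast(buf, 1024, MPI_INT, 0, MPI_COMM_WORLD);

            if (rank == 0 && i % 10000 == 0)
                printf("iteration %d\n", i);   /* coarse progress marker */
        }

        MPI_Finalize();
        return 0;
    }

Built and run as in the quoted commands (mpicc followed by mpirun -np 2 ./a.out), this is the shape of run that was reported to stall at a random point.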