Assume that we have K=q*k nodes (slaves) where q,k are positive integers >= 2.
Based on the scheme that I am currently using I create [q^(k-1)]*(q-1) groups (along with their communicators). Each group consists of k nodes and within each group exactly k broadcasts take place (each node broadcasts something to the rest of them). So in total [q^(k-1)]*(q-1)*k MPI broadcasts take place. Let me skip the details of the above scheme. Now theoretically I figured out that there are q-1 groups that can communicate in parallel at the same time i.e. groups that have no common nodes and I would like to utilize that to speedup the shuffling. I have seen here https://stackoverflow.com/questions/11372012/mpi-several-broadcast-at-the-same-time that this is possible in MPI. In my case it's more complicated since q,k are parameters of the problem and change between different experiments. If I get the idea about the 2nd method that is proposed there and assume that we have only 3 groups within which some communication takes places one can simply do: *if my rank belongs to group 1{* * comm1.Bcast(..., ..., ..., rootId);* *}else if my rank belongs to group 2{* * comm2.Bcast(..., ..., ..., rootId);* *}else if my rank belongs to group3{* * comm3.Bcast(..., ..., ..., rootId);* *} * where comm1, comm2, comm3 are the corresponding sub-communicators that contain only the members of each group. But how can I generalize the above idea to arbitrary number of groups or perhaps do something else? The code is in C++ and the MPI installed is described in the attached file. Regards, Kostas
Package: Open MPI buildd@lgw01-57 Distribution Open MPI: 1.10.2 Open MPI repo revision: v1.10.1-145-g799148f Open MPI release date: Jan 21, 2016 Open RTE: 1.10.2 Open RTE repo revision: v1.10.1-145-g799148f Open RTE release date: Jan 21, 2016 OPAL: 1.10.2 OPAL repo revision: v1.10.1-145-g799148f OPAL release date: Jan 21, 2016 MPI API: 3.0.0 Ident string: 1.10.2 Prefix: /usr Configured architecture: x86_64-pc-linux-gnu Configure host: lgw01-57 Configured by: buildd Configured on: Thu Feb 25 16:33:01 UTC 2016 Configure host: lgw01-57 Built by: buildd Built on: Thu Feb 25 16:40:59 UTC 2016 Built host: lgw01-57 C bindings: yes C++ bindings: yes Fort mpif.h: yes (all) Fort use mpi: yes (full: ignore TKR) Fort use mpi size: deprecated-ompi-info-value Fort use mpi_f08: yes Fort mpi_f08 compliance: The mpi_f08 module is available, but due to limitations in the gfortran compiler, does not support the following: array subsections, direct passthru (where possible) to underlying Open MPI's C functionality Fort mpi_f08 subarrays: no Java bindings: no Wrapper compiler rpath: runpath C compiler: gcc C compiler absolute: /usr/bin/gcc C compiler family name: GNU C compiler version: 5.3.1 C++ compiler: g++ C++ compiler absolute: /usr/bin/g++ Fort compiler: gfortran Fort compiler abs: /usr/bin/gfortran Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::) Fort 08 assumed shape: yes Fort optional args: yes Fort INTERFACE: yes Fort ISO_FORTRAN_ENV: yes Fort STORAGE_SIZE: yes Fort BIND(C) (all): yes Fort ISO_C_BINDING: yes Fort SUBROUTINE BIND(C): yes Fort TYPE,BIND(C): yes Fort T,BIND(C,name="a"): yes Fort PRIVATE: yes Fort PROTECTED: yes Fort ABSTRACT: yes Fort ASYNCHRONOUS: yes Fort PROCEDURE: yes Fort USE...ONLY: yes Fort C_FUNLOC: yes Fort f08 using wrappers: yes Fort MPI_SIZEOF: yes C profiling: yes C++ profiling: yes Fort mpif.h profiling: yes Fort use mpi profiling: yes Fort use mpi_f08 prof: yes C++ exceptions: no Thread support: posix (MPI_THREAD_MULTIPLE: no, OPAL support: yes, OMPI progress: no, ORTE progress: yes, Event lib: yes) Sparse Groups: no Internal debug support: no MPI interface warnings: yes MPI parameter check: runtime Memory profiling support: no Memory debugging support: no dl support: yes Heterogeneous support: yes mpirun default --prefix: no MPI I/O support: yes MPI_WTIME support: gettimeofday Symbol vis. support: yes Host topology support: yes MPI extensions: FT Checkpoint support: no (checkpoint thread: no) C/R Enabled Debugging: no VampirTrace support: no MPI_MAX_PROCESSOR_NAME: 256 MPI_MAX_ERROR_STRING: 256 MPI_MAX_OBJECT_NAME: 64 MPI_MAX_INFO_KEY: 36 MPI_MAX_INFO_VAL: 256 MPI_MAX_PORT_NAME: 1024 MPI_MAX_DATAREP_STRING: 128 MCA backtrace: execinfo (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA compress: gzip (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA compress: bzip (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA crs: none (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA db: print (MCA v2.0.0, API v1.0.0, Component v1.10.2) MCA db: hash (MCA v2.0.0, API v1.0.0, Component v1.10.2) MCA dl: dlopen (MCA v2.0.0, API v1.0.0, Component v1.10.2) MCA event: libevent2021 (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA hwloc: external (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA if: posix_ipv4 (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA if: linux_ipv6 (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA installdirs: env (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA installdirs: config (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA memory: linux (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA pstat: linux (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA sec: basic (MCA v2.0.0, API v1.0.0, Component v1.10.2) MCA shmem: posix (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA shmem: sysv (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA shmem: mmap (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA timer: linux (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA dfs: test (MCA v2.0.0, API v1.0.0, Component v1.10.2) MCA dfs: app (MCA v2.0.0, API v1.0.0, Component v1.10.2) MCA dfs: orted (MCA v2.0.0, API v1.0.0, Component v1.10.2) MCA errmgr: default_hnp (MCA v2.0.0, API v3.0.0, Component v1.10.2) MCA errmgr: default_orted (MCA v2.0.0, API v3.0.0, Component v1.10.2) MCA errmgr: default_tool (MCA v2.0.0, API v3.0.0, Component v1.10.2) MCA errmgr: default_app (MCA v2.0.0, API v3.0.0, Component v1.10.2) MCA ess: slurm (MCA v2.0.0, API v3.0.0, Component v1.10.2) MCA ess: env (MCA v2.0.0, API v3.0.0, Component v1.10.2) MCA ess: singleton (MCA v2.0.0, API v3.0.0, Component v1.10.2) MCA ess: tool (MCA v2.0.0, API v3.0.0, Component v1.10.2) MCA ess: hnp (MCA v2.0.0, API v3.0.0, Component v1.10.2) MCA filem: raw (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA grpcomm: bad (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA iof: tool (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA iof: mr_orted (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA iof: orted (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA iof: mr_hnp (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA iof: hnp (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA odls: default (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA oob: tcp (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA plm: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA plm: rsh (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA plm: isolated (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA ras: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA ras: simulator (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA ras: loadleveler (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA ras: gridengine (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA rmaps: rank_file (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA rmaps: seq (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA rmaps: mindist (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA rmaps: staged (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA rmaps: ppr (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA rmaps: resilient (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA rmaps: round_robin (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA rml: oob (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA routed: binomial (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA routed: radix (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA routed: debruijn (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA routed: direct (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA state: staged_orted (MCA v2.0.0, API v1.0.0, Component v1.10.2) MCA state: hnp (MCA v2.0.0, API v1.0.0, Component v1.10.2) MCA state: staged_hnp (MCA v2.0.0, API v1.0.0, Component v1.10.2) MCA state: orted (MCA v2.0.0, API v1.0.0, Component v1.10.2) MCA state: app (MCA v2.0.0, API v1.0.0, Component v1.10.2) MCA state: dvm (MCA v2.0.0, API v1.0.0, Component v1.10.2) MCA state: tool (MCA v2.0.0, API v1.0.0, Component v1.10.2) MCA state: novm (MCA v2.0.0, API v1.0.0, Component v1.10.2) MCA allocator: basic (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA allocator: bucket (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA bcol: basesmuma (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA bcol: ptpcoll (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA bml: r2 (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA btl: self (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA btl: tcp (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA btl: openib (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA btl: sm (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA btl: vader (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA coll: self (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA coll: sm (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA coll: libnbc (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA coll: tuned (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA coll: hierarch (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA coll: basic (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA coll: inter (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA coll: ml (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA dpm: orte (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA fbtl: posix (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA fcoll: dynamic (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA fcoll: two_phase (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA fcoll: ylib (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA fcoll: static (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA fcoll: individual (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA fs: ufs (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA io: romio (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA io: ompio (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA mpool: grdma (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA mpool: sm (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA osc: sm (MCA v2.0.0, API v3.0.0, Component v1.10.2) MCA osc: pt2pt (MCA v2.0.0, API v3.0.0, Component v1.10.2) MCA pml: v (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA pml: cm (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA pml: bfo (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA pml: ob1 (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA pubsub: orte (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA rcache: vma (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA rte: orte (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA sbgp: basesmsocket (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA sbgp: basesmuma (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA sbgp: p2p (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA sharedfp: lockedfile (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA sharedfp: individual (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA sharedfp: sm (MCA v2.0.0, API v2.0.0, Component v1.10.2) MCA topo: basic (MCA v2.0.0, API v2.1.0, Component v1.10.2) MCA vprotocol: pessimist (MCA v2.0.0, API v2.0.0, Component v1.10.2)
_______________________________________________ users mailing list users@lists.open-mpi.org https://lists.open-mpi.org/mailman/listinfo/users