Collecting data during execution is possible in Open MPI either with an external tool, such as mpiP, or with the internal infrastructure, SPC (software-based performance counters). Take a look at ./examples/spc_example.c or ./test/spc/spc_test.c to see how to use it.
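In case a self-contained illustration helps: the SPC counters are exported through the standard MPI_T performance-variable interface, so they can be listed and read with code along the lines of the sketch below. This is written from memory of the MPI_T API, not copied from spc_example.c; the substring match on "spc" and the assumed long-long layout of the counters are my assumptions, and SPC must be enabled at run time (if I remember correctly via something like "--mca mpi_spc_attach all"; please check ompi_info and the files mentioned above for the exact names).

  #include <mpi.h>
  #include <stdio.h>
  #include <string.h>

  int main(int argc, char **argv) {
      int provided, rank, num_pvars;
      MPI_Init(&argc, &argv);
      MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      /* ... do some communication here so the counters have something to count ... */

      MPI_T_pvar_get_num(&num_pvars);
      for (int i = 0; i < num_pvars; i++) {
          char name[256], desc[256];
          int name_len = sizeof(name), desc_len = sizeof(desc);
          int verbosity, var_class, bind, readonly, continuous, atomic, count;
          MPI_Datatype dtype;
          MPI_T_enum enumtype;
          MPI_T_pvar_get_info(i, name, &name_len, &verbosity, &var_class,
                              &dtype, &enumtype, desc, &desc_len,
                              &bind, &readonly, &continuous, &atomic);
          /* keep only counters whose name looks like an SPC counter (assumption) */
          if (strstr(name, "spc") == NULL || bind != MPI_T_BIND_NO_OBJECT)
              continue;
          MPI_T_pvar_session session;
          MPI_T_pvar_handle handle;
          MPI_T_pvar_session_create(&session);
          MPI_T_pvar_handle_alloc(session, i, NULL, &handle, &count);
          if (!continuous)
              MPI_T_pvar_start(session, handle);  /* continuous counters need no explicit start */
          if (count == 1 &&
              (dtype == MPI_LONG_LONG || dtype == MPI_UNSIGNED_LONG_LONG)) {
              long long value = 0;  /* assumed layout: one 64-bit integer per counter */
              MPI_T_pvar_read(session, handle, &value);
              if (rank == 0) printf("%s = %lld\n", name, value);
          }
          MPI_T_pvar_handle_free(session, &handle);
          MPI_T_pvar_session_free(&session);
      }

      MPI_T_finalize();
      MPI_Finalize();
      return 0;
  }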
George.

On Fri, Feb 11, 2022 at 9:43 AM Bertini, Denis Dr. via users <users@lists.open-mpi.org> wrote:

> I have seen in the OSU INAM paper:
>
> "While we chose MVAPICH2 for implementing our designs, any MPI runtime
> (e.g.: OpenMPI [12]) can be modified to perform similar data collection
> and transmission."
>
> But I do not know what is meant by a "modified" OpenMPI?
>
> Cheers,
> Denis
>
> ------------------------------
> From: Joseph Schuchart <schuch...@icl.utk.edu>
> Sent: Friday, February 11, 2022 3:02:36 PM
> To: Bertini, Denis Dr.; Open MPI Users
> Subject: Re: [OMPI users] Using OSU benchmarks for checking Infiniband network
>
> I am not aware of anything similar in Open MPI. Maybe OSU-INAM can work
> with other MPI implementations? Would be worth investigating...
>
> Joseph
>
> On 2/11/22 06:54, Bertini, Denis Dr. wrote:
> >
> > Hi Joseph,
> >
> > Looking at MVAPICH I noticed that this MPI implementation provides an
> > InfiniBand network analysis and profiling tool:
> >
> > OSU-INAM
> >
> > Is there something equivalent using OpenMPI?
> >
> > Best
> > Denis
> >
> > ------------------------------------------------------------------------
> > From: users <users-boun...@lists.open-mpi.org> on behalf of Joseph
> > Schuchart via users <users@lists.open-mpi.org>
> > Sent: Tuesday, February 8, 2022 4:02:53 PM
> > To: users@lists.open-mpi.org
> > Cc: Joseph Schuchart
> > Subject: Re: [OMPI users] Using OSU benchmarks for checking Infiniband network
> >
> > Hi Denis,
> >
> > Sorry if I missed it in your previous messages, but could you also try
> > running a different MPI implementation (MVAPICH) to see whether Open MPI
> > is at fault or the system is somehow to blame for it?
> >
> > Thanks
> > Joseph
> >
> > On 2/8/22 03:06, Bertini, Denis Dr. via users wrote:
> > >
> > > Hi,
> > >
> > > Thanks for all this information!
> > >
> > > But I have to confess that in this multi-tuning-parameter space I got
> > > somehow lost. Furthermore, it sometimes mixes user space and kernel
> > > space, and I can only act on the user space.
> > >
> > > 1) On the system the max locked memory is
> > >    ulimit -l unlimited (default),
> > >    and I do not see any warnings/errors related to that when launching MPI.
> > >
> > > 2) I tried different algorithms for the MPI_Allreduce op, all showing
> > >    the drop in bandwidth at size = 16384.
> > >
> > > 3) I disabled openib (no RDMA) and used only TCP, and I noticed the
> > >    same behaviour.
> > >
> > > 4) I realized that increasing the so-called warm-up parameter of the
> > >    OSU benchmark (argument -x, 200 by default) reduces the discrepancy.
> > >    On the contrary, a lower value (-x 10) can increase this BW
> > >    discrepancy up to a factor of 300 at message size 16384 compared to
> > >    message size 8192, for example. So does that mean there are some
> > >    caching effects in the inter-node communication?
> > >
> > > From my experience, tuning parameters is a time-consuming and
> > > cumbersome task.
> > >
> > > Could it also be that the problem is not really in the OpenMPI
> > > implementation but in the system?
> > >
> > > Best
> > > Denis
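As an aside on points 3 and 4 above: for reference, this is roughly the kind of command line I would use to make such a comparison reproducible, pinning one rank per node, forcing the TCP BTL, and varying only the warm-up count (the hostfile name and the path to osu_allreduce are placeholders):

  # TCP only, 1 rank per node, few warm-up iterations
  mpirun --hostfile hosts -np 200 --map-by ppr:1:node \
         --mca btl self,tcp ./osu_allreduce -x 10 -i 1000

  # same run, but with a long warm-up phase
  mpirun --hostfile hosts -np 200 --map-by ppr:1:node \
         --mca btl self,tcp ./osu_allreduce -x 1000 -i 1000

If the 16384-byte outlier only shows up with a short warm-up, that could point at connection setup or caching effects rather than at the network itself.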
> > > ------------------------------------------------------------------------
> > > From: users <users-boun...@lists.open-mpi.org> on behalf of Gus
> > > Correa via users <users@lists.open-mpi.org>
> > > Sent: Monday, February 7, 2022 9:14:19 PM
> > > To: Open MPI Users
> > > Cc: Gus Correa
> > > Subject: Re: [OMPI users] Using OSU benchmarks for checking Infiniband network
> > >
> > > This may have changed since, but these used to be relevant points.
> > > Overall, the Open MPI FAQ has lots of good suggestions:
> > > https://www.open-mpi.org/faq/
> > > some specific to performance tuning:
> > > https://www.open-mpi.org/faq/?category=tuning
> > > https://www.open-mpi.org/faq/?category=openfabrics
> > >
> > > 1) Make sure you are not using Ethernet TCP/IP, which is widely
> > > available in compute nodes:
> > > mpirun --mca btl self,sm,openib ...
> > > https://www.open-mpi.org/faq/?category=tuning#selecting-components
> > > However, this may have changed lately:
> > > https://www.open-mpi.org/faq/?category=tcp#tcp-auto-disable
> > >
> > > 2) Maximum locked memory used by IB and its system limit. Start here:
> > > https://www.open-mpi.org/faq/?category=openfabrics#limiting-registered-memory-usage
> > >
> > > 3) The eager vs. rendezvous message size threshold. I wonder if it may
> > > sit right where you see the latency spike.
> > > https://www.open-mpi.org/faq/?category=all#ib-locked-pages-user
> > >
> > > 4) Processor and memory locality/affinity and binding (please check
> > > the current options and syntax):
> > > https://www.open-mpi.org/faq/?category=tuning#using-paffinity-v1.4
> > >
> > > On Mon, Feb 7, 2022 at 11:01 AM Benson Muite via users
> > > <users@lists.open-mpi.org> wrote:
> > >
> > >     Following https://www.open-mpi.org/doc/v3.1/man1/mpirun.1.php
> > >
> > >     mpirun --verbose --display-map
> > >
> > >     Have you tried newer OpenMPI versions?
> > >
> > >     Do you get similar behavior for the osu_reduce and osu_gather
> > >     benchmarks?
> > >
> > >     Typically internal buffer sizes as well as your hardware will affect
> > >     performance. Can you give specifications similar to what is available at:
> > >     http://mvapich.cse.ohio-state.edu/performance/collectives/
> > >     where the operating system, switch, node type and memory are indicated.
> > >
> > >     If you need good performance, you may want to also specify the
> > >     algorithm used. You can find some of the parameters you can tune using:
> > >
> > >     ompi_info --all
> > >
> > >     A particularly helpful parameter is:
> > >
> > >     MCA coll tuned: parameter "coll_tuned_allreduce_algorithm"
> > >         (current value: "ignore", data source: default,
> > >         level: 5 tuner/detail, type: int)
> > >         Which allreduce algorithm is used. Can be locked down to any of:
> > >         0 ignore, 1 basic linear, 2 nonoverlapping (tuned reduce + tuned
> > >         bcast), 3 recursive doubling, 4 ring, 5 segmented ring
> > >         Valid values: 0:"ignore", 1:"basic_linear", 2:"nonoverlapping",
> > >         3:"recursive_doubling", 4:"ring", 5:"segmented_ring",
> > >         6:"rabenseifner"
> > >     MCA coll tuned: parameter "coll_tuned_allreduce_algorithm_segmentsize"
> > >         (current value: "0", data source: default,
> > >         level: 5 tuner/detail, type: int)
> > >
> > >     For OpenMPI 4.0, there is a tuning program [2] that might also be
> > >     helpful.
> > >
> > >     [1] https://stackoverflow.com/questions/36635061/how-to-check-which-mca-parameters-are-used-in-openmpi
> > >     [2] https://github.com/open-mpi/ompi-collectives-tuning
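Building on that last suggestion: the way I would pin the allreduce algorithm for a test run looks roughly like this (from memory; if I recall correctly the "tuned" component only honours the forced algorithm when coll_tuned_use_dynamic_rules is also set, so please double-check both parameter names with "ompi_info --all | grep coll_tuned"):

  # force recursive doubling (algorithm 3 in the list quoted above)
  mpirun -np 200 --map-by ppr:1:node \
         --mca coll_tuned_use_dynamic_rules 1 \
         --mca coll_tuned_allreduce_algorithm 3 \
         ./osu_allreduce

Repeating the run with values 1-6 and comparing the 16384-byte line is a quick way to see whether the spike is tied to one particular algorithm or shows up for all of them.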
> > > On 2/7/22 4:49 PM, Bertini, Denis Dr. wrote:
> > > > Hi,
> > > >
> > > > When I repeat, I always get the huge discrepancy at the message size
> > > > of 16384.
> > > >
> > > > Maybe there is a way to run MPI in verbose mode in order to further
> > > > investigate this behaviour?
> > > >
> > > > Best
> > > > Denis
> > > >
> > > > ------------------------------------------------------------------------
> > > > From: users <users-boun...@lists.open-mpi.org> on behalf of Benson
> > > > Muite via users <users@lists.open-mpi.org>
> > > > Sent: Monday, February 7, 2022 2:27:34 PM
> > > > To: users@lists.open-mpi.org
> > > > Cc: Benson Muite
> > > > Subject: Re: [OMPI users] Using OSU benchmarks for checking Infiniband network
> > > >
> > > > Hi,
> > > > Do you get similar results when you repeat the test? Another job
> > > > could have interfered with your run.
> > > > Benson
> > > >
> > > > On 2/7/22 3:56 PM, Bertini, Denis Dr. via users wrote:
> > > > > Hi,
> > > > >
> > > > > I am using the OSU microbenchmarks compiled with OpenMPI 3.1.6 in
> > > > > order to check/benchmark the InfiniBand network of our cluster.
> > > > >
> > > > > For that I use the collective all_reduce benchmark and run over
> > > > > 200 nodes, using 1 process per node.
> > > > >
> > > > > And these are the results I obtained 😎
> > > > >
> > > > > ################################################################
> > > > > # OSU MPI Allreduce Latency Test v5.7.1
> > > > > # Size    Avg Latency(us)  Min Latency(us)  Max Latency(us)  Iterations
> > > > > 4                  114.65            83.22           147.98        1000
> > > > > 8                  133.85           106.47           164.93        1000
> > > > > 16                 116.41            87.57           150.58        1000
> > > > > 32                 112.17            93.25           130.23        1000
> > > > > 64                 106.85            81.93           134.74        1000
> > > > > 128                117.53            87.50           152.27        1000
> > > > > 256                143.08           115.63           173.97        1000
> > > > > 512                130.34           100.20           167.56        1000
> > > > > 1024               155.67           111.29           188.20        1000
> > > > > 2048               151.82           116.03           198.19        1000
> > > > > 4096               159.11           122.09           199.24        1000
> > > > > 8192               176.74           143.54           221.98        1000
> > > > > 16384            48862.85         39270.21         54970.96        1000
> > > > > 32768             2737.37          2614.60          2802.68        1000
> > > > > 65536             2723.15          2585.62          2813.65        1000
> > > > > ####################################################################
> > > > >
> > > > > Could someone explain to me what is happening at message size 16384?
> > > > > One can notice a huge latency (~300 times larger) compared to
> > > > > message size 8192. I do not really understand what could create
> > > > > such an increase in the latency.
> > > > > The reason I use the OSU microbenchmarks is that we sporadically
> > > > > experience a drop in the bandwidth for typical collective
> > > > > operations such as MPI_Reduce in our cluster, which is difficult
> > > > > to understand.
> > > > > I would be grateful if somebody could share their expertise on
> > > > > such a problem with me.
> > > > > Best,
> > > > > Denis
> > > > >
> > > > > ---------
> > > > > Denis Bertini
> > > > > Abteilung: CIT
> > > > > Ort: SB3 2.265a
> > > > >
> > > > > Tel: +49 6159 71 2240
> > > > > Fax: +49 6159 71 2986
> > > > > E-Mail: d.bert...@gsi.de
> > > > >
> > > > > GSI Helmholtzzentrum für Schwerionenforschung GmbH
> > > > > Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de
> > > > >
> > > > > Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
> > > > > Managing Directors / Geschäftsführung:
> > > > > Professor Dr. Paolo Giubellino, Dr. Ulrich Breuer, Jörg Blaurock
> > > > > Chairman of the GSI Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
> > > > > Ministerialdirigent Dr. Volkmar Dietz
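One more remark on Gus's point 3 in the quoted thread (the eager vs. rendezvous threshold): if I remember correctly, the default openib eager limit sits somewhere between 8 KiB and 16 KiB, which would put the switch-over exactly between the 8192 and 16384 measurements. It may be worth checking where that limit actually is on your build and, as an experiment, moving it past 16384. The parameter name below is from memory, so please verify it first:

  # list the eager-limit parameters of the build
  ompi_info --all | grep -i eager

  # example experiment: raise the openib eager limit above 16 KiB
  mpirun --mca btl_openib_eager_limit 65536 ... ./osu_allreduce

If the spike moves (or disappears) together with the limit, that would be a strong hint at the eager/rendezvous switch-over; otherwise the algorithm selection discussed above remains the more likely candidate.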