Hi

I changed the algorithm to the ring algorithm (algorithm 4, for example) and the
scan changed to:


# OSU MPI Allreduce Latency Test v5.7.1
# Size       Avg Latency(us)   Min Latency(us)   Max Latency(us)  Iterations
4                      59.39             51.04             65.36       10000
8                     109.13             90.14            126.32       10000
16                    253.26             60.89            290.31       10000
32                     75.04             54.53             83.28       10000
64                     96.40             59.73            111.45       10000
128                    67.86             59.73             76.44       10000
256                    76.32             67.33             85.18       10000
512                   129.93             85.76            170.31       10000
1024                  168.51            129.15            194.68       10000
2048                  136.17            110.09            156.94       10000
4096                  173.59            130.76            199.21       10000
8192                  236.05            170.77            269.98       10000
16384                4212.65           3627.71           4992.04       10000
32768                1243.05           1205.11           1276.11       10000
65536                1464.50           1364.76           1531.48       10000
131072               1558.71           1454.52           1632.91       10000
262144               1681.58           1609.15           1745.44       10000
524288               2305.73           2178.17           2402.69       10000
1048576              3389.83           3220.44           3517.61       10000
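
In case it is useful: I switched the algorithm at run time through MCA
parameters, roughly along these lines (the process count, mapping, benchmark
path and iteration count are only placeholders for my actual setup):

mpirun --mca coll_tuned_use_dynamic_rules 1 \
       --mca coll_tuned_allreduce_algorithm 4 \
       -np 200 --map-by node ./osu_allreduce -f -i 10000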

Would this mean that the first results were linked to the underlying algorithm
used by default in OpenMPI (0=ignore)?

Do you know which algorithm this is (0=ignore)?

I still see the wall at message size 16384, though ...

Best

Denis

________________________________
From: Benson Muite <benson_mu...@emailplus.org>
Sent: Monday, February 7, 2022 4:59:45 PM
To: Bertini, Denis Dr.; Open MPI Users
Subject: Re: [OMPI users] Using OSU benchmarks for checking Infiniband network

Following https://www.open-mpi.org/doc/v3.1/man1/mpirun.1.php

mpirun --verbose --display-map

Have you tried newer OpenMPI versions?

Do you get similar behavior for the osu_reduce and osu_gather benchmarks?
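
For example, something like (treat the process count, mapping and binary
paths as placeholders to adapt to your runs):

mpirun -np 200 --map-by node ./osu_reduce -f
mpirun -np 200 --map-by node ./osu_gather -f

so that, as in your allreduce runs, each node hosts a single rank.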

Typically internal buffer sizes as well as your hardware will affect
performance. Can you give specifications similar to what is available at:
http://mvapich.cse.ohio-state.edu/performance/collectives/
where the operating system, switch, node type, and memory are indicated?

If you need good performance, you may also want to specify the algorithm
used. You can find some of the parameters you can tune using:

ompi_info --all
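
Since that output is long, you can narrow it down to the allreduce tuning
parameters with, for example:

ompi_info --all | grep coll_tuned_allreduce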

A particularly helpful parameter is:

MCA coll tuned: parameter "coll_tuned_allreduce_algorithm" (current
value: "ignore", data source: default, level: 5 tuner/detail, type: int)
                           Which allreduce algorithm is used. Can be
locked down to any of: 0 ignore, 1 basic linear, 2 nonoverlapping (tuned
reduce + tuned bcast), 3 recursive doubling, 4 ring, 5 segmented ring
                           Valid values: 0:"ignore", 1:"basic_linear",
2:"nonoverlapping", 3:"recursive_doubling", 4:"ring",
5:"segmented_ring", 6:"rabenseifner"
           MCA coll tuned: parameter
"coll_tuned_allreduce_algorithm_segmentsize" (current value: "0", data
source: default, level: 5 tuner/detail, type: int)
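
As a sketch (to be adapted to your setup), such parameters can be set on the
mpirun command line with --mca, or through the environment before launching,
for example:

export OMPI_MCA_coll_tuned_use_dynamic_rules=1
export OMPI_MCA_coll_tuned_allreduce_algorithm=4

As far as I know, forcing a specific algorithm this way only takes effect when
the dynamic rules are enabled.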

For OpenMPI 4.0, there is a tuning program [2] that might also be helpful.

[1]
https://stackoverflow.com/questions/36635061/how-to-check-which-mca-parameters-are-used-in-openmpi
[2] https://github.com/open-mpi/ompi-collectives-tuning

On 2/7/22 4:49 PM, Bertini, Denis Dr. wrote:
> Hi
>
> When I repeat, I always get the huge discrepancy at the
> message size of 16384.
>
> Maybe there is a way to run MPI in verbose mode in order
> to further investigate this behaviour?
>
> Best
>
> Denis
>
> ------------------------------------------------------------------------
> *From:* users <users-boun...@lists.open-mpi.org> on behalf of Benson
> Muite via users <users@lists.open-mpi.org>
> *Sent:* Monday, February 7, 2022 2:27:34 PM
> *To:* users@lists.open-mpi.org
> *Cc:* Benson Muite
> *Subject:* Re: [OMPI users] Using OSU benchmarks for checking Infiniband
> network
> Hi,
> Do you get similar results when you repeat the test? Another job could
> have interfered with your run.
> Benson
> On 2/7/22 3:56 PM, Bertini, Denis Dr. via users wrote:
>> Hi
>>
>> I am using the OSU micro-benchmarks compiled with OpenMPI 3.1.6 in order to
>> check/benchmark the InfiniBand network for our cluster.
>>
>> For that I use the collective all_reduce benchmark and run over 200
>> nodes, using 1 process per node.
>>
>> These are the results I obtained 😎
>>
>>
>>
>> ################################################################
>>
>> # OSU MPI Allreduce Latency Test v5.7.1
>> # Size       Avg Latency(us)   Min Latency(us)   Max Latency(us)  Iterations
>> 4                     114.65             83.22            147.98        1000
>> 8                     133.85            106.47            164.93        1000
>> 16                    116.41             87.57            150.58        1000
>> 32                    112.17             93.25            130.23        1000
>> 64                    106.85             81.93            134.74        1000
>> 128                   117.53             87.50            152.27        1000
>> 256                   143.08            115.63            173.97        1000
>> 512                   130.34            100.20            167.56        1000
>> 1024                  155.67            111.29            188.20        1000
>> 2048                  151.82            116.03            198.19        1000
>> 4096                  159.11            122.09            199.24        1000
>> 8192                  176.74            143.54            221.98        1000
>> 16384               48862.85          39270.21          54970.96        1000
>> 32768                2737.37           2614.60           2802.68        1000
>> 65536                2723.15           2585.62           2813.65        1000
>>
>> ####################################################################
>>
>> Could someone explain to me what is happening for message size = 16384?
>> One can notice a huge latency (~300 times larger) compared to message
>> size = 8192.
>> I do not really understand what could create such an increase in the
>> latency.
>> The reason I use the OSU micro-benchmarks is that we sporadically
>> experience a drop in the bandwidth for typical collective operations
>> such as MPI_Reduce in our cluster, which is difficult to understand.
>> I would be grateful if somebody could share their expertise on such a
>> problem with me.
>>
>> Best,
>> Denis
>>
>>
>>
>> ---------
>> Denis Bertini
>> Abteilung: CIT
>> Ort: SB3 2.265a
>>
>> Tel: +49 6159 71 2240
>> Fax: +49 6159 71 2986
>> E-Mail: d.bert...@gsi.de
>>
>> GSI Helmholtzzentrum für Schwerionenforschung GmbH
>> Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de
>>
>> Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
>> Managing Directors / Geschäftsführung:
>> Professor Dr. Paolo Giubellino, Dr. Ulrich Breuer, Jörg Blaurock
>> Chairman of the GSI Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
>> Ministerialdirigent Dr. Volkmar Dietz
>>
>
