Hi

When I repeat the test I always get the huge discrepancy at message size 16384.
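
For reference, the repeat runs look roughly like this (only a sketch; the binary path and the iteration counts are just examples, -f prints the full min/max/iterations listing, -i sets the number of iterations and -x the warmup iterations):

   mpirun -np 200 --map-by node \
          ./osu_allreduce -f -i 1000 -x 200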

Maybe there is a way to run MPI in verbose mode in order to further investigate this behaviour?
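
What I have in mind is something along these lines (only a sketch, assuming the tuned collective component is the one handling MPI_Allreduce here; the available parameters can be listed with ompi_info):

   # list the tunables of the tuned collective component
   ompi_info --param coll tuned --level 9

   # repeat the benchmark with verbose output from the collective and
   # btl frameworks
   mpirun -np 200 --map-by node \
          --mca coll_base_verbose 100 \
          --mca btl_base_verbose 30 \
          ./osu_allreduce -f

   # force a fixed allreduce algorithm (4 should be the ring algorithm)
   # instead of the size-dependent default, to check whether the jump at
   # 16384 bytes follows an algorithm switch point
   mpirun -np 200 --map-by node \
          --mca coll_tuned_use_dynamic_rules 1 \
          --mca coll_tuned_allreduce_algorithm 4 \
          ./osu_allreduce -f

If the spike moves or disappears with a fixed algorithm, that would point at the default switch points rather than at the fabric itself.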

Best

Denis

________________________________
From: users <users-boun...@lists.open-mpi.org> on behalf of Benson Muite via 
users <users@lists.open-mpi.org>
Sent: Monday, February 7, 2022 2:27:34 PM
To: users@lists.open-mpi.org
Cc: Benson Muite
Subject: Re: [OMPI users] Using OSU benchmarks for checking Infiniband network

Hi,
Do you get similar results when you repeat the test? Another job could
have interfered with your run.
Benson
On 2/7/22 3:56 PM, Bertini, Denis Dr. via users wrote:
> Hi,
>
> I am using the OSU microbenchmarks compiled with Open MPI 3.1.6 in order
> to check/benchmark the InfiniBand network for our cluster.
>
> For that I use the collective all_reduce benchmark and run it over 200
> nodes, using 1 process per node.
>
> And these are the results I obtained:
>
>
>
> ################################################################
>
> # OSU MPI Allreduce Latency Test v5.7.1
> # Size       Avg Latency(us)   Min Latency(us)   Max Latency(us)  Iterations
> 4                     114.65             83.22            147.98        1000
> 8                     133.85            106.47            164.93        1000
> 16                    116.41             87.57            150.58        1000
> 32                    112.17             93.25            130.23        1000
> 64                    106.85             81.93            134.74        1000
> 128                   117.53             87.50            152.27        1000
> 256                   143.08            115.63            173.97        1000
> 512                   130.34            100.20            167.56        1000
> 1024                  155.67            111.29            188.20        1000
> 2048                  151.82            116.03            198.19        1000
> 4096                  159.11            122.09            199.24        1000
> 8192                  176.74            143.54            221.98        1000
> 16384               48862.85          39270.21          54970.96        1000
> 32768                2737.37           2614.60           2802.68        1000
> 65536                2723.15           2585.62           2813.65        1000
>
> ####################################################################
>
> Could someone explain to me what is happening for message size = 16384?
> One can notice a huge latency (~300 times larger) compared to message
> size = 8192.
> I do not really understand what could create such an increase in the
> latency.
> The reason I use the OSU microbenchmarks is that we sporadically
> experience a drop in the bandwidth for typical collective operations
> such as MPI_Reduce in our cluster, which is difficult to understand.
> I would be grateful if somebody could share their expertise on such a
> problem with me.
>
> Best,
> Denis
>
>
>
> ---------
> Denis Bertini
> Abteilung: CIT
> Ort: SB3 2.265a
>
> Tel: +49 6159 71 2240
> Fax: +49 6159 71 2986
> E-Mail: d.bert...@gsi.de
>
> GSI Helmholtzzentrum für Schwerionenforschung GmbH
> Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de
>
> Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
> Managing Directors / Geschäftsführung:
> Professor Dr. Paolo Giubellino, Dr. Ulrich Breuer, Jörg Blaurock
> Chairman of the GSI Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
> Ministerialdirigent Dr. Volkmar Dietz
>
