Hi,

I am using the OSU microbenchmarks, compiled against Open MPI 3.1.6, in order to check/benchmark the InfiniBand network of our cluster.

For that I use the collective osu_allreduce benchmark and run it over 200 nodes, using 1 process per node.
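
For reference, the invocation looks roughly like this (hostfile name and benchmark path are placeholders, not our exact command):

    # 200 nodes, 1 rank per node; hostfile and binary path are placeholders
    mpirun -np 200 --map-by ppr:1:node --hostfile hosts.txt ./osu_allreduce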

These are the results I obtained:



################################################################

# OSU MPI Allreduce Latency Test v5.7.1
# Size       Avg Latency(us)   Min Latency(us)   Max Latency(us)  Iterations
4                     114.65             83.22            147.98        1000
8                     133.85            106.47            164.93        1000
16                    116.41             87.57            150.58        1000
32                    112.17             93.25            130.23        1000
64                    106.85             81.93            134.74        1000
128                   117.53             87.50            152.27        1000
256                   143.08            115.63            173.97        1000
512                   130.34            100.20            167.56        1000
1024                  155.67            111.29            188.20        1000
2048                  151.82            116.03            198.19        1000
4096                  159.11            122.09            199.24        1000
8192                  176.74            143.54            221.98        1000
16384               48862.85          39270.21          54970.96        1000
32768                2737.37           2614.60           2802.68        1000
65536                2723.15           2585.62           2813.65        1000

####################################################################

Could someone explain to me what is happening at message size = 16384?
One notices a huge latency there (~300 times larger) compared to message size = 8192.
I do not really understand what could create such an increase in the latency.
The reason I use the OSU microbenchmarks is that we sporadically experience a drop in the bandwidth of typical collective operations, such as MPI_Reduce, on our cluster, which is difficult to understand.
I would be grateful if somebody could share their expertise on, or experience with, such a problem.
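
For what it's worth, one way to test whether the jump comes from a size-dependent algorithm switchover inside Open MPI's tuned collective component would be to pin the allreduce algorithm and rerun. A sketch (algorithm 3 = recursive doubling is an arbitrary example choice; hostfile and path are again placeholders):

    # Force one allreduce algorithm instead of the size-dependent default;
    # the available choices can be listed with: ompi_info --all | grep coll_tuned_allreduce
    mpirun -np 200 --map-by ppr:1:node --hostfile hosts.txt \
           --mca coll_tuned_use_dynamic_rules 1 \
           --mca coll_tuned_allreduce_algorithm 3 \
           ./osu_allreduce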

Best,
Denis



---------
Denis Bertini
Abteilung: CIT
Ort: SB3 2.265a

Tel: +49 6159 71 2240
Fax: +49 6159 71 2986
E-Mail: d.bert...@gsi.de

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de

Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
Managing Directors / Geschäftsführung:
Professor Dr. Paolo Giubellino, Dr. Ulrich Breuer, Jörg Blaurock
Chairman of the GSI Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
Ministerialdirigent Dr. Volkmar Dietz
