Hello Peter,
In 1.2.2, allgatherv is called from the coll basic component
and is implemented as a gatherv followed by a broadcast.
The broadcast is executed with a single element of an MPI_TYPE_INDEXED
datatype. The decision function in coll tuned mistakenly selects the
segmented broadcast algorithm for this case, which results in sending a
single segment, possibly through the pipeline, and therefore bad performance.
I will fix this in the trunk and ask for it to be moved to 1.2.2 if it
solves the problem.
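To illustrate the two points above, here is a minimal sketch in plain Python (no MPI involved). It models allgatherv as a gatherv followed by a broadcast, and then shows why a broadcast of a single large element cannot be split into segments. The function names and the segmentation rule (whole elements per segment, at least one) are assumptions made for illustration and are not taken from the Open MPI source.

```python
from math import ceil

def allgatherv_via_gatherv_bcast(send_bufs):
    """Model allgatherv as a gatherv to one root plus a broadcast."""
    counts = [len(b) for b in send_bufs]
    displs = [sum(counts[:r]) for r in range(len(counts))]
    # Gatherv: the root places each rank's contribution at its displacement.
    root_buf = [None] * sum(counts)
    for rank, buf in enumerate(send_bufs):
        root_buf[displs[rank]:displs[rank] + counts[rank]] = buf
    # Broadcast: every rank receives a copy of the root's assembled buffer.
    return [list(root_buf) for _ in send_bufs]

def segments_in_bcast(count, type_size, seg_size):
    """Hypothetical rule: a segment holds whole elements, at least one."""
    elems_per_seg = max(1, seg_size // type_size)
    return ceil(count / elems_per_seg)

recv = allgatherv_via_gatherv_bcast([[1], [2, 3], [4, 5, 6]])
print(recv[0])

total_bytes = 128 * 4096
# Same payload described as many 1-byte elements: splits into many segments.
print(segments_in_bcast(total_bytes, 1, 1024))
# Same payload described as one indexed-type element: only one segment.
print(segments_in_bcast(1, total_bytes, 1024))
```

Under this model, the payload described as one element always ends up in a single segment regardless of its size, so a segmented or pipelined algorithm has nothing to overlap.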
Thanks,
Jelena
On Tue, 26 Jun 2007, Peter Drakenberg wrote:
Hello,
When running the Intel MPI Benchmark (IMB) on our cluster
(Sun X2200M2 nodes, Voltaire DDRx Infiniband, OFED-1.1) we
see rather strange (i.e., unreasonably bad) performance for the
Allgatherv part of the IMB when using OpenMPI-1.2.2. The
performance figures reported by the IMB are provided immediately
below, and the corresponding figures for Voltaire's MPI
implementation (which in most cases performs worse than
OpenMPI, but not in this case) are provided further below for
comparison.
Best regards,
Peter Drakenberg
OpenMPI-1.2.2 results:
# Benchmarking Allgatherv
# #processes = 128
# ( 128 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes  #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
0               1000         0.27         0.37         0.28
1               1000      2963.90      2964.34      2964.18
2               1000      2964.58      2965.27      2965.12
4               1000      2957.89      2960.70      2960.34
8               1000      2957.16      2957.48      2957.41
16              1000      1476.52      1477.29      1476.92
32              1000      1262.78      1264.01      1263.42
64              1000      1777.36      1781.58      1780.73
128             1000      3179.43      3184.41      3182.70
256             1000      5585.14      5590.76      5588.06
512             1000      9305.17      9314.22      9309.76
1024            1000     15080.38     15095.19     15087.83
2048            1000     26654.10     26680.41     26667.36
4096            1000     51284.44     51335.00     51310.00
8192            1000    128715.45    128845.60    128781.31
16384           1000    268331.78    268600.18    268467.99
32768           1000    523252.38    523771.30    523512.56
65536            640   1026546.88   1028134.76   1027342.31
131072           320   2032981.61   2039325.76   2036150.23
262144           160   4036154.11   4061263.87   4048673.85
524288            80   7985005.01   8084825.39   8034942.12
1048576           40  15676708.42  16074075.63  15875796.81
2097152           20  30574097.00  32172253.91  31373789.00
4194304  < never completes, no results reported >
Voltaire MPI results:
[0] # Benchmarking Allgatherv
[0] # #processes = 128
[0] # ( 128 additional processes waiting in MPI_Barrier)
[0] #----------------------------------------------------------------
[0] #bytes  #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
[0] 0               1000         0.15         0.17         0.15
[0] 1               1000       491.80       492.27       491.93
[0] 2               1000       442.39       442.77       442.49
[0] 4               1000       422.91       423.79       423.20
[0] 8               1000       491.78       492.04       491.88
[0] 16              1000       493.50       494.55       493.82
[0] 32              1000       439.22       439.80       439.55
[0] 64              1000       474.54       475.11       474.80
[0] 128             1000       520.39       521.44       520.73
[0] 256             1000       480.01       480.57       480.26
[0] 512             1000       802.98       803.68       803.27
[0] 1024            1000      1501.60      1502.54      1502.19
[0] 2048            1000      2863.70      2867.45      2864.90
[0] 4096            1000      4990.05      4990.86      4990.49
[0] 8192            1000      7508.46      7513.27      7511.21
[0] 16384           1000     17513.71     17523.39     17519.27
[0] 32768           1000     26655.31     26664.25     26659.42
[0] 65536            640     46089.07     46122.00     46106.04
[0] 131072           320     93248.85     93381.15     93319.76
[0] 262144           160    187527.89    188133.28    187883.96
[0] 524288            80    366881.26    369236.00    368337.76
[0] 1048576           40    663046.35    667853.38    665697.79
[0] 2097152           20   1324301.75   1345114.85   1335772.46
[0] 4194304           10   2593647.10   2662831.10   2633482.66
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
--
Jelena Pjesivac-Grbovic, Pjesa
Graduate Research Assistant
Innovative Computing Laboratory
Computer Science Department, UTK
Claxton Complex 350
(865) 974 - 6722
(865) 974 - 6321
jpjes...@utk.edu
Murphy's Law of Research:
Enough research will tend to support your theory.