Hi, I found the reason. Besides the direct link between the 2 PCs, there
is another link that goes through many switches, and the TCP BTL seems to
have used this slower link. So I ran the tests again with eth0 only.
I built Open MPI with:
./configure --disable-mpi-f90 --disable-mpi-f77 --disable-mpi-cxx --disable-vt --disable-io-romio --prefix=/usr --with-platform=optimized
And ran with:
mpirun -n 6 --mca btl tcp,self --mca btl_tcp_if_include eth0 -hostfile my_hostfile --bynode ./IMB-MPI1 > tcp_0901
The results are in the appendix. It seems that TCP performs better with
smaller messages, while TIPC performs better with larger messages.
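
One way to check a claim like this is to parse both IMB result files and
compare the PingPong times size by size. The following is only a minimal
sketch and was not part of the original runs: parse_imb and faster_by_size
are hypothetical helper names, and it assumes the TIPC results were saved
in the same IMB-MPI1 text format as tcp_0901.

```python
import re
from collections import defaultdict


def parse_imb(text):
    """Parse IMB-MPI1 output into {(benchmark, nprocs): [(bytes, t1, ...), ...]}.

    Each row keeps the message size followed by every numeric column after
    '#repetitions', so the first time column is always at index 1.
    """
    tables = defaultdict(list)
    bench = nprocs = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"#\s*Benchmarking\s+(\S+)", line)
        if m:
            bench, nprocs = m.group(1), None
            continue
        m = re.match(r"#\s*#processes\s*=\s*(\d+)", line)
        if m:
            nprocs = int(m.group(1))
            continue
        # Skip comments, blank lines, and anything before the first table.
        if not line or line.startswith("#") or bench is None or nprocs is None:
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        try:
            size = int(fields[0])
            int(fields[1])  # repetitions: validated but not stored
            values = tuple(float(f) for f in fields[2:])
        except ValueError:
            continue  # not a data row
        tables[(bench, nprocs)].append((size,) + values)
    return dict(tables)


def faster_by_size(run_a, run_b, key=("PingPong", 2)):
    """Map each message size present in both runs to 'a' or 'b' (lower time wins)."""
    a = {row[0]: row[1] for row in run_a.get(key, [])}
    b = {row[0]: row[1] for row in run_b.get(key, [])}
    return {n: ("a" if a[n] <= b[n] else "b") for n in sorted(a.keys() & b.keys())}
```

With the TCP file read into one string and the TIPC file into another,
faster_by_size(parse_imb(tcp_text), parse_imb(tipc_text)) would show the
message size at which the faster transport changes, if it does.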
/Xin
On 08/30/2011 05:50 PM, Jeff Squyres wrote:
On Aug 29, 2011, at 3:51 AM, Xin He wrote:
-----
$ mpirun --mca btl tcp,self --bynode -np 2 --mca btl_tcp_if_include eth0 hostname
svbu-mpi008
svbu-mpi009
$ mpirun --mca btl tcp,self --bynode -np 2 --mca btl_tcp_if_include eth0 IMB-MPI1 PingPong
#---------------------------------------------------
# Intel (R) MPI Benchmark Suite V3.2, MPI-1 part
#---------------------------------------------------
Hi, I think these models are reasonably new :)
The results I gave you were measured with 2 processes, but on 2 different
servers. I take it the result you showed is from 2 processes on one machine?
Nope -- check my output -- I'm running across 2 different servers and
through a 1 Gb TOR Ethernet switch (it's not a particularly
high-performance Ethernet switch, either).
Can you run some native netpipe TCP numbers across the same nodes that you ran
the TIPC MPI tests over? You should be getting lower latency than what you're
seeing.
Do you have jumbo frames enabled, perchance? Are you going through only 1 switch? If
you're on a NUMA server, do you have processor affinity enabled, and have the processes
located "near" the NIC?
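
On Linux, the jumbo-frames question can be answered by reading the
interface MTU from sysfs. This is a minimal sketch under that assumption:
read_mtu and jumbo_frames_enabled are hypothetical helper names, eth0 is
taken from the mpirun command in this thread, and "jumbo" is taken to mean
any MTU above the standard Ethernet value of 1500 bytes.

```python
from pathlib import Path


def read_mtu(iface="eth0", sysfs_root="/sys/class/net"):
    """Return the configured MTU of a Linux network interface, read from sysfs."""
    return int((Path(sysfs_root) / iface / "mtu").read_text().strip())


def jumbo_frames_enabled(mtu, standard_mtu=1500):
    """Treat any MTU above the standard Ethernet 1500 bytes as jumbo frames."""
    return mtu > standard_mtu
```

Calling jumbo_frames_enabled(read_mtu("eth0")) on each node reports the
setting for that node only; the switches in between must also accept the
larger frame size.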
BTW, I forgot to tell you about SM & TIPC. Unfortunately, TIPC does not beat
SM...
That's probably not surprising; SM is tuned pretty well specifically for MPI
communication across shared memory.
#---------------------------------------------------
# Intel (R) MPI Benchmark Suite V3.2, MPI-1 part
#---------------------------------------------------
# Date : Thu Sep 1 10:42:40 2011
# Machine : i686
# System : Linux
# Release : 2.6.32-24-generic-pae
# Version : #39-Ubuntu SMP Wed Jul 28 07:39:26 UTC 2010
# MPI Version : 2.1
# MPI Thread Environment: MPI_THREAD_SINGLE
# New default behavior from Version 3.2 on:
# the number of iterations per message size is cut down
# dynamically when a certain run time (per message size sample)
# is expected to be exceeded. Time limit is defined by variable
# "SECS_PER_SAMPLE" (=> IMB_settings.h)
# or through the flag => -time
# Calling sequence was:
# ./IMB-MPI1
# Minimum message length in bytes: 0
# Maximum message length in bytes: 4194304
#
# MPI_Datatype : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op : MPI_SUM
#
#
# List of Benchmarks to run:
# PingPong
# PingPing
# Sendrecv
# Exchange
# Allreduce
# Reduce
# Reduce_scatter
# Allgather
# Allgatherv
# Gather
# Gatherv
# Scatter
# Scatterv
# Alltoall
# Alltoallv
# Bcast
# Barrier
#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#bytes #repetitions t[usec] Mbytes/sec
0 1000 51.32 0.00
1 1000 51.80 0.02
2 1000 51.75 0.04
4 1000 51.64 0.07
8 1000 51.87 0.15
16 1000 51.62 0.30
32 1000 52.14 0.59
64 1000 51.88 1.18
128 1000 52.81 2.31
256 1000 54.87 4.45
512 1000 57.65 8.47
1024 1000 74.70 13.07
2048 1000 90.91 21.49
4096 1000 115.36 33.86
8192 1000 147.96 52.80
16384 1000 228.96 68.24
32768 1000 390.84 79.96
65536 640 789.71 79.14
131072 320 1349.10 92.65
262144 160 2479.60 100.82
524288 80 4722.49 105.88
1048576 40 9181.69 108.91
2097152 20 18110.10 110.44
4194304 10 35916.29 111.37
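
The Mbytes/sec column of the PingPong table above can be reproduced from
the other two columns. The sketch below assumes, judging from the numbers
in this table, that 1 Mbyte is counted as 2**20 bytes and that the message
size is divided by the reported time; pingpong_mbytes_per_sec is a
hypothetical helper name.

```python
def pingpong_mbytes_per_sec(nbytes, t_usec):
    """Throughput in Mbytes/sec, with 1 Mbyte = 2**20 bytes and t in microseconds."""
    if t_usec <= 0:
        return 0.0
    return (nbytes / 2**20) / (t_usec * 1e-6)


# Rows copied from the PingPong table: (bytes, t[usec], reported Mbytes/sec)
rows = [
    (1024, 74.70, 13.07),
    (32768, 390.84, 79.96),
    (4194304, 35916.29, 111.37),
]
for nbytes, t_usec, reported in rows:
    computed = pingpong_mbytes_per_sec(nbytes, t_usec)
    print(f"{nbytes:>8} bytes: computed {computed:.2f}, reported {reported:.2f}")
```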
#---------------------------------------------------
# Benchmarking PingPing
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#bytes #repetitions t[usec] Mbytes/sec
0 1000 59.08 0.00
1 1000 58.87 0.02
2 1000 59.84 0.03
4 1000 58.83 0.06
8 1000 59.28 0.13
16 1000 59.40 0.26
32 1000 59.84 0.51
64 1000 60.63 1.01
128 1000 61.31 1.99
256 1000 63.26 3.86
512 1000 70.72 6.90
1024 1000 84.78 11.52
2048 1000 117.30 16.65
4096 1000 143.70 27.18
8192 1000 186.05 41.99
16384 1000 233.02 67.05
32768 1000 395.94 78.93
65536 640 806.53 77.49
131072 320 1401.89 89.17
262144 160 2597.34 96.25
524288 80 4957.96 100.85
1048576 40 9743.42 102.63
2097152 20 19338.00 103.42
4194304 10 38740.09 103.25
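
A common way to summarize tables like PingPong and PingPing is a linear
model, t(n) = latency + n / bandwidth. This is an added illustration and
not something computed in the original runs: fit_latency_bandwidth is a
hypothetical helper name, and the fit is plain unweighted least squares,
which is an assumption about what is good enough here.

```python
def fit_latency_bandwidth(samples):
    """Least-squares fit of t = latency + nbytes / bandwidth.

    samples: iterable of (nbytes, t_usec) pairs.
    Returns (latency_usec, bandwidth_bytes_per_usec).
    """
    pts = list(samples)
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    if sxx == 0:
        raise ValueError("samples must cover at least two message sizes")
    slope = sxy / sxx  # microseconds per byte
    if slope <= 0:
        raise ValueError("time does not grow with message size")
    latency = mean_y - slope * mean_x
    return latency, 1.0 / slope


# A few rows copied from the PingPing table: (bytes, t[usec])
pingping = [(0, 59.08), (65536, 806.53), (1048576, 9743.42), (4194304, 38740.09)]
latency_usec, bytes_per_usec = fit_latency_bandwidth(pingping)
print(f"latency ~ {latency_usec:.1f} usec, bandwidth ~ {bytes_per_usec:.1f} bytes/usec")
```

An unweighted fit over sizes from 0 to 4 MB is dominated by the largest
messages, so the small-message latency is better read directly from the
0-byte row than from the fitted intercept.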
#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 58.92 58.92 58.92 0.00
1 1000 59.33 59.34 59.34 0.03
2 1000 59.37 59.38 59.38 0.06
4 1000 59.34 59.43 59.39 0.13
8 1000 59.76 59.78 59.77 0.26
16 1000 59.72 59.73 59.72 0.51
32 1000 59.89 59.91 59.90 1.02
64 1000 60.40 60.44 60.42 2.02
128 1000 61.87 61.90 61.89 3.94
256 1000 63.16 63.25 63.21 7.72
512 1000 71.90 71.91 71.91 13.58
1024 1000 89.86 89.86 89.86 21.73
2048 1000 117.62 117.66 117.64 33.20
4096 1000 144.66 144.73 144.70 53.98
8192 1000 186.22 186.28 186.25 83.88
16384 1000 234.77 234.86 234.82 133.06
32768 1000 394.60 394.66 394.63 158.36
65536 640 1438.97 1439.86 1439.42 86.81
131072 320 2552.69 2556.05 2554.37 97.81
262144 160 4782.49 4795.14 4788.81 104.27
524288 80 9262.81 9284.33 9273.57 107.71
1048576 40 18158.72 18202.53 18180.62 109.87
2097152 20 35121.65 35215.95 35168.80 113.58
4194304 10 71480.81 71681.31 71581.06 111.61
#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 63.92 64.05 63.98 0.00
1 1000 61.64 61.67 61.65 0.03
2 1000 62.25 62.37 62.29 0.06
4 1000 64.67 64.90 64.79 0.12
8 1000 62.47 62.49 62.48 0.24
16 1000 62.36 62.39 62.38 0.49
32 1000 62.37 62.39 62.38 0.98
64 1000 64.07 64.09 64.09 1.90
128 1000 66.27 66.30 66.28 3.68
256 1000 81.61 81.74 81.68 5.97
512 1000 96.15 96.27 96.18 10.14
1024 1000 115.34 115.57 115.46 16.90
2048 1000 114.87 114.90 114.89 34.00
4096 1000 131.80 132.00 131.87 59.19
8192 1000 197.71 197.95 197.82 78.93
16384 1000 286.15 286.36 286.26 109.13
32768 1000 569.97 570.75 570.41 109.51
65536 640 1362.82 1363.91 1363.36 91.65
131072 320 2807.40 2814.55 2811.46 88.82
262144 160 5359.95 5386.03 5378.53 92.83
524288 80 9967.26 9985.25 9975.29 100.15
1048576 40 19238.65 19276.50 19255.88 103.75
2097152 20 38024.45 38145.35 38107.19 104.86
4194304 10 75315.10 75488.11 75409.50 105.98
#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 6
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 68.89 69.13 68.99 0.00
1 1000 68.42 68.57 68.50 0.03
2 1000 67.84 67.88 67.87 0.06
4 1000 67.95 68.12 68.04 0.11
8 1000 68.71 68.96 68.84 0.22
16 1000 68.67 68.78 68.71 0.44
32 1000 68.74 68.93 68.85 0.89
64 1000 70.46 70.66 70.55 1.73
128 1000 73.09 73.24 73.17 3.33
256 1000 79.54 79.76 79.67 6.12
512 1000 79.92 80.08 80.01 12.19
1024 1000 94.06 94.24 94.13 20.73
2048 1000 113.99 114.20 114.10 34.21
4096 1000 141.14 141.46 141.32 55.23
8192 1000 215.58 216.04 215.85 72.33
16384 1000 428.15 428.50 428.33 72.93
32768 1000 852.85 853.55 853.22 73.22
65536 640 1889.91 1891.92 1890.88 66.07
131072 320 3726.31 3730.88 3727.82 67.01
262144 160 7132.04 7140.41 7136.32 70.02
524288 80 14739.86 14753.01 14746.25 67.78
1048576 40 28282.17 28303.13 28291.57 70.66
2097152 20 55472.20 55637.05 55583.79 71.89
4194304 10 111106.71 112461.79 111884.37 71.14
#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 60.42 60.43 60.42 0.00
1 1000 62.22 62.24 62.23 0.06
2 1000 62.32 62.33 62.32 0.12
4 1000 61.70 61.71 61.71 0.25
8 1000 62.70 62.71 62.70 0.49
16 1000 63.28 63.28 63.28 0.96
32 1000 64.56 64.56 64.56 1.89
64 1000 66.87 66.88 66.87 3.65
128 1000 74.15 74.16 74.15 6.58
256 1000 83.77 83.78 83.78 11.66
512 1000 94.60 94.60 94.60 20.65
1024 1000 115.01 115.04 115.03 33.95
2048 1000 119.79 119.86 119.82 65.18
4096 1000 191.88 191.94 191.91 81.41
8192 1000 227.23 227.27 227.25 137.50
16384 1000 383.27 383.31 383.29 163.05
32768 1000 701.07 701.13 701.10 178.28
65536 640 1589.27 1589.43 1589.35 157.29
131072 320 2817.68 2817.96 2817.82 177.43
262144 160 5731.64 5732.14 5731.89 174.45
524288 80 12593.90 12617.03 12605.46 158.52
1048576 40 27821.95 27868.53 27845.24 143.53
2097152 20 55251.05 55346.31 55298.68 144.54
4194304 10 109927.00 110125.00 110026.00 145.29
#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 72.74 72.75 72.74 0.00
1 1000 73.99 74.02 74.00 0.05
2 1000 73.78 73.79 73.78 0.10
4 1000 73.63 73.66 73.65 0.21
8 1000 74.28 74.30 74.29 0.41
16 1000 74.52 74.54 74.53 0.82
32 1000 74.92 74.93 74.92 1.63
64 1000 75.69 75.70 75.69 3.23
128 1000 79.09 79.11 79.10 6.17
256 1000 86.02 86.04 86.03 11.35
512 1000 94.22 94.24 94.23 20.72
1024 1000 116.95 116.96 116.95 33.40
2048 1000 144.19 144.32 144.25 54.13
4096 1000 202.27 202.34 202.31 77.22
8192 1000 317.67 317.81 317.74 98.33
16384 1000 572.85 573.10 572.96 109.06
32768 1000 1145.11 1145.62 1145.33 109.11
65536 640 3030.14 3031.19 3030.65 82.48
131072 320 5898.22 5909.70 5905.05 84.61
262144 160 11082.41 11097.04 11090.31 90.11
524288 80 20299.25 20308.36 20303.67 98.48
1048576 40 39609.05 39699.60 39655.17 100.76
2097152 20 77599.66 77752.25 77669.88 102.89
4194304 10 153690.61 154220.71 154007.65 103.75
#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 6
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 77.36 77.50 77.43 0.00
1 1000 78.03 78.15 78.09 0.05
2 1000 77.65 77.71 77.69 0.10
4 1000 77.85 77.88 77.86 0.20
8 1000 78.23 78.27 78.24 0.39
16 1000 78.51 78.56 78.54 0.78
32 1000 78.72 78.73 78.73 1.55
64 1000 79.88 79.98 79.93 3.05
128 1000 82.71 82.87 82.78 5.89
256 1000 92.18 92.21 92.20 10.59
512 1000 103.88 103.96 103.92 18.79
1024 1000 130.57 130.74 130.65 29.88
2048 1000 167.94 168.05 168.01 46.49
4096 1000 229.83 229.96 229.91 67.95
8192 1000 429.89 430.14 430.05 72.65
16384 1000 856.50 856.98 856.75 72.93
32768 1000 1707.52 1708.48 1708.07 73.16
65536 640 3991.60 3994.52 3992.94 62.59
131072 320 7754.96 7763.51 7759.48 64.40
262144 160 15631.14 15642.23 15638.05 63.93
524288 80 28613.91 28637.26 28621.25 69.84
1048576 40 56986.85 57039.72 57019.72 70.13
2097152 20 112567.05 112852.75 112757.27 70.89
4194304 10 225418.31 226183.70 225852.90 70.74
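
In the Sendrecv and Exchange tables above, the Mbytes/sec column is larger
than message size divided by time. The sketch below recovers the implied
multiplier from rows in those tables, assuming 1 Mbyte = 2**20 bytes and
that the throughput is computed from t_max; implied_message_count is a
hypothetical helper name. For these rows the multiplier comes out close to
2 for Sendrecv and close to 4 for Exchange, which would fit each process
moving 2 or 4 messages per iteration, although that reading is an
assumption and is not stated in the output itself.

```python
def implied_message_count(nbytes, t_max_usec, mbytes_per_sec):
    """Number of nbytes-sized messages per iteration implied by a reported throughput.

    Assumes 1 Mbyte = 2**20 bytes and that throughput is based on t_max.
    """
    bytes_moved = mbytes_per_sec * 2**20 * (t_max_usec * 1e-6)
    return bytes_moved / nbytes


# Rows copied from the tables above: (bytes, t_max[usec], reported Mbytes/sec)
rows = {
    "Sendrecv, 2 processes": (4194304, 71681.31, 111.61),
    "Sendrecv, 6 processes": (4194304, 112461.79, 71.14),
    "Exchange, 2 processes": (4194304, 110125.00, 145.29),
}
for name, (nbytes, t_max, mbps) in rows.items():
    print(f"{name}: {implied_message_count(nbytes, t_max, mbps):.2f} messages per iteration")
```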
#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.09 0.09 0.09
4 1000 58.13 58.14 58.13
8 1000 57.77 57.77 57.77
16 1000 58.53 58.54 58.53
32 1000 58.51 58.53 58.52
64 1000 59.07 59.07 59.07
128 1000 60.62 60.64 60.63
256 1000 62.04 62.04 62.04
512 1000 74.90 75.02 74.96
1024 1000 88.98 89.01 89.00
2048 1000 118.77 118.79 118.78
4096 1000 148.32 148.39 148.35
8192 1000 207.24 207.31 207.28
16384 1000 412.18 412.26 412.22
32768 1000 489.03 489.12 489.07
65536 640 830.58 830.80 830.69
131072 320 1727.87 1728.12 1728.00
262144 160 3075.42 3075.67 3075.54
524288 80 5841.02 5841.66 5841.34
1048576 40 11550.90 11553.85 11552.37
2097152 20 22649.10 22658.55 22653.82
4194304 10 44857.40 44888.20 44872.80
#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.08 0.09 0.09
4 1000 88.60 88.61 88.60
8 1000 88.93 89.03 88.98
16 1000 89.53 89.67 89.60
32 1000 89.87 89.92 89.90
64 1000 90.31 90.32 90.32
128 1000 91.45 91.59 91.52
256 1000 93.85 93.98 93.92
512 1000 98.64 98.75 98.69
1024 1000 114.32 114.36 114.34
2048 1000 146.34 146.45 146.39
4096 1000 203.62 203.68 203.65
8192 1000 274.83 274.86 274.84
16384 1000 791.80 792.05 791.94
32768 1000 1220.61 1220.88 1220.79
65536 640 1720.73 1721.03 1720.90
131072 320 3421.22 3422.66 3421.90
262144 160 9549.01 9553.67 9552.01
524288 80 18395.99 18413.81 18407.68
1048576 40 35390.80 35465.60 35427.53
2097152 20 64337.35 64367.20 64352.31
4194304 10 126474.51 126657.10 126551.30
#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.08 0.10 0.09
4 1000 158.60 158.74 158.67
8 1000 158.18 158.30 158.24
16 1000 158.88 158.99 158.93
32 1000 159.87 159.98 159.93
64 1000 161.00 161.13 161.05
128 1000 172.12 172.27 172.20
256 1000 186.53 186.65 186.59
512 1000 208.61 208.73 208.68
1024 1000 244.50 244.63 244.57
2048 1000 282.98 283.16 283.07
4096 1000 403.92 404.13 404.03
8192 1000 586.04 586.34 586.18
16384 1000 1257.12 1257.49 1257.30
32768 1000 1600.85 1601.10 1600.99
65536 640 2861.34 2862.42 2861.91
131072 320 5703.53 5705.34 5704.65
262144 160 11367.74 11373.46 11370.72
524288 80 28876.84 28902.58 28890.39
1048576 40 53027.03 53136.35 53107.32
2097152 20 107002.75 107053.25 107028.85
4194304 10 203388.29 203538.70 203492.53
#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.10 0.10
4 1000 51.24 51.25 51.25
8 1000 51.19 51.20 51.19
16 1000 51.57 51.57 51.57
32 1000 51.39 51.39 51.39
64 1000 51.84 51.85 51.84
128 1000 52.21 52.22 52.21
256 1000 52.87 52.87 52.87
512 1000 55.85 55.87 55.86
1024 1000 73.73 73.74 73.73
2048 1000 90.49 90.52 90.51
4096 1000 116.86 116.93 116.90
8192 1000 150.97 151.06 151.01
16384 1000 235.14 235.31 235.23
32768 1000 405.16 405.50 405.33
65536 640 679.83 680.76 680.30
131072 320 1567.70 1569.61 1568.65
262144 160 3067.68 3071.52 3069.60
524288 80 6084.49 6092.30 6088.39
1048576 40 12115.52 12131.17 12123.35
2097152 20 24111.95 24143.20 24127.58
4194304 10 48061.19 48123.80 48092.49
#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.11 0.10
4 1000 59.37 59.51 59.43
8 1000 59.25 59.37 59.30
16 1000 59.58 59.71 59.64
32 1000 59.85 59.98 59.92
64 1000 59.96 60.09 60.02
128 1000 60.61 60.75 60.68
256 1000 62.40 62.54 62.47
512 1000 86.85 86.92 86.89
1024 1000 98.44 98.51 98.48
2048 1000 127.42 127.50 127.46
4096 1000 161.12 161.23 161.18
8192 1000 247.65 247.84 247.75
16384 1000 416.31 416.68 416.49
32768 1000 654.73 655.90 655.39
65536 640 1027.17 1029.77 1028.59
131072 320 2386.91 2393.82 2390.78
262144 160 4631.63 4658.53 4646.76
524288 80 9096.36 9170.44 9141.98
1048576 40 21073.22 21150.02 21119.61
2097152 20 42635.55 42786.10 42730.42
4194304 10 81815.01 82113.49 82014.40
#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.12 0.11
4 1000 66.23 66.50 66.37
8 1000 66.51 66.78 66.65
16 1000 66.58 66.85 66.72
32 1000 66.55 66.81 66.67
64 1000 66.54 66.81 66.68
128 1000 67.93 68.19 68.06
256 1000 70.03 70.29 70.16
512 1000 93.31 93.38 93.34
1024 1000 107.91 107.99 107.95
2048 1000 131.16 131.28 131.21
4096 1000 197.88 198.04 197.96
8192 1000 275.00 275.28 275.14
16384 1000 441.70 442.25 441.98
32768 1000 613.08 614.79 614.03
65536 640 926.29 930.25 928.42
131072 320 3519.29 3529.75 3525.09
262144 160 6891.86 6933.28 6915.01
524288 80 13580.38 13713.93 13663.36
1048576 40 26931.58 27235.32 27124.49
2097152 20 62906.50 63227.14 63106.85
4194304 10 124174.10 124932.30 124688.20
#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.09 0.10 0.09
4 1000 5.23 5.34 5.29
8 1000 58.49 58.58 58.54
16 1000 58.78 58.89 58.84
32 1000 59.09 59.11 59.10
64 1000 59.09 59.19 59.14
128 1000 59.67 59.78 59.73
256 1000 60.38 60.49 60.43
512 1000 63.56 63.57 63.57
1024 1000 76.83 76.96 76.90
2048 1000 76.34 76.43 76.38
4096 1000 119.79 119.94 119.86
8192 1000 150.72 150.89 150.81
16384 1000 212.95 213.12 213.04
32768 1000 264.57 264.61 264.59
65536 640 448.86 448.94 448.90
131072 320 925.79 926.04 925.92
262144 160 1775.71 1776.47 1776.09
524288 80 3996.21 3997.42 3996.82
1048576 40 8044.30 8048.25 8046.28
2097152 20 15855.81 15864.80 15860.30
4194304 10 31426.41 31429.20 31427.81
#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.09 0.11 0.10
4 1000 6.40 6.54 6.46
8 1000 108.77 109.67 109.26
16 1000 88.58 88.60 88.59
32 1000 89.35 89.49 89.42
64 1000 89.04 89.09 89.07
128 1000 90.13 90.26 90.20
256 1000 91.28 91.40 91.34
512 1000 92.98 93.01 93.00
1024 1000 96.93 97.06 96.99
2048 1000 102.49 102.55 102.52
4096 1000 117.72 117.74 117.73
8192 1000 155.36 155.39 155.37
16384 1000 210.11 210.21 210.15
32768 1000 341.14 341.24 341.20
65536 640 613.92 614.28 614.10
131072 320 1316.24 1318.14 1317.00
262144 160 2993.06 2999.81 2996.18
524288 80 10236.76 10263.19 10250.47
1048576 40 19968.40 20025.00 19988.14
2097152 20 36968.80 37116.44 37060.85
4194304 10 71565.50 71802.01 71697.75
#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.11 0.10
4 1000 8.27 8.44 8.30
8 1000 107.30 108.24 107.66
16 1000 118.18 119.36 118.97
32 1000 155.24 155.35 155.30
64 1000 156.35 156.44 156.41
128 1000 159.27 159.37 159.33
256 1000 165.38 165.49 165.45
512 1000 178.04 178.15 178.10
1024 1000 196.89 197.00 196.96
2048 1000 222.30 222.44 222.37
4096 1000 296.44 296.59 296.53
8192 1000 405.64 405.82 405.74
16384 1000 599.62 599.80 599.72
32768 1000 807.91 808.27 808.12
65536 640 1431.45 1432.25 1431.97
131072 320 2846.33 2850.40 2848.69
262144 160 5674.20 5685.83 5680.47
524288 80 15673.87 15694.54 15683.68
1048576 40 28812.93 28860.65 28836.81
2097152 20 56938.61 57238.55 57144.57
4194304 10 111578.30 111863.90 111730.25
#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.07 0.07
1 1000 56.08 56.09 56.08
2 1000 56.24 56.25 56.24
4 1000 56.24 56.24 56.24
8 1000 56.20 56.20 56.20
16 1000 56.58 56.58 56.58
32 1000 56.79 56.83 56.81
64 1000 57.66 57.76 57.71
128 1000 59.79 59.81 59.80
256 1000 60.95 60.97 60.96
512 1000 70.37 70.37 70.37
1024 1000 87.49 87.51 87.50
2048 1000 117.40 117.43 117.42
4096 1000 143.95 144.00 143.98
8192 1000 192.81 192.88 192.85
16384 1000 237.72 237.76 237.74
32768 1000 402.67 402.75 402.71
65536 640 1446.47 1447.35 1446.91
131072 320 2605.22 2608.61 2606.91
262144 160 4975.78 4988.57 4982.17
524288 80 9778.49 9800.66 9789.57
1048576 40 19183.30 19222.40 19202.85
2097152 20 21998.40 22007.10 22002.75
4194304 10 45430.11 45524.11 45477.11
#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.08 0.07
1 1000 87.82 87.83 87.83
2 1000 88.15 88.29 88.22
4 1000 89.02 89.04 89.04
8 1000 88.94 88.95 88.94
16 1000 88.72 88.85 88.79
32 1000 88.98 89.00 88.99
64 1000 88.82 88.94 88.88
128 1000 91.05 91.16 91.10
256 1000 94.23 94.25 94.24
512 1000 99.64 99.72 99.68
1024 1000 113.70 113.77 113.73
2048 1000 149.01 149.03 149.02
4096 1000 203.75 203.82 203.78
8192 1000 306.04 306.12 306.08
16384 1000 857.30 857.63 857.47
32768 1000 2175.73 2176.29 2176.02
65536 640 4665.72 4667.57 4666.71
131072 320 7636.61 7643.74 7640.27
262144 160 14901.39 14911.22 14906.46
524288 80 32554.01 32591.05 32571.71
1048576 40 63214.48 63309.85 63260.99
2097152 20 124057.50 124142.15 124101.25
4194304 10 254358.71 254727.49 254547.42
#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.09 0.08
1 1000 112.09 112.11 112.10
2 1000 112.79 112.93 112.86
4 1000 112.11 112.13 112.12
8 1000 112.99 113.10 113.05
16 1000 113.91 114.00 113.96
32 1000 113.64 113.74 113.70
64 1000 115.05 115.20 115.13
128 1000 117.59 117.60 117.60
256 1000 124.49 124.51 124.50
512 1000 135.30 135.39 135.35
1024 1000 159.16 159.24 159.20
2048 1000 202.94 202.97 202.95
4096 1000 282.31 282.37 282.34
8192 1000 450.90 450.97 450.94
16384 1000 2133.57 2134.24 2133.88
32768 1000 5364.82 5366.20 5365.52
65536 640 10654.37 10660.00 10657.02
131072 320 18334.33 18348.24 18343.08
262144 160 36934.48 36955.99 36945.74
524288 80 79600.35 79651.80 79629.44
1048576 40 168354.20 168949.45 168729.36
2097152 20 330295.70 333866.70 332570.89
4194304 10 677119.50 678249.89 677844.40
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.08 0.08 0.08
1 1000 56.83 56.83 56.83
2 1000 57.25 57.25 57.25
4 1000 57.12 57.13 57.12
8 1000 57.53 57.54 57.54
16 1000 57.32 57.32 57.32
32 1000 57.77 57.78 57.78
64 1000 58.34 58.34 58.34
128 1000 59.60 59.71 59.65
256 1000 61.29 61.29 61.29
512 1000 72.24 72.28 72.26
1024 1000 74.20 74.22 74.21
2048 1000 117.53 117.56 117.55
4096 1000 145.66 145.74 145.70
8192 1000 190.26 190.32 190.29
16384 1000 237.51 237.57 237.54
32768 1000 404.17 404.24 404.20
65536 640 1453.08 1453.98 1453.53
131072 320 2610.79 2614.13 2612.46
262144 160 4985.82 4998.32 4992.07
524288 80 9781.66 9803.56 9792.61
1048576 40 19161.82 19205.75 19183.79
2097152 20 22808.19 22835.21 22821.70
4194304 10 42139.10 42210.60 42174.85
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.09 0.10 0.09
1 1000 96.57 96.58 96.57
2 1000 96.52 96.53 96.53
4 1000 97.15 97.17 97.16
8 1000 97.09 97.10 97.09
16 1000 96.82 96.85 96.83
32 1000 97.44 97.45 97.44
64 1000 97.96 97.97 97.96
128 1000 99.97 100.00 99.99
256 1000 103.09 103.11 103.09
512 1000 108.12 108.13 108.13
1024 1000 120.92 121.05 120.99
2048 1000 154.37 154.40 154.39
4096 1000 205.46 205.65 205.55
8192 1000 310.40 310.45 310.43
16384 1000 857.90 858.26 858.08
32768 1000 2160.68 2161.40 2161.01
65536 640 4606.01 4607.84 4606.98
131072 320 7646.55 7653.59 7650.14
262144 160 14910.61 14920.39 14915.69
524288 80 33854.31 33876.09 33863.26
1048576 40 64364.98 64451.33 64412.15
2097152 20 126848.80 128796.95 127805.71
4194304 10 247236.10 254043.19 250641.52
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.09 0.11 0.10
1 1000 124.24 124.32 124.28
2 1000 124.31 124.41 124.36
4 1000 124.11 124.23 124.17
8 1000 127.21 127.32 127.26
16 1000 124.75 124.86 124.80
32 1000 124.86 124.98 124.91
64 1000 126.67 126.75 126.71
128 1000 129.29 129.42 129.35
256 1000 133.60 133.71 133.65
512 1000 143.59 143.68 143.63
1024 1000 166.53 166.64 166.58
2048 1000 209.11 209.14 209.12
4096 1000 286.85 286.91 286.89
8192 1000 442.50 442.62 442.57
16384 1000 2133.62 2134.21 2133.92
32768 1000 5258.54 5261.66 5260.11
65536 640 10555.15 10560.76 10557.79
131072 320 18947.57 18957.10 18951.64
262144 160 36968.54 36991.06 36980.07
524288 80 81273.51 81412.75 81360.72
1048576 40 168894.12 169293.75 169142.50
2097152 20 337660.00 339779.55 339020.63
4194304 10 661611.61 662097.80 661869.35
#----------------------------------------------------------------
# Benchmarking Gather
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.10 0.10
1 1000 51.39 51.39 51.39
2 1000 51.39 51.39 51.39
4 1000 51.31 51.32 51.31
8 1000 51.41 51.41 51.41
16 1000 51.41 51.42 51.42
32 1000 51.50 51.51 51.50
64 1000 51.53 51.53 51.53
128 1000 51.84 51.84 51.84
256 1000 52.75 52.75 52.75
512 1000 55.34 55.34 55.34
1024 1000 72.65 72.67 72.66
2048 1000 88.73 88.77 88.75
4096 1000 114.58 114.64 114.61
8192 1000 149.85 149.94 149.90
16384 1000 224.78 224.94 224.86
32768 1000 386.84 387.15 387.00
65536 640 698.70 699.62 699.16
131072 320 1404.55 1407.06 1405.81
262144 160 2562.17 2573.31 2567.74
524288 80 4778.22 4796.09 4787.16
1048576 40 9228.30 9264.60 9246.45
2097152 20 18474.40 18549.96 18512.18
4194304 10 38051.00 38206.10 38128.55
#----------------------------------------------------------------
# Benchmarking Gather
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.11 0.10
1 1000 59.11 59.24 59.18
2 1000 59.09 59.21 59.15
4 1000 59.19 59.32 59.26
8 1000 59.14 59.27 59.20
16 1000 59.40 59.52 59.46
32 1000 59.50 59.63 59.56
64 1000 59.83 59.95 59.89
128 1000 60.44 60.56 60.50
256 1000 61.32 61.46 61.39
512 1000 64.03 64.18 64.11
1024 1000 88.82 88.99 88.92
2048 1000 111.73 111.94 111.84
4096 1000 129.19 129.49 129.34
8192 1000 306.28 306.62 306.46
16384 1000 369.78 370.14 369.94
32768 1000 605.89 606.49 606.11
65536 640 1150.82 1152.62 1151.39
131072 320 2344.50 2351.19 2347.31
262144 160 4483.49 4508.70 4491.44
524288 80 8857.67 8957.35 8909.91
1048576 40 17362.47 17755.43 17611.33
2097152 20 34057.11 35650.40 35135.20
4194304 10 67594.10 73823.69 72047.07
#----------------------------------------------------------------
# Benchmarking Gather
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.11 0.10
1 1000 66.43 66.70 66.56
2 1000 66.50 66.76 66.63
4 1000 66.18 66.45 66.32
8 1000 66.53 66.82 66.68
16 1000 66.54 66.79 66.66
32 1000 66.69 66.95 66.82
64 1000 66.64 66.93 66.79
128 1000 67.64 67.89 67.77
256 1000 69.07 69.35 69.22
512 1000 73.51 73.83 73.69
1024 1000 102.22 102.59 102.42
2048 1000 102.14 102.54 102.35
4096 1000 122.94 123.45 123.20
8192 1000 453.40 453.96 453.73
16384 1000 552.33 552.96 552.66
32768 1000 854.60 855.51 855.02
65536 640 1639.44 1642.26 1640.66
131072 320 3321.86 3331.42 3326.09
262144 160 6159.62 6209.01 6184.04
524288 80 12751.70 12900.45 12825.65
1048576 40 25408.35 26063.65 25794.48
2097152 20 50488.60 52873.65 51964.08
4194304 10 97687.70 107899.80 104412.90
#----------------------------------------------------------------
# Benchmarking Gatherv
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.17 0.24 0.20
1 1000 50.90 50.90 50.90
2 1000 50.82 50.82 50.82
4 1000 50.87 50.87 50.87
8 1000 51.11 51.12 51.12
16 1000 50.91 50.92 50.91
32 1000 51.08 51.09 51.08
64 1000 51.08 51.08 51.08
128 1000 51.39 51.40 51.39
256 1000 52.68 52.69 52.68
512 1000 55.21 55.21 55.21
1024 1000 72.23 72.26 72.25
2048 1000 88.40 88.44 88.42
4096 1000 114.40 114.47 114.43
8192 1000 147.62 147.71 147.66
16384 1000 229.83 230.01 229.92
32768 1000 394.75 395.08 394.91
65536 640 793.20 794.14 793.67
131072 320 1373.33 1376.91 1375.12
262144 160 2562.08 2575.97 2569.03
524288 80 4939.00 4966.82 4952.91
1048576 40 9636.33 9701.93 9669.13
2097152 20 18982.35 19166.74 19074.55
4194304 10 37639.89 38160.89 37900.39
#----------------------------------------------------------------
# Benchmarking Gatherv
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.19 0.28 0.23
1 1000 59.07 59.19 59.13
2 1000 59.07 59.19 59.13
4 1000 59.07 59.20 59.14
8 1000 59.14 59.28 59.21
16 1000 59.32 59.47 59.39
32 1000 59.62 59.75 59.69
64 1000 59.69 59.83 59.76
128 1000 60.18 60.30 60.24
256 1000 62.54 62.66 62.60
512 1000 63.92 64.04 63.98
1024 1000 88.92 89.12 89.03
2048 1000 110.98 111.21 111.10
4096 1000 129.26 129.54 129.40
8192 1000 177.79 178.20 178.00
16384 1000 260.06 260.63 260.34
32768 1000 441.57 442.66 442.11
65536 640 1517.71 1520.18 1518.97
131072 320 2613.07 2621.69 2617.46
262144 160 4673.97 4706.78 4690.29
524288 80 9081.26 9177.22 9129.95
1048576 40 18023.93 18381.32 18208.55
2097152 20 35495.50 36754.20 36130.33
4194304 10 69192.00 74149.91 72654.75
#----------------------------------------------------------------
# Benchmarking Gatherv
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.29 0.35 0.31
1 1000 66.43 66.67 66.55
2 1000 66.22 66.51 66.37
4 1000 66.34 66.61 66.48
8 1000 66.46 66.73 66.59
16 1000 66.45 66.73 66.60
32 1000 66.54 66.81 66.67
64 1000 66.71 66.96 66.83
128 1000 67.31 67.56 67.43
256 1000 69.30 69.59 69.45
512 1000 74.71 75.02 74.86
1024 1000 101.43 101.80 101.63
2048 1000 101.58 102.02 101.80
4096 1000 122.80 123.35 123.08
8192 1000 183.27 184.07 183.69
16384 1000 270.66 271.84 271.26
32768 1000 472.30 474.50 473.39
65536 640 2236.42 2241.19 2239.17
131072 320 3869.68 3883.21 3877.82
262144 160 6959.24 7021.19 6994.87
524288 80 13343.08 13511.30 13448.85
1048576 40 25979.25 26626.53 26352.78
2097152 20 50793.95 53127.15 52302.12
4194304 10 99074.10 108540.99 104589.62
#----------------------------------------------------------------
# Benchmarking Scatter
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.09 0.10 0.10
1 1000 51.28 51.38 51.33
2 1000 51.41 51.51 51.46
4 1000 51.28 51.37 51.32
8 1000 51.32 51.41 51.36
16 1000 51.33 51.43 51.38
32 1000 51.49 51.59 51.54
64 1000 51.58 51.69 51.64
128 1000 51.96 52.05 52.01
256 1000 52.97 53.07 53.02
512 1000 55.36 55.47 55.42
1024 1000 72.63 72.76 72.70
2048 1000 88.85 88.99 88.92
4096 1000 114.54 114.67 114.61
8192 1000 148.14 148.32 148.23
16384 1000 229.94 230.20 230.07
32768 1000 395.95 396.35 396.15
65536 640 796.74 797.77 797.25
131072 320 1372.33 1375.78 1374.06
262144 160 2536.79 2549.09 2542.94
524288 80 4902.10 4919.89 4910.99
1048576 40 9660.32 9685.03 9672.67
2097152 20 19068.05 19073.70 19070.88
4194304 10 37802.19 37964.51 37883.35
#----------------------------------------------------------------
# Benchmarking Scatter
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.11 0.10
1 1000 59.15 59.25 59.21
2 1000 59.03 59.16 59.10
4 1000 58.98 59.13 59.06
8 1000 59.29 59.40 59.34
16 1000 59.52 59.62 59.57
32 1000 59.68 59.77 59.73
64 1000 60.44 60.54 60.48
128 1000 61.55 61.67 61.60
256 1000 63.00 63.10 63.05
512 1000 76.45 76.57 76.51
1024 1000 90.90 91.04 90.98
2048 1000 115.47 115.62 115.55
4096 1000 141.89 142.05 141.97
8192 1000 215.02 215.26 215.13
16384 1000 331.40 331.75 331.55
32768 1000 570.10 570.72 570.35
65536 640 1438.98 1440.92 1439.90
131072 320 2520.10 2527.04 2523.43
262144 160 4489.33 4513.45 4500.41
524288 80 8796.98 8866.25 8830.10
1048576 40 17076.02 17311.20 17189.12
2097152 20 34601.50 35616.60 35135.71
4194304 10 68154.90 72586.10 71308.10
#----------------------------------------------------------------
# Benchmarking Scatter
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.09 0.10 0.10
1 1000 68.90 69.00 68.95
2 1000 68.73 68.84 68.79
4 1000 68.53 68.63 68.58
8 1000 68.68 68.80 68.74
16 1000 69.01 69.11 69.06
32 1000 69.15 69.23 69.19
64 1000 69.98 70.08 70.03
128 1000 72.16 72.28 72.22
256 1000 77.18 77.28 77.23
512 1000 93.59 93.72 93.66
1024 1000 106.65 106.79 106.72
2048 1000 121.44 121.59 121.53
4096 1000 168.90 169.09 168.99
8192 1000 242.93 243.26 243.08
16384 1000 405.30 405.79 405.51
32768 1000 719.63 720.49 719.96
65536 640 2006.63 2009.71 2008.34
131072 320 3518.24 3526.73 3522.53
262144 160 6480.69 6518.69 6501.17
524288 80 12253.60 12370.57 12325.41
1048576 40 23958.93 24423.23 24229.18
2097152 20 48906.65 50866.90 50190.55
4194304 10 97132.60 106029.30 102348.00
#----------------------------------------------------------------
# Benchmarking Scatterv
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.20 0.29 0.25
1 1000 50.67 50.77 50.72
2 1000 50.66 50.77 50.71
4 1000 50.66 50.76 50.71
8 1000 50.63 50.73 50.68
16 1000 50.70 50.80 50.75
32 1000 50.77 50.88 50.83
64 1000 50.82 50.92 50.87
128 1000 51.02 51.11 51.07
256 1000 52.59 52.70 52.64
512 1000 54.74 54.84 54.79
1024 1000 72.38 72.50 72.44
2048 1000 88.56 88.70 88.63
4096 1000 113.95 114.10 114.03
8192 1000 147.49 147.67 147.58
16384 1000 228.99 229.23 229.11
32768 1000 394.04 394.44 394.24
65536 640 795.18 796.19 795.68
131072 320 1370.79 1374.24 1372.51
262144 160 2533.74 2545.83 2539.79
524288 80 4892.14 4909.89 4901.01
1048576 40 9647.30 9671.65 9659.48
2097152 20 19065.40 19071.75 19068.57
4194304 10 37817.70 37991.21 37904.45
#----------------------------------------------------------------
# Benchmarking Scatterv
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.18 0.30 0.24
1 1000 58.87 59.01 58.95
2 1000 58.76 58.87 58.81
4 1000 59.01 59.13 59.08
8 1000 59.12 59.21 59.17
16 1000 59.31 59.42 59.36
32 1000 59.49 59.61 59.54
64 1000 59.93 60.05 59.99
128 1000 60.99 61.08 61.04
256 1000 62.82 62.94 62.86
512 1000 75.45 75.58 75.51
1024 1000 90.84 90.96 90.90
2048 1000 114.84 114.97 114.91
4096 1000 141.63 141.82 141.72
8192 1000 214.73 214.97 214.84
16384 1000 331.67 332.02 331.82
32768 1000 569.55 570.19 569.81
65536 640 1440.84 1442.76 1441.73
131072 320 2493.45 2500.49 2496.84
262144 160 4468.91 4493.01 4480.08
524288 80 8749.84 8820.40 8783.93
1048576 40 17369.92 17605.18 17482.66
2097152 20 34087.25 35047.30 34580.78
4194304 10 67819.69 71969.30 70830.50
#----------------------------------------------------------------
# Benchmarking Scatterv
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.22 0.30 0.26
1 1000 68.89 69.00 68.93
2 1000 69.03 69.12 69.07
4 1000 68.80 68.90 68.85
8 1000 68.93 69.06 69.00
16 1000 69.11 69.23 69.18
32 1000 69.31 69.42 69.37
64 1000 70.36 70.51 70.42
128 1000 72.50 72.61 72.55
256 1000 77.07 77.17 77.12
512 1000 93.53 93.62 93.57
1024 1000 106.69 106.83 106.76
2048 1000 121.62 121.78 121.70
4096 1000 168.83 169.03 168.92
8192 1000 243.39 243.69 243.52
16384 1000 403.47 403.94 403.65
32768 1000 719.37 720.18 719.67
65536 640 1995.16 1998.14 1996.82
131072 320 3518.75 3527.07 3522.89
262144 160 6404.10 6442.39 6424.76
524288 80 12260.99 12377.75 12333.69
1048576 40 23940.70 24413.50 24215.43
2097152 20 48474.05 50554.25 49820.27
4194304 10 96856.90 105808.69 102109.13
#----------------------------------------------------------------
# Benchmarking Alltoall
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.07 0.07
1 1000 57.04 57.05 57.04
2 1000 57.28 57.28 57.28
4 1000 56.96 57.07 57.01
8 1000 57.28 57.29 57.28
16 1000 57.38 57.40 57.39
32 1000 57.73 57.73 57.73
64 1000 58.59 58.60 58.60
128 1000 59.53 59.64 59.58
256 1000 61.02 61.02 61.02
512 1000 71.12 71.12 71.12
1024 1000 85.02 85.03 85.03
2048 1000 116.69 116.71 116.70
4096 1000 143.16 143.21 143.19
8192 1000 193.30 193.37 193.34
16384 1000 239.48 239.53 239.51
32768 1000 401.82 401.87 401.85
65536 640 1461.20 1462.08 1461.64
131072 320 2623.29 2626.65 2624.97
262144 160 5016.91 5029.31 5023.11
524288 80 9785.61 9806.36 9795.99
1048576 40 19167.68 19207.65 19187.66
2097152 20 22888.15 22891.75 22889.95
4194304 10 42193.20 42255.30 42224.25
#----------------------------------------------------------------
# Benchmarking Alltoall
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.07 0.07
1 1000 90.17 90.19 90.18
2 1000 90.96 90.98 90.97
4 1000 89.44 89.57 89.50
8 1000 91.55 91.57 91.56
16 1000 91.66 91.67 91.67
32 1000 92.00 92.03 92.02
64 1000 94.18 94.20 94.19
128 1000 96.38 96.39 96.38
256 1000 102.49 102.51 102.50
512 1000 104.74 104.77 104.75
1024 1000 118.39 118.41 118.40
2048 1000 146.64 146.79 146.71
4096 1000 312.00 312.12 312.07
8192 1000 451.75 451.80 451.78
16384 1000 714.69 714.82 714.76
32768 1000 1345.37 1345.71 1345.55
65536 640 3818.38 3820.41 3819.42
131072 320 6921.78 6927.78 6924.31
262144 160 12214.65 12223.44 12218.59
524288 80 23369.66 23379.84 23374.18
1048576 40 45019.30 45070.70 45047.59
2097152 20 88316.00 88496.95 88444.01
4194304 10 175043.11 175613.30 175337.05
#----------------------------------------------------------------
# Benchmarking Alltoall
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.07 0.07
1 1000 119.77 119.85 119.81
2 1000 119.41 119.45 119.42
4 1000 117.80 117.85 117.84
8 1000 117.97 118.09 118.03
16 1000 119.12 119.24 119.16
32 1000 118.87 118.95 118.91
64 1000 119.77 119.89 119.83
128 1000 123.35 123.44 123.42
256 1000 139.23 139.30 139.28
512 1000 142.96 142.99 142.97
1024 1000 176.76 176.81 176.78
2048 1000 244.77 244.88 244.83
4096 1000 541.57 541.63 541.59
8192 1000 808.14 808.28 808.21
16384 1000 1379.83 1380.16 1379.97
32768 1000 2643.54 2644.26 2643.90
65536 640 7565.47 7568.20 7566.81
131072 320 14641.30 14652.53 14647.36
262144 160 27768.28 27789.86 27777.06
524288 80 55881.58 55886.25 55883.87
1048576 40 109620.65 109781.40 109735.88
2097152 20 226124.60 226271.10 226227.77
4194304 10 422511.30 423497.11 423021.30
#----------------------------------------------------------------
# Benchmarking Alltoallv
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.14 0.14 0.14
1 1000 57.18 57.18 57.18
2 1000 57.15 57.15 57.15
4 1000 57.24 57.26 57.25
8 1000 57.37 57.37 57.37
16 1000 57.67 57.70 57.69
32 1000 57.82 57.84 57.83
64 1000 58.47 58.48 58.47
128 1000 60.03 60.04 60.03
256 1000 61.08 61.09 61.09
512 1000 72.28 72.30 72.29
1024 1000 87.12 87.13 87.13
2048 1000 117.67 117.70 117.68
4096 1000 144.18 144.24 144.21
8192 1000 192.79 192.85 192.82
16384 1000 239.19 239.21 239.20
32768 1000 406.28 406.36 406.32
65536 640 1453.75 1454.63 1454.19
131072 320 1485.30 1485.45 1485.37
262144 160 2821.49 2822.21 2821.85
524288 80 5454.79 5455.86 5455.33
1048576 40 10688.10 10697.32 10692.71
2097152 20 21159.40 21166.25 21162.83
4194304 10 42310.01 42383.71 42346.86
#----------------------------------------------------------------
# Benchmarking Alltoallv
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.15 0.16 0.15
1 1000 85.82 85.87 85.85
2 1000 86.80 86.89 86.85
4 1000 86.50 86.61 86.55
8 1000 86.92 87.03 86.98
16 1000 87.27 87.35 87.31
32 1000 87.71 87.79 87.75
64 1000 88.30 88.40 88.35
128 1000 91.09 91.13 91.11
256 1000 94.46 94.55 94.51
512 1000 100.83 100.92 100.87
1024 1000 117.67 117.76 117.71
2048 1000 148.66 148.75 148.70
4096 1000 199.70 199.83 199.77
8192 1000 318.39 318.52 318.47
16384 1000 572.54 572.90 572.75
32768 1000 1138.89 1139.25 1139.08
65536 640 3032.84 3034.21 3033.49
131072 320 5881.38 5892.30 5887.70
262144 160 16696.20 16746.91 16721.81
524288 80 32174.21 32284.28 32240.24
1048576 40 68246.48 68485.40 68387.65
2097152 20 95430.05 95787.05 95628.55
4194304 10 159342.49 159923.59 159633.42
#----------------------------------------------------------------
# Benchmarking Alltoallv
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.17 0.18 0.17
1 1000 122.79 122.82 122.81
2 1000 122.63 122.67 122.65
4 1000 121.91 122.10 122.01
8 1000 123.48 123.53 123.51
16 1000 124.83 124.87 124.86
32 1000 123.94 124.08 124.01
64 1000 125.92 125.99 125.96
128 1000 129.53 129.63 129.58
256 1000 141.54 141.59 141.56
512 1000 148.23 148.30 148.26
1024 1000 167.44 167.49 167.45
2048 1000 243.29 243.32 243.30
4096 1000 340.28 340.50 340.41
8192 1000 643.97 644.21 644.09
16384 1000 1287.60 1288.11 1287.83
32768 1000 2585.31 2586.42 2585.82
65536 640 5949.87 5954.05 5952.34
131072 320 11770.69 11790.63 11781.70
262144 160 36469.79 36590.22 36534.72
524288 80 75128.57 75365.21 75266.78
1048576 40 152921.60 153425.32 153221.98
2097152 20 217665.50 217883.40 217776.84
4194304 10 376332.91 396450.71 392562.15
#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.09 0.08
1 1000 51.60 51.70 51.65
2 1000 51.60 51.71 51.66
4 1000 51.63 51.73 51.68
8 1000 51.62 51.72 51.67
16 1000 51.56 51.66 51.61
32 1000 51.62 51.72 51.67
64 1000 51.69 51.79 51.74
128 1000 52.02 52.13 52.08
256 1000 52.60 52.70 52.65
512 1000 55.28 55.37 55.32
1024 1000 72.48 72.61 72.55
2048 1000 87.06 87.19 87.12
4096 1000 117.03 117.17 117.10
8192 1000 144.74 144.90 144.82
16384 1000 227.05 227.23 227.14
32768 1000 372.77 373.11 372.94
65536 640 660.64 661.44 661.04
131072 320 1411.97 1414.08 1413.02
262144 160 2576.20 2583.42 2579.81
524288 80 5265.41 5279.90 5272.66
1048576 40 10492.00 10521.35 10506.68
2097152 20 20904.55 20961.70 20933.13
4194304 10 41707.11 41822.11 41764.61
#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.08 0.09 0.08
1 1000 57.77 57.87 57.82
2 1000 57.93 58.05 58.00
4 1000 57.90 58.02 57.96
8 1000 57.96 58.11 58.04
16 1000 57.91 57.95 57.93
32 1000 58.07 58.21 58.13
64 1000 58.45 58.56 58.50
128 1000 59.62 59.77 59.70
256 1000 65.27 65.39 65.33
512 1000 72.08 72.26 72.18
1024 1000 82.99 83.14 83.06
2048 1000 129.79 129.96 129.91
4096 1000 148.97 149.17 149.10
8192 1000 206.37 206.61 206.52
16384 1000 340.51 340.80 340.71
32768 1000 598.34 598.76 598.60
65536 640 1124.13 1125.21 1124.77
131072 320 2874.58 2880.39 2877.57
262144 160 5371.42 5393.14 5383.24
524288 80 9229.81 9313.55 9277.36
1048576 40 17966.33 18182.45 18086.43
2097152 20 37258.75 37548.05 37429.03
4194304 10 73024.49 73599.79 73362.07
#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.10 0.08
1 1000 63.51 63.63 63.57
2 1000 62.75 62.87 62.81
4 1000 62.57 62.72 62.64
8 1000 62.66 62.79 62.72
16 1000 63.04 63.19 63.11
32 1000 63.19 63.34 63.26
64 1000 64.20 64.34 64.27
128 1000 67.15 67.30 67.23
256 1000 73.85 73.99 73.92
512 1000 79.65 79.81 79.73
1024 1000 90.73 90.88 90.81
2048 1000 140.90 141.06 141.02
4096 1000 179.48 179.72 179.65
8192 1000 266.56 266.86 266.78
16384 1000 444.57 445.01 444.89
32768 1000 779.12 779.90 779.63
65536 640 1458.74 1460.79 1460.09
131072 320 3731.42 3741.65 3737.59
262144 160 7662.66 7691.72 7679.68
524288 80 13532.62 13665.15 13615.63
1048576 40 26478.07 26852.15 26717.24
2097152 20 52535.80 53741.15 53317.91
4194304 10 111857.09 113262.41 112738.45
#---------------------------------------------------
# Benchmarking Barrier
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#repetitions t_min[usec] t_max[usec] t_avg[usec]
1000 56.41 56.41 56.41
#---------------------------------------------------
# Benchmarking Barrier
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#repetitions t_min[usec] t_max[usec] t_avg[usec]
1000 88.00 88.00 88.00
#---------------------------------------------------
# Benchmarking Barrier
# #processes = 6
#---------------------------------------------------
#repetitions t_min[usec] t_max[usec] t_avg[usec]
1000 117.00 117.02 117.01
# All processes entering MPI_Finalize
#---------------------------------------------------
# Intel (R) MPI Benchmark Suite V3.2, MPI-1 part
#---------------------------------------------------
# Date : Thu Sep 1 10:24:31 2011
# Machine : i686
# System : Linux
# Release : 2.6.32-24-generic-pae
# Version : #39-Ubuntu SMP Wed Jul 28 07:39:26 UTC 2010
# MPI Version : 2.1
# MPI Thread Environment: MPI_THREAD_SINGLE
# New default behavior from Version 3.2 on:
# the number of iterations per message size is cut down
# dynamically when a certain run time (per message size sample)
# is expected to be exceeded. Time limit is defined by variable
# "SECS_PER_SAMPLE" (=> IMB_settings.h)
# or through the flag => -time
# Calling sequence was:
# ./IMB-MPI1
# Minimum message length in bytes: 0
# Maximum message length in bytes: 4194304
#
# MPI_Datatype : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op : MPI_SUM
#
#
# List of Benchmarks to run:
# PingPong
# PingPing
# Sendrecv
# Exchange
# Allreduce
# Reduce
# Reduce_scatter
# Allgather
# Allgatherv
# Gather
# Gatherv
# Scatter
# Scatterv
# Alltoall
# Alltoallv
# Bcast
# Barrier
#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#bytes #repetitions t[usec] Mbytes/sec
0 1000 51.41 0.00
1 1000 55.07 0.02
2 1000 54.87 0.03
4 1000 55.28 0.07
8 1000 54.95 0.14
16 1000 55.16 0.28
32 1000 55.53 0.55
64 1000 56.87 1.07
128 1000 57.67 2.12
256 1000 60.32 4.05
512 1000 64.34 7.59
1024 1000 73.57 13.27
2048 1000 84.52 23.11
4096 1000 105.72 36.95
8192 1000 137.14 56.97
16384 1000 213.70 73.12
32768 1000 363.57 85.95
65536 640 754.12 82.88
131072 320 1309.91 95.43
262144 160 2413.36 103.59
524288 80 4614.26 108.36
1048576 40 9013.52 110.94
2097152 20 17809.52 112.30
4194304 10 35406.00 112.98
#---------------------------------------------------
# Benchmarking PingPing
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#bytes #repetitions t[usec] Mbytes/sec
0 1000 63.73 0.00
1 1000 57.10 0.02
2 1000 57.66 0.03
4 1000 57.36 0.07
8 1000 57.18 0.13
16 1000 57.55 0.27
32 1000 57.56 0.53
64 1000 59.12 1.03
128 1000 59.75 2.04
256 1000 62.98 3.88
512 1000 65.06 7.51
1024 1000 72.83 13.41
2048 1000 99.49 19.63
4096 1000 106.02 36.84
8192 1000 157.73 49.53
16384 1000 219.59 71.16
32768 1000 364.25 85.79
65536 640 763.40 81.87
131072 320 1319.38 94.74
262144 160 2424.62 103.11
524288 80 4631.57 107.95
1048576 40 9035.73 110.67
2097152 20 17879.45 111.86
4194304 10 35426.10 112.91
#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 64.00 64.06 64.03 0.00
1 1000 57.84 57.84 57.84 0.03
2 1000 58.38 58.39 58.39 0.07
4 1000 57.89 57.90 57.90 0.13
8 1000 57.72 57.77 57.74 0.26
16 1000 57.72 57.73 57.72 0.53
32 1000 58.01 58.04 58.02 1.05
64 1000 59.23 59.24 59.24 2.06
128 1000 60.32 60.35 60.34 4.05
256 1000 63.27 63.32 63.29 7.71
512 1000 64.73 64.74 64.74 15.09
1024 1000 73.15 73.16 73.16 26.70
2048 1000 100.22 100.23 100.23 38.97
4096 1000 106.00 106.02 106.01 73.69
8192 1000 158.07 158.09 158.08 98.84
16384 1000 220.15 220.18 220.17 141.93
32768 1000 364.38 364.38 364.38 171.52
65536 640 763.34 763.39 763.36 163.74
131072 320 1319.17 1319.39 1319.28 189.48
262144 160 2425.20 2425.63 2425.41 206.13
524288 80 4630.47 4631.35 4630.91 215.92
1048576 40 9033.75 9035.10 9034.43 221.36
2097152 20 17831.65 17833.95 17832.80 224.29
4194304 10 35420.71 35425.79 35423.25 225.82
#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 70.44 70.54 70.49 0.00
1 1000 67.79 67.82 67.81 0.03
2 1000 67.95 68.09 68.01 0.06
4 1000 68.42 68.44 68.43 0.11
8 1000 68.46 68.49 68.47 0.22
16 1000 68.33 68.36 68.34 0.45
32 1000 69.06 69.08 69.08 0.88
64 1000 69.36 69.50 69.43 1.76
128 1000 70.97 70.99 70.98 3.44
256 1000 74.37 74.50 74.44 6.55
512 1000 78.64 78.66 78.65 12.41
1024 1000 83.53 83.66 83.59 23.35
2048 1000 116.91 117.26 117.09 33.31
4096 1000 108.72 108.86 108.79 71.77
8192 1000 166.43 166.50 166.46 93.85
16384 1000 276.75 276.90 276.81 112.86
32768 1000 549.71 550.11 549.93 113.61
65536 640 1157.90 1158.35 1158.14 107.91
131072 320 2339.19 2341.18 2339.97 106.78
262144 160 4453.19 4463.61 4458.93 112.02
524288 80 8817.12 8855.05 8838.51 112.93
1048576 40 17565.17 17691.85 17628.31 113.05
2097152 20 34809.15 35406.90 35118.71 112.97
4194304 10 68562.40 70870.10 69710.72 112.88
#-----------------------------------------------------------------------------
# Benchmarking Sendrecv
# #processes = 6
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 64.64 64.76 64.68 0.00
1 1000 82.97 83.29 83.13 0.02
2 1000 87.65 88.06 87.87 0.04
4 1000 88.91 89.31 89.13 0.09
8 1000 70.93 70.95 70.94 0.22
16 1000 85.12 85.42 85.28 0.36
32 1000 72.13 72.16 72.15 0.85
64 1000 73.18 73.19 73.18 1.67
128 1000 73.88 73.93 73.90 3.30
256 1000 78.56 78.67 78.62 6.21
512 1000 87.29 87.52 87.39 11.16
1024 1000 91.23 91.42 91.32 21.36
2048 1000 122.05 122.34 122.25 31.93
4096 1000 117.10 117.28 117.19 66.61
8192 1000 176.67 177.04 176.87 88.26
16384 1000 286.89 287.05 286.96 108.86
32768 1000 553.29 553.62 553.49 112.89
65536 640 1230.45 1231.94 1231.24 101.47
131072 320 2388.02 2394.78 2391.89 104.39
262144 160 4640.99 4652.48 4649.23 107.47
524288 80 9118.17 9198.21 9168.40 108.72
1048576 40 17832.53 18169.35 18079.06 110.08
2097152 20 34985.30 36198.40 35815.23 110.50
4194304 10 67432.29 71931.11 70601.57 111.22
#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 67.05 67.07 67.06 0.00
1 1000 75.37 75.40 75.39 0.05
2 1000 75.70 75.71 75.70 0.10
4 1000 74.82 74.89 74.86 0.20
8 1000 75.89 75.90 75.90 0.40
16 1000 75.59 75.60 75.59 0.81
32 1000 74.95 75.06 75.00 1.63
64 1000 77.25 77.26 77.26 3.16
128 1000 78.66 78.67 78.66 6.21
256 1000 82.30 82.30 82.30 11.87
512 1000 88.94 88.95 88.95 21.96
1024 1000 100.49 100.63 100.56 38.82
2048 1000 120.06 120.06 120.06 65.07
4096 1000 145.38 145.47 145.42 107.41
8192 1000 219.07 219.14 219.10 142.61
16384 1000 374.56 374.59 374.57 166.85
32768 1000 645.63 645.71 645.67 193.59
65536 640 1729.46 1730.03 1729.74 144.51
131072 320 3332.87 3334.10 3333.48 149.97
262144 160 6618.56 6621.07 6619.82 151.03
524288 80 13325.61 13328.44 13327.02 150.06
1048576 40 22026.03 22032.05 22029.04 181.55
2097152 20 42507.70 42511.10 42509.40 188.19
4194304 10 105635.69 105663.70 105649.70 151.42
#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 77.15 77.19 77.17 0.00
1 1000 88.58 88.60 88.59 0.04
2 1000 88.49 88.51 88.50 0.09
4 1000 89.81 89.84 89.82 0.17
8 1000 89.07 89.11 89.10 0.34
16 1000 90.22 90.39 90.31 0.68
32 1000 90.33 90.37 90.36 1.35
64 1000 91.46 91.48 91.47 2.67
128 1000 93.62 93.65 93.64 5.21
256 1000 99.31 99.33 99.32 9.83
512 1000 105.04 105.09 105.08 18.59
1024 1000 115.79 115.91 115.85 33.70
2048 1000 135.66 135.73 135.70 57.56
4096 1000 168.06 168.10 168.08 92.95
8192 1000 280.06 280.14 280.11 111.55
16384 1000 553.83 554.05 553.95 112.81
32768 1000 1100.63 1100.98 1100.85 113.53
65536 640 2415.04 2416.43 2415.79 103.46
131072 320 5242.45 5247.50 5245.24 95.28
262144 160 10153.71 10166.51 10162.33 98.36
524288 80 19813.74 19883.29 19851.69 100.59
1048576 40 39709.90 39815.58 39762.36 100.46
2097152 20 81658.25 82536.35 82103.56 96.93
4194304 10 158092.89 161604.10 160119.10 99.01
#-----------------------------------------------------------------------------
# Benchmarking Exchange
# #processes = 6
#-----------------------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
0 1000 86.82 86.87 86.84 0.00
1 1000 100.05 100.11 100.09 0.04
2 1000 100.19 100.23 100.21 0.08
4 1000 101.18 101.22 101.21 0.15
8 1000 100.71 100.74 100.72 0.30
16 1000 99.99 100.01 100.00 0.61
32 1000 101.17 101.20 101.19 1.21
64 1000 102.42 102.47 102.44 2.38
128 1000 104.01 104.08 104.05 4.69
256 1000 110.69 110.75 110.73 8.82
512 1000 116.91 116.97 116.94 16.70
1024 1000 125.25 125.38 125.33 31.15
2048 1000 160.65 160.76 160.71 48.60
4096 1000 184.73 184.87 184.82 84.52
8192 1000 285.79 285.91 285.86 109.30
16384 1000 558.17 558.61 558.44 111.88
32768 1000 1110.92 1112.13 1111.77 112.40
65536 640 2471.67 2472.76 2472.28 101.10
131072 320 5139.48 5143.32 5141.05 97.21
262144 160 11331.59 11371.48 11357.14 87.94
524288 80 23774.21 23949.52 23874.19 83.51
1048576 40 46545.22 47236.95 46934.54 84.68
2097152 20 92496.15 95121.25 94389.51 84.10
4194304 10 183384.01 196924.10 191677.32 81.25
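
A note on reading the Mbytes/sec columns in the PingPong, PingPing, Sendrecv and Exchange tables above: the values appear to follow from the message size and the reported time. Below is a minimal sketch of that arithmetic. The per-benchmark message counts (1, 2, 4), the use of t_max for Sendrecv and Exchange, and 1 Mbyte = 2**20 bytes are assumptions inferred from the numbers in this output, not taken from the IMB source. The names `MESSAGES_PER_SAMPLE` and `mbytes_per_sec` are made up for illustration.

```python
MIB = 1024 * 1024  # assumption: the tables only match if 1 Mbyte = 2**20 bytes

# Assumed number of messages counted per sample (inferred from the tables above).
MESSAGES_PER_SAMPLE = {
    "PingPong": 1,
    "PingPing": 1,
    "Sendrecv": 2,
    "Exchange": 4,
}


def mbytes_per_sec(benchmark, nbytes, t_usec):
    """Return throughput in Mbytes/sec for one table row.

    t_usec is the t[usec] column for PingPong/PingPing and the
    t_max[usec] column for Sendrecv/Exchange.
    """
    if t_usec <= 0:
        raise ValueError("time must be positive")
    seconds = t_usec * 1e-6
    return MESSAGES_PER_SAMPLE[benchmark] * nbytes / MIB / seconds


if __name__ == "__main__":
    # Rows copied from the 2-process tables above (largest message size).
    rows = [
        ("PingPong", 4194304, 35406.00),
        ("PingPing", 4194304, 35426.10),
        ("Sendrecv", 4194304, 35425.79),
        ("Exchange", 4194304, 105663.70),
    ]
    for name, nbytes, t in rows:
        print(f"{name:9s} {nbytes:8d} bytes  {mbytes_per_sec(name, nbytes, t):8.2f} Mbytes/sec")
```

With these assumptions the computed values agree with the printed columns to two decimals for the rows checked, which suggests the 2-process PingPong figure of about 113 Mbytes/sec is close to what a 1 Gbit/s link can carry.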
#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.08 0.09 0.09
4 1000 59.21 59.24 59.22
8 1000 59.84 59.85 59.84
16 1000 59.29 59.30 59.29
32 1000 59.51 59.57 59.54
64 1000 61.23 61.33 61.28
128 1000 63.84 63.85 63.85
256 1000 65.96 66.04 66.00
512 1000 68.30 68.31 68.30
1024 1000 75.50 75.52 75.51
2048 1000 104.86 104.87 104.87
4096 1000 111.88 112.00 111.94
8192 1000 167.55 167.64 167.59
16384 1000 330.70 330.71 330.70
32768 1000 465.07 465.24 465.15
65536 640 780.63 780.65 780.64
131072 320 1647.08 1647.26 1647.17
262144 160 2936.92 2937.23 2937.08
524288 80 5571.96 5572.81 5572.39
1048576 40 11066.30 11067.68 11066.99
2097152 20 21585.00 21591.05 21588.03
4194304 10 42703.10 42712.41 42707.76
#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.09 0.10 0.09
4 1000 91.71 91.73 91.72
8 1000 91.51 91.56 91.54
16 1000 91.78 91.83 91.81
32 1000 92.53 92.56 92.54
64 1000 93.74 93.78 93.77
128 1000 95.87 95.88 95.87
256 1000 101.58 101.61 101.60
512 1000 106.26 106.27 106.26
1024 1000 116.49 116.65 116.57
2048 1000 139.59 139.61 139.60
4096 1000 165.23 165.30 165.26
8192 1000 251.07 251.14 251.11
16384 1000 682.13 682.24 682.19
32768 1000 1031.56 1031.88 1031.73
65536 640 1661.61 1661.87 1661.74
131072 320 3299.65 3300.88 3300.34
262144 160 6961.82 6963.59 6962.84
524288 80 14207.14 14216.59 14211.55
1048576 40 27506.08 27545.72 27527.58
2097152 20 54197.60 54347.20 54282.43
4194304 10 109518.91 109830.90 109690.93
#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.08 0.09 0.09
4 1000 161.78 161.92 161.85
8 1000 162.34 162.47 162.41
16 1000 162.57 162.71 162.64
32 1000 163.12 163.24 163.18
64 1000 165.68 165.84 165.76
128 1000 168.10 168.26 168.18
256 1000 175.94 176.07 176.01
512 1000 188.23 188.40 188.31
1024 1000 214.99 215.13 215.06
2048 1000 251.40 251.55 251.48
4096 1000 342.94 343.18 343.05
8192 1000 527.98 528.30 528.12
16384 1000 1170.27 1170.48 1170.39
32768 1000 1399.54 1399.71 1399.63
65536 640 2145.77 2146.25 2146.03
131072 320 3810.44 3811.28 3810.82
262144 160 7524.33 7526.99 7525.76
524288 80 16787.28 16795.20 16791.61
1048576 40 33385.72 33421.95 33404.94
2097152 20 66735.60 66968.39 66898.74
4194304 10 131373.39 132278.31 132021.34
#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.11 0.11
4 1000 57.44 57.44 57.44
8 1000 57.84 57.84 57.84
16 1000 58.27 58.27 58.27
32 1000 58.65 58.66 58.65
64 1000 59.78 59.78 59.78
128 1000 60.82 60.83 60.82
256 1000 62.45 62.45 62.45
512 1000 68.08 68.08 68.08
1024 1000 77.34 77.35 77.34
2048 1000 88.47 88.50 88.48
4096 1000 110.36 110.37 110.37
8192 1000 144.02 144.10 144.06
16384 1000 225.35 225.51 225.43
32768 1000 383.14 383.42 383.28
65536 640 663.51 664.17 663.84
131072 320 1484.58 1485.62 1485.10
262144 160 2852.07 2853.96 2853.02
524288 80 5570.24 5574.27 5572.26
1048576 40 10992.05 10999.83 10995.94
2097152 20 21848.45 21863.95 21856.20
4194304 10 43588.31 43619.11 43603.71
#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.11 0.11
4 1000 55.02 55.13 55.08
8 1000 55.59 55.72 55.66
16 1000 55.52 55.64 55.59
32 1000 55.84 55.96 55.90
64 1000 56.75 56.86 56.80
128 1000 58.33 58.44 58.39
256 1000 60.58 60.71 60.64
512 1000 81.55 81.58 81.57
1024 1000 93.59 93.61 93.60
2048 1000 110.21 110.26 110.24
4096 1000 156.03 156.13 156.08
8192 1000 236.94 237.12 237.03
16384 1000 385.22 385.54 385.37
32768 1000 546.02 546.96 546.56
65536 640 911.35 913.80 912.79
131072 320 2316.72 2321.62 2319.24
262144 160 4517.96 4528.42 4523.19
524288 80 8902.59 8939.15 8921.42
1048576 40 18941.92 19016.77 18990.31
2097152 20 36844.40 36993.85 36943.16
4194304 10 72863.89 73105.30 73018.32
#----------------------------------------------------------------
# Benchmarking Reduce
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.11 0.11
4 1000 63.46 63.70 63.58
8 1000 63.86 64.12 64.00
16 1000 64.07 64.32 64.20
32 1000 64.41 64.64 64.52
64 1000 65.08 65.32 65.20
128 1000 65.43 65.67 65.55
256 1000 67.62 67.89 67.76
512 1000 85.81 85.87 85.84
1024 1000 97.49 97.57 97.53
2048 1000 109.82 109.91 109.86
4096 1000 146.89 147.00 146.95
8192 1000 213.46 213.70 213.61
16384 1000 340.91 341.33 341.16
32768 1000 509.37 510.76 510.11
65536 640 832.18 835.57 834.01
131072 320 3056.25 3063.21 3059.98
262144 160 5150.13 5164.76 5157.64
524288 80 9677.89 9748.85 9717.34
1048576 40 18482.80 18583.87 18532.20
2097152 20 40818.60 41141.05 41030.19
4194304 10 78643.79 79151.70 78977.43
#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.09 0.10 0.09
4 1000 10.18 10.32 10.25
8 1000 60.96 61.06 61.01
16 1000 60.84 60.93 60.88
32 1000 60.55 60.65 60.60
64 1000 60.97 61.07 61.02
128 1000 63.65 63.74 63.70
256 1000 63.95 63.95 63.95
512 1000 67.71 67.75 67.73
1024 1000 70.56 70.67 70.62
2048 1000 76.80 76.91 76.85
4096 1000 107.26 107.42 107.34
8192 1000 115.11 115.25 115.18
16384 1000 174.02 174.18 174.10
32768 1000 252.34 252.38 252.36
65536 640 426.26 426.45 426.35
131072 320 908.50 908.72 908.61
262144 160 1712.25 1712.55 1712.40
524288 80 3892.74 3893.14 3892.94
1048576 40 7993.58 7995.55 7994.56
2097152 20 15661.15 15668.41 15664.78
4194304 10 30506.39 30509.30 30507.85
#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.09 0.10 0.10
4 1000 22.80 22.98 22.86
8 1000 130.08 131.06 130.60
16 1000 92.26 92.41 92.33
32 1000 92.56 92.71 92.63
64 1000 93.33 93.48 93.40
128 1000 94.00 94.18 94.09
256 1000 95.24 95.39 95.31
512 1000 97.05 97.19 97.11
1024 1000 102.39 102.52 102.46
2048 1000 110.71 110.86 110.78
4096 1000 122.89 123.04 122.96
8192 1000 149.22 149.26 149.24
16384 1000 189.11 189.28 189.19
32768 1000 304.40 304.55 304.46
65536 640 501.02 501.29 501.16
131072 320 1069.78 1070.95 1070.39
262144 160 2122.43 2124.39 2123.74
524288 80 8169.81 8173.39 8171.11
1048576 40 15744.27 15773.32 15757.41
2097152 20 30848.00 30983.25 30921.04
4194304 10 61837.40 62227.09 62030.67
#----------------------------------------------------------------
# Benchmarking Reduce_scatter
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.12 0.11
4 1000 2.96 39.64 33.50
8 1000 115.42 146.67 140.79
16 1000 119.91 138.67 135.12
32 1000 151.63 151.78 151.71
64 1000 153.00 153.16 153.08
128 1000 155.00 155.14 155.08
256 1000 159.93 160.11 160.04
512 1000 167.81 167.98 167.92
1024 1000 186.56 186.71 186.66
2048 1000 211.69 211.84 211.78
4096 1000 252.82 252.96 252.90
8192 1000 367.01 367.17 367.10
16384 1000 590.72 591.14 590.94
32768 1000 714.52 714.72 714.62
65536 640 1096.66 1097.22 1096.97
131072 320 1933.53 1934.14 1933.86
262144 160 3880.41 3883.89 3882.46
524288 80 9389.45 9400.90 9395.93
1048576 40 18916.55 18969.05 18947.56
2097152 20 38180.00 38352.10 38302.32
4194304 10 75242.91 76074.30 75827.13
#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.09 0.08
1 1000 58.75 58.84 58.80
2 1000 58.70 58.74 58.72
4 1000 59.02 59.10 59.06
8 1000 59.34 59.41 59.38
16 1000 59.09 59.09 59.09
32 1000 60.05 60.14 60.09
64 1000 61.25 61.29 61.27
128 1000 61.92 61.92 61.92
256 1000 64.75 64.85 64.80
512 1000 67.27 67.38 67.32
1024 1000 74.56 74.67 74.61
2048 1000 103.04 103.06 103.05
4096 1000 108.96 108.97 108.96
8192 1000 162.35 162.38 162.37
16384 1000 227.32 227.35 227.33
32768 1000 379.68 379.77 379.72
65536 640 794.55 794.65 794.60
131072 320 1390.47 1390.68 1390.57
262144 160 2622.39 2622.63 2622.51
524288 80 5134.37 5135.34 5134.86
1048576 40 10093.73 10098.30 10096.01
2097152 20 19952.25 19974.90 19963.58
4194304 10 39757.29 39804.31 39780.80
#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.09 0.08
1 1000 91.85 91.88 91.87
2 1000 91.93 91.96 91.95
4 1000 91.01 91.13 91.06
8 1000 91.77 91.79 91.78
16 1000 91.76 91.89 91.82
32 1000 92.52 92.55 92.54
64 1000 94.41 94.44 94.43
128 1000 95.72 95.75 95.74
256 1000 101.04 101.08 101.06
512 1000 106.36 106.54 106.46
1024 1000 118.95 118.96 118.95
2048 1000 141.26 141.43 141.34
4096 1000 171.76 171.91 171.83
8192 1000 259.60 259.66 259.63
16384 1000 826.73 827.04 826.85
32768 1000 1658.18 1658.49 1658.34
65536 640 3509.23 3510.10 3509.73
131072 320 7083.19 7086.03 7084.55
262144 160 13909.96 13926.46 13918.20
524288 80 27441.36 27503.15 27472.14
1048576 40 54294.30 54553.68 54424.61
2097152 20 107526.15 108426.86 107976.05
4194304 10 212957.80 216152.80 214553.63
#----------------------------------------------------------------
# Benchmarking Allgather
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.09 0.08
1 1000 120.64 120.67 120.65
2 1000 120.69 120.71 120.70
4 1000 120.49 120.53 120.51
8 1000 121.08 121.12 121.11
16 1000 121.01 121.05 121.04
32 1000 122.04 122.19 122.12
64 1000 123.61 123.67 123.64
128 1000 124.65 124.82 124.74
256 1000 130.91 130.94 130.92
512 1000 138.97 139.03 139.00
1024 1000 155.76 155.82 155.78
2048 1000 189.82 189.89 189.85
4096 1000 239.89 239.96 239.93
8192 1000 349.09 349.15 349.13
16384 1000 1503.11 1503.83 1503.48
32768 1000 3042.96 3043.49 3043.21
65536 640 6435.05 6436.60 6435.85
131072 320 11606.62 11611.33 11609.44
262144 160 23294.90 23327.17 23315.98
524288 80 46543.69 46705.54 46647.87
1048576 40 94786.82 95433.57 95215.21
2097152 20 179402.60 181415.00 180735.89
4194304 10 351612.41 359080.41 356579.62
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.08 0.08 0.08
1 1000 59.89 59.89 59.89
2 1000 60.33 60.35 60.34
4 1000 59.75 59.78 59.76
8 1000 59.81 59.85 59.83
16 1000 59.74 59.79 59.76
32 1000 60.00 60.05 60.03
64 1000 61.68 61.68 61.68
128 1000 62.21 62.27 62.24
256 1000 64.76 64.76 64.76
512 1000 67.52 67.54 67.53
1024 1000 75.17 75.19 75.18
2048 1000 102.85 102.87 102.86
4096 1000 109.63 109.71 109.67
8192 1000 163.16 163.22 163.19
16384 1000 227.18 227.33 227.25
32768 1000 379.27 379.36 379.31
65536 640 793.24 793.35 793.30
131072 320 1388.31 1388.58 1388.45
262144 160 2623.62 2624.03 2623.83
524288 80 5140.31 5141.43 5140.87
1048576 40 10084.08 10089.08 10086.58
2097152 20 19941.35 19967.10 19954.22
4194304 10 39612.99 39720.59 39666.79
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.09 0.10 0.09
1 1000 94.56 94.70 94.63
2 1000 94.54 94.68 94.61
4 1000 94.50 94.52 94.51
8 1000 94.48 94.59 94.54
16 1000 94.56 94.57 94.56
32 1000 94.99 95.04 95.02
64 1000 96.45 96.59 96.52
128 1000 97.70 97.72 97.71
256 1000 102.92 103.08 103.00
512 1000 110.97 110.98 110.97
1024 1000 124.63 124.67 124.66
2048 1000 149.16 149.20 149.18
4096 1000 179.25 179.28 179.26
8192 1000 271.93 271.98 271.95
16384 1000 826.92 827.24 827.06
32768 1000 1662.66 1662.98 1662.83
65536 640 3520.45 3521.63 3520.93
131072 320 7093.75 7099.70 7096.68
262144 160 13944.40 13963.07 13953.72
524288 80 27426.59 27490.51 27458.43
1048576 40 54377.17 54536.28 54456.76
2097152 20 107817.10 108526.15 108173.40
4194304 10 212809.21 216206.69 214506.48
#----------------------------------------------------------------
# Benchmarking Allgatherv
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.09 0.10 0.10
1 1000 128.41 128.54 128.48
2 1000 128.48 128.54 128.50
4 1000 128.57 128.62 128.59
8 1000 128.82 128.86 128.84
16 1000 128.94 128.98 128.97
32 1000 128.88 129.03 128.95
64 1000 130.56 130.63 130.59
128 1000 132.21 132.23 132.22
256 1000 138.39 138.42 138.41
512 1000 145.70 145.74 145.72
1024 1000 161.95 162.02 161.98
2048 1000 191.20 191.27 191.24
4096 1000 245.28 245.39 245.32
8192 1000 349.13 349.27 349.21
16384 1000 1505.01 1505.65 1505.32
32768 1000 3058.60 3059.25 3058.90
65536 640 6391.15 6392.21 6391.59
131072 320 11644.97 11653.13 11648.61
262144 160 22943.85 22973.64 22963.15
524288 80 46508.08 46629.00 46587.58
1048576 40 94836.67 95471.78 95255.68
2097152 20 180951.64 183209.00 182446.79
4194304 10 351933.90 359562.29 356996.70
#----------------------------------------------------------------
# Benchmarking Gather
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.11 0.10
1 1000 57.54 57.54 57.54
2 1000 57.40 57.40 57.40
4 1000 57.72 57.72 57.72
8 1000 58.18 58.18 58.18
16 1000 57.96 57.96 57.96
32 1000 57.95 57.96 57.96
64 1000 59.37 59.37 59.37
128 1000 60.05 60.05 60.05
256 1000 62.29 62.29 62.29
512 1000 67.27 67.27 67.27
1024 1000 75.99 76.00 75.99
2048 1000 86.59 86.63 86.61
4096 1000 108.57 108.61 108.59
8192 1000 146.53 146.60 146.56
16384 1000 229.37 229.50 229.43
32768 1000 378.49 378.79 378.64
65536 640 659.80 660.56 660.18
131072 320 1330.63 1331.84 1331.24
262144 160 2432.88 2435.31 2434.09
524288 80 4867.66 4872.64 4870.15
1048576 40 9816.05 9823.27 9819.66
2097152 20 19544.30 19564.05 19554.17
4194304 10 39049.20 39084.90 39067.05
#----------------------------------------------------------------
# Benchmarking Gather
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.11 0.10
1 1000 55.26 55.39 55.33
2 1000 55.27 55.37 55.32
4 1000 55.20 55.32 55.26
8 1000 55.36 55.47 55.42
16 1000 56.60 56.71 56.66
32 1000 55.60 55.72 55.66
64 1000 56.93 57.06 57.00
128 1000 57.57 57.68 57.63
256 1000 59.87 59.98 59.93
512 1000 62.53 62.63 62.58
1024 1000 72.20 72.34 72.27
2048 1000 90.63 90.83 90.73
4096 1000 99.76 99.95 99.85
8192 1000 282.44 282.71 282.57
16384 1000 346.55 346.88 346.68
32768 1000 538.62 539.13 538.79
65536 640 972.58 974.27 973.31
131072 320 2101.42 2107.63 2104.57
262144 160 3911.30 3936.05 3926.30
524288 80 7571.44 7672.01 7636.23
1048576 40 14918.10 15319.02 15183.63
2097152 20 29322.95 30946.35 30418.02
4194304 10 56828.49 63111.70 61471.73
#----------------------------------------------------------------
# Benchmarking Gather
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.11 0.10
1 1000 63.37 63.61 63.49
2 1000 63.23 63.48 63.36
4 1000 63.35 63.61 63.49
8 1000 63.33 63.56 63.44
16 1000 63.22 63.46 63.33
32 1000 64.11 64.36 64.24
64 1000 64.29 64.53 64.41
128 1000 64.72 64.96 64.84
256 1000 66.96 67.21 67.08
512 1000 71.65 71.91 71.78
1024 1000 80.97 81.29 81.13
2048 1000 100.30 100.71 100.51
4096 1000 107.60 108.02 107.81
8192 1000 421.00 421.45 421.26
16384 1000 526.12 526.63 526.38
32768 1000 757.66 758.59 758.12
65536 640 1360.39 1362.97 1361.67
131072 320 2919.98 2927.38 2923.74
262144 160 5346.37 5386.70 5370.69
524288 80 10138.52 10281.59 10230.93
1048576 40 20029.20 20657.57 20412.65
2097152 20 38665.59 41092.90 40291.83
4194304 10 72197.21 82246.10 78464.50
#----------------------------------------------------------------
# Benchmarking Gatherv
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.15 0.25 0.20
1 1000 57.19 57.20 57.20
2 1000 57.27 57.27 57.27
4 1000 56.56 56.57 56.57
8 1000 57.20 57.20 57.20
16 1000 57.60 57.60 57.60
32 1000 57.96 57.97 57.96
64 1000 59.48 59.48 59.48
128 1000 60.60 60.60 60.60
256 1000 62.54 62.54 62.54
512 1000 66.55 66.55 66.55
1024 1000 75.19 75.20 75.20
2048 1000 86.56 86.60 86.58
4096 1000 108.39 108.43 108.41
8192 1000 140.04 140.13 140.08
16384 1000 218.27 218.42 218.34
32768 1000 370.16 370.43 370.29
65536 640 772.03 772.46 772.24
131072 320 1351.96 1353.43 1352.69
262144 160 2522.24 2525.96 2524.10
524288 80 4872.56 4882.79 4877.68
1048576 40 9554.30 9594.00 9574.15
2097152 20 18879.30 18989.44 18934.37
4194304 10 37271.60 37678.39 37475.00
#----------------------------------------------------------------
# Benchmarking Gatherv
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.18 0.27 0.22
1 1000 55.26 55.38 55.33
2 1000 54.68 54.79 54.74
4 1000 54.82 54.93 54.88
8 1000 55.25 55.36 55.31
16 1000 55.38 55.49 55.44
32 1000 55.43 55.55 55.49
64 1000 56.28 56.40 56.34
128 1000 57.66 57.77 57.72
256 1000 59.22 59.33 59.27
512 1000 62.26 62.38 62.32
1024 1000 71.33 71.50 71.41
2048 1000 90.04 90.23 90.13
4096 1000 99.30 99.50 99.40
8192 1000 153.81 154.17 153.99
16384 1000 226.01 226.56 226.29
32768 1000 377.38 378.26 377.80
65536 640 1202.91 1204.53 1203.79
131072 320 2267.96 2273.32 2270.59
262144 160 4405.09 4425.65 4415.54
524288 80 8690.89 8767.78 8729.17
1048576 40 17400.18 17704.25 17551.56
2097152 20 34672.25 35875.20 35273.80
4194304 10 67204.90 71867.20 70447.98
#----------------------------------------------------------------
# Benchmarking Gatherv
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.21 0.31 0.25
1 1000 62.37 62.61 62.49
2 1000 62.13 62.39 62.26
4 1000 62.12 62.36 62.24
8 1000 62.07 62.29 62.18
16 1000 62.53 62.78 62.66
32 1000 62.90 63.15 63.03
64 1000 64.04 64.28 64.15
128 1000 64.75 64.99 64.87
256 1000 67.30 67.54 67.41
512 1000 71.58 71.83 71.70
1024 1000 81.49 81.82 81.67
2048 1000 100.50 100.92 100.72
4096 1000 107.46 107.88 107.68
8192 1000 161.09 161.74 161.39
16384 1000 243.38 244.39 243.87
32768 1000 413.48 415.16 414.27
65536 640 1570.90 1573.75 1572.57
131072 320 3162.14 3171.14 3167.63
262144 160 6279.64 6316.83 6300.81
524288 80 12524.01 12664.15 12613.47
1048576 40 24873.50 25445.07 25199.72
2097152 20 49345.75 51573.05 50762.44
4194304 10 96326.51 105392.59 101517.83
#----------------------------------------------------------------
# Benchmarking Scatter
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.10 0.10 0.10
1 1000 56.49 56.59 56.54
2 1000 57.16 57.27 57.22
4 1000 56.92 57.02 56.97
8 1000 57.09 57.19 57.14
16 1000 57.29 57.38 57.34
32 1000 57.72 57.82 57.77
64 1000 58.94 59.04 58.99
128 1000 59.77 59.87 59.82
256 1000 62.70 62.80 62.75
512 1000 66.80 66.91 66.85
1024 1000 75.61 75.74 75.68
2048 1000 86.29 86.43 86.36
4096 1000 107.61 107.77 107.69
8192 1000 139.11 139.30 139.21
16384 1000 217.47 217.71 217.59
32768 1000 368.09 368.46 368.27
65536 640 770.61 771.12 770.86
131072 320 1339.78 1341.00 1340.39
262144 160 2479.48 2481.49 2480.48
524288 80 4808.44 4809.19 4808.81
1048576 40 9515.65 9528.02 9521.84
2097152 20 18783.85 18881.55 18832.70
4194304 10 37302.61 37746.41 37524.51
#----------------------------------------------------------------
# Benchmarking Scatter
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.09 0.10 0.10
1 1000 57.45 57.58 57.51
2 1000 57.55 57.69 57.61
4 1000 57.24 57.38 57.31
8 1000 57.39 57.51 57.45
16 1000 57.83 57.96 57.90
32 1000 58.23 58.36 58.29
64 1000 58.90 59.05 58.98
128 1000 59.77 59.89 59.83
256 1000 61.35 61.49 61.41
512 1000 63.91 64.03 63.97
1024 1000 73.23 73.38 73.30
2048 1000 93.93 94.08 94.00
4096 1000 113.98 114.16 114.07
8192 1000 169.96 170.19 170.07
16384 1000 263.36 263.70 263.50
32768 1000 456.97 457.48 457.16
65536 640 1187.11 1188.43 1187.71
131072 320 2204.68 2209.05 2206.79
262144 160 4287.46 4302.36 4294.72
524288 80 8472.64 8535.51 8505.45
1048576 40 16997.13 17264.25 17133.96
2097152 20 33984.80 35089.15 34555.52
4194304 10 66180.20 70769.99 69378.25
#----------------------------------------------------------------
# Benchmarking Scatter
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.09 0.10 0.10
1 1000 69.84 69.95 69.90
2 1000 69.74 69.86 69.80
4 1000 69.67 69.77 69.72
8 1000 69.91 70.03 69.97
16 1000 70.50 70.60 70.55
32 1000 70.92 71.03 70.97
64 1000 72.86 72.94 72.90
128 1000 73.50 73.64 73.58
256 1000 76.76 76.88 76.83
512 1000 80.33 80.46 80.40
1024 1000 89.85 89.98 89.91
2048 1000 107.79 107.96 107.87
4096 1000 134.73 134.92 134.82
8192 1000 199.78 200.10 199.92
16384 1000 318.38 318.84 318.56
32768 1000 571.95 572.77 572.36
65536 640 1610.41 1612.87 1611.81
131072 320 3013.70 3020.04 3017.61
262144 160 6053.87 6085.56 6072.79
524288 80 12026.79 12151.17 12108.57
1048576 40 24154.93 24692.95 24469.15
2097152 20 48019.50 50155.85 49401.52
4194304 10 94025.29 102860.30 99106.62
#----------------------------------------------------------------
# Benchmarking Scatterv
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.20 0.25 0.23
1 1000 57.42 57.52 57.47
2 1000 56.94 57.03 56.98
4 1000 57.30 57.40 57.35
8 1000 56.81 56.91 56.86
16 1000 56.62 56.72 56.67
32 1000 57.10 57.19 57.14
64 1000 58.41 58.51 58.46
128 1000 59.38 59.48 59.43
256 1000 62.32 62.42 62.37
512 1000 65.90 66.02 65.96
1024 1000 76.06 76.20 76.13
2048 1000 86.55 86.69 86.62
4096 1000 108.18 108.35 108.26
8192 1000 139.11 139.29 139.20
16384 1000 217.05 217.29 217.17
32768 1000 368.30 368.68 368.49
65536 640 771.40 771.90 771.65
131072 320 1339.35 1340.63 1339.99
262144 160 2477.92 2478.84 2478.38
524288 80 4798.51 4799.26 4798.89
1048576 40 9509.15 9522.38 9515.76
2097152 20 18769.50 18865.60 18817.55
4194304 10 37423.69 37833.50 37628.60
#----------------------------------------------------------------
# Benchmarking Scatterv
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.17 0.28 0.23
1 1000 57.42 57.56 57.48
2 1000 57.79 57.94 57.86
4 1000 57.84 57.99 57.91
8 1000 58.26 58.39 58.32
16 1000 58.17 58.30 58.23
32 1000 58.11 58.25 58.18
64 1000 59.07 59.21 59.13
128 1000 59.67 59.80 59.73
256 1000 61.27 61.39 61.33
512 1000 63.84 63.98 63.90
1024 1000 73.39 73.54 73.46
2048 1000 94.00 94.15 94.07
4096 1000 114.38 114.57 114.48
8192 1000 169.99 170.20 170.09
16384 1000 263.42 263.76 263.56
32768 1000 456.23 456.78 456.46
65536 640 1187.85 1189.15 1188.45
131072 320 2205.29 2209.73 2207.46
262144 160 4283.23 4297.92 4290.28
524288 80 8476.81 8539.97 8509.88
1048576 40 17007.05 17275.50 17145.82
2097152 20 33980.79 35087.00 34553.44
4194304 10 66023.31 70665.00 69269.36
#----------------------------------------------------------------
# Benchmarking Scatterv
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.19 0.32 0.25
1 1000 69.23 69.32 69.27
2 1000 69.15 69.25 69.20
4 1000 69.05 69.19 69.13
8 1000 69.17 69.31 69.25
16 1000 69.46 69.58 69.52
32 1000 70.17 70.27 70.22
64 1000 71.73 71.88 71.81
128 1000 73.08 73.18 73.13
256 1000 76.26 76.36 76.31
512 1000 79.81 79.93 79.87
1024 1000 89.94 90.08 90.00
2048 1000 107.44 107.59 107.52
4096 1000 135.33 135.52 135.42
8192 1000 199.41 199.71 199.55
16384 1000 317.70 318.18 317.89
32768 1000 573.03 573.86 573.44
65536 640 1610.43 1612.84 1611.80
131072 320 3009.78 3016.31 3013.83
262144 160 6046.84 6078.49 6065.74
524288 80 12033.17 12156.59 12114.16
1048576 40 24168.27 24701.68 24479.27
2097152 20 48025.81 50147.20 49401.83
4194304 10 93884.69 102724.10 98996.48
#----------------------------------------------------------------
# Benchmarking Alltoall
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.07 0.07
1 1000 59.32 59.39 59.36
2 1000 59.49 59.54 59.52
4 1000 59.27 59.29 59.28
8 1000 59.29 59.37 59.33
16 1000 59.50 59.52 59.51
32 1000 59.73 59.74 59.73
64 1000 61.27 61.37 61.32
128 1000 62.30 62.30 62.30
256 1000 64.53 64.54 64.53
512 1000 67.68 67.80 67.74
1024 1000 75.06 75.07 75.06
2048 1000 102.81 102.82 102.82
4096 1000 109.36 109.45 109.40
8192 1000 162.39 162.46 162.42
16384 1000 227.32 227.33 227.32
32768 1000 377.91 378.05 377.98
65536 640 793.08 793.14 793.11
131072 320 1398.12 1398.41 1398.27
262144 160 2658.37 2658.61 2658.49
524288 80 5141.82 5142.03 5141.92
1048576 40 10089.17 10093.88 10091.52
2097152 20 19971.25 19996.80 19984.02
4194304 10 39670.40 39679.10 39674.75
#----------------------------------------------------------------
# Benchmarking Alltoall
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.08 0.07
1 1000 94.23 94.28 94.26
2 1000 94.22 94.26 94.24
4 1000 94.45 94.51 94.48
8 1000 94.79 94.86 94.83
16 1000 94.71 94.86 94.78
32 1000 95.48 95.62 95.55
64 1000 96.60 96.67 96.63
128 1000 97.58 97.74 97.67
256 1000 103.16 103.23 103.21
512 1000 107.58 107.64 107.61
1024 1000 118.62 118.76 118.70
2048 1000 142.73 142.75 142.74
4096 1000 272.47 272.60 272.53
8192 1000 404.48 404.67 404.58
16384 1000 666.44 666.60 666.48
32768 1000 1170.38 1170.59 1170.45
65536 640 2825.44 2826.11 2825.79
131072 320 5494.79 5497.81 5496.06
262144 160 10210.74 10216.48 10214.20
524288 80 19883.49 19900.60 19894.66
1048576 40 39379.35 39441.43 39417.67
2097152 20 79954.90 80005.10 79972.04
4194304 10 160551.80 160765.20 160692.93
#----------------------------------------------------------------
# Benchmarking Alltoall
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.07 0.07
1 1000 136.74 136.88 136.81
2 1000 137.65 137.78 137.72
4 1000 137.25 137.32 137.31
8 1000 138.03 138.19 138.13
16 1000 137.89 137.98 137.95
32 1000 138.96 139.10 139.04
64 1000 140.53 140.64 140.60
128 1000 141.19 141.35 141.27
256 1000 148.08 148.25 148.19
512 1000 153.03 153.24 153.15
1024 1000 163.88 164.00 163.96
2048 1000 208.34 208.40 208.38
4096 1000 450.52 450.63 450.58
8192 1000 663.93 664.15 664.04
16384 1000 1067.31 1067.51 1067.44
32768 1000 1922.36 1922.84 1922.62
65536 640 4610.50 4611.17 4610.82
131072 320 9046.64 9048.11 9047.16
262144 160 17739.79 17747.91 17744.40
524288 80 34564.79 34601.87 34585.90
1048576 40 69244.05 69460.60 69404.26
2097152 20 137667.55 138299.10 138096.77
4194304 10 273951.20 276605.30 275820.55
#----------------------------------------------------------------
# Benchmarking Alltoallv
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.14 0.14 0.14
1 1000 58.83 58.90 58.87
2 1000 58.90 58.99 58.95
4 1000 59.27 59.31 59.29
8 1000 59.47 59.50 59.49
16 1000 59.14 59.19 59.17
32 1000 59.40 59.44 59.42
64 1000 60.37 60.46 60.42
128 1000 61.61 61.63 61.62
256 1000 64.97 65.07 65.02
512 1000 66.85 66.95 66.90
1024 1000 74.75 74.88 74.81
2048 1000 102.71 102.72 102.72
4096 1000 108.79 108.82 108.80
8192 1000 162.75 162.78 162.76
16384 1000 227.30 227.33 227.31
32768 1000 378.25 378.26 378.26
65536 640 791.85 792.00 791.93
131072 320 1397.13 1397.38 1397.26
262144 160 2656.60 2656.97 2656.79
524288 80 5141.39 5141.96 5141.68
1048576 40 10110.67 10112.08 10111.38
2097152 20 19998.30 20001.00 19999.65
4194304 10 39687.51 39692.90 39690.20
#----------------------------------------------------------------
# Benchmarking Alltoallv
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.15 0.16 0.15
1 1000 92.21 92.33 92.26
2 1000 92.40 92.54 92.46
4 1000 92.60 92.64 92.62
8 1000 92.79 92.85 92.82
16 1000 92.77 92.92 92.84
32 1000 93.19 93.25 93.23
64 1000 94.37 94.43 94.41
128 1000 96.10 96.25 96.16
256 1000 100.98 101.14 101.06
512 1000 109.19 109.25 109.23
1024 1000 120.64 120.70 120.67
2048 1000 160.93 160.99 160.95
4096 1000 174.82 174.95 174.89
8192 1000 280.28 280.40 280.34
16384 1000 553.39 553.59 553.48
32768 1000 1101.79 1102.19 1101.98
65536 640 2236.50 2237.39 2236.92
131072 320 4509.34 4510.64 4509.99
262144 160 8935.16 8948.81 8943.68
524288 80 17945.09 17982.32 17968.44
1048576 40 36513.08 36584.43 36549.77
2097152 20 73159.55 73367.55 73285.91
4194304 10 147135.60 147336.50 147236.55
#----------------------------------------------------------------
# Benchmarking Alltoallv
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.17 0.19 0.18
1 1000 135.05 135.18 135.11
2 1000 134.44 134.55 134.51
4 1000 134.58 134.72 134.67
8 1000 135.06 135.19 135.13
16 1000 135.72 135.86 135.80
32 1000 136.55 136.64 136.60
64 1000 136.71 136.84 136.78
128 1000 139.12 139.20 139.17
256 1000 145.85 145.97 145.93
512 1000 151.87 152.00 151.95
1024 1000 157.89 158.11 158.03
2048 1000 203.47 203.58 203.51
4096 1000 258.22 258.38 258.30
8192 1000 422.37 422.58 422.45
16384 1000 832.89 833.24 833.10
32768 1000 1679.72 1680.17 1679.92
65536 640 3557.24 3559.11 3558.30
131072 320 7103.16 7111.29 7108.27
262144 160 13975.49 13992.91 13989.44
524288 80 27949.23 28110.82 28072.41
1048576 40 56100.12 56777.20 56639.71
2097152 20 112481.00 114944.40 114417.47
4194304 10 220160.11 231195.99 228782.03
#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.08 0.08 0.08
1 1000 57.48 57.57 57.52
2 1000 57.40 57.50 57.45
4 1000 57.03 57.11 57.07
8 1000 57.19 57.29 57.24
16 1000 56.55 56.64 56.60
32 1000 57.06 57.15 57.11
64 1000 58.79 58.89 58.84
128 1000 59.54 59.64 59.59
256 1000 61.67 61.78 61.72
512 1000 66.69 66.81 66.75
1024 1000 75.72 75.86 75.79
2048 1000 90.78 90.91 90.84
4096 1000 106.06 106.22 106.14
8192 1000 149.34 149.52 149.43
16384 1000 238.29 238.55 238.42
32768 1000 408.88 409.18 409.03
65536 640 746.58 747.17 746.88
131072 320 1522.37 1523.54 1522.95
262144 160 2838.91 2841.93 2840.42
524288 80 5000.33 5005.45 5002.89
1048576 40 9917.45 9930.02 9923.74
2097152 20 19734.10 19751.05 19742.57
4194304 10 39405.61 39448.21 39426.91
#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.09 0.08
1 1000 60.28 60.39 60.33
2 1000 60.31 60.44 60.38
4 1000 60.28 60.38 60.33
8 1000 60.39 60.49 60.44
16 1000 60.32 60.45 60.40
32 1000 60.37 60.50 60.43
64 1000 61.22 61.32 61.27
128 1000 61.73 61.84 61.79
256 1000 63.37 63.50 63.44
512 1000 69.21 69.37 69.29
1024 1000 79.70 79.87 79.79
2048 1000 103.78 103.97 103.90
4096 1000 134.05 134.21 134.16
8192 1000 191.61 191.93 191.82
16384 1000 320.38 320.65 320.54
32768 1000 581.32 581.78 581.64
65536 640 1094.62 1095.47 1095.16
131072 320 2206.34 2207.97 2207.23
262144 160 4263.81 4268.89 4266.17
524288 80 6605.11 6613.68 6609.60
1048576 40 12997.30 13033.73 13015.46
2097152 20 30518.20 30677.05 30611.83
4194304 10 60507.41 60809.41 60676.08
#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 6
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.07 0.09 0.08
1 1000 63.56 63.71 63.63
2 1000 63.52 63.67 63.59
4 1000 63.78 63.93 63.85
8 1000 63.32 63.47 63.39
16 1000 63.99 64.13 64.05
32 1000 64.06 64.18 64.12
64 1000 65.85 65.97 65.91
128 1000 67.11 67.25 67.18
256 1000 71.25 71.40 71.33
512 1000 78.80 78.93 78.87
1024 1000 88.33 88.48 88.41
2048 1000 120.72 120.92 120.87
4096 1000 155.73 155.92 155.87
8192 1000 216.06 216.30 216.24
16384 1000 373.64 373.96 373.87
32768 1000 681.76 682.26 682.11
65536 640 1302.10 1303.43 1303.00
131072 320 2651.97 2654.06 2653.44
262144 160 5508.27 5518.73 5512.38
524288 80 6521.70 6545.33 6529.50
1048576 40 12837.60 12943.95 12896.20
2097152 20 25314.55 25733.60 25542.16
4194304 10 73688.99 74682.89 74323.27
#---------------------------------------------------
# Benchmarking Barrier
# #processes = 2
# ( 4 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#repetitions t_min[usec] t_max[usec] t_avg[usec]
1000 65.91 65.91 65.91
#---------------------------------------------------
# Benchmarking Barrier
# #processes = 4
# ( 2 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
#repetitions t_min[usec] t_max[usec] t_avg[usec]
1000 85.44 85.50 85.47
#---------------------------------------------------
# Benchmarking Barrier
# #processes = 6
#---------------------------------------------------
#repetitions t_min[usec] t_max[usec] t_avg[usec]
1000 106.92 106.95 106.94
# All processes entering MPI_Finalize