On Tuesday 16 February 2010, Jeff Squyres wrote:
> We've only got 2 "critical" 1.5.0 bugs left, and I think that those will
> both be closed out pretty soon.
>
> https://svn.open-mpi.org/trac/ompi/report/15
>
> Rainer and I both feel that a RC for 1.5.0 could be pretty soon.
>
> Does anyone have any heartburn with this? Does anyone have any things they
> still need to get in v1.5.0?
I noticed that 1.5a1r22627 still has a very suboptimal default selection of (at least) alltoall algorithms. This has been mentioned several times since the first major discussion [1], but nothing seems to have improved.

A short recap of the situation: by default ompi switches from bruck to basic-linear at ~100 bytes message size, and this is bad<tm>. The first set of figures below is with vanilla ompi, the second set is with a dynamic rules file [2] that forces bruck for all message sizes; a rough sketch of what the benchmark times follows the figures. For details on the system, see [3]. The problem is equally visible over tcp and openib. A concrete result is that OpenMPI on IB is far slower than other MPIs on 1G eth for the affected message sizes (100-3000 bytes).

[cap@n115 mpi]$ mpirun --host $(hostlist --expand -s',' $SLURM_JOB_NODELIST) --bind-to-core ./alltoall.ompi15a1r22627 profile.ompibadness
running in profile-from-file mode
bw for 400 x 1 B : 2.0 Mbytes/s  time was: 24.9 ms
bw for 400 x 25 B : 52.8 Mbytes/s  time was: 23.9 ms
bw for 400 x 50 B : 82.2 Mbytes/s  time was: 30.7 ms
bw for 400 x 75 B : 90.4 Mbytes/s  time was: 41.8 ms
bw for 400 x 100 B : 109.2 Mbytes/s  time was: 46.1 ms
bw for 400 x 200 B : 4.8 Mbytes/s  time was: 2.1 s
bw for 400 x 300 B : 7.0 Mbytes/s  time was: 2.2 s
bw for 400 x 400 B : 9.8 Mbytes/s  time was: 2.1 s
bw for 400 x 500 B : 12.3 Mbytes/s  time was: 2.0 s
bw for 400 x 750 B : 18.5 Mbytes/s  time was: 2.0 s
bw for 400 x 1000 B : 24.6 Mbytes/s  time was: 2.0 s
bw for 400 x 1250 B : 29.9 Mbytes/s  time was: 2.1 s
bw for 400 x 1500 B : 35.1 Mbytes/s  time was: 2.2 s
bw for 400 x 2000 B : 45.5 Mbytes/s  time was: 2.2 s
bw for 400 x 2500 B : 51.0 Mbytes/s  time was: 2.5 s
bw for 400 x 3000 B : 113.6 Mbytes/s  time was: 1.3 s
bw for 400 x 3500 B : 123.3 Mbytes/s  time was: 1.4 s
bw for 400 x 4000 B : 135.7 Mbytes/s  time was: 1.5 s
totaltime was: 25.8 s

[cap@n115 mpi]$ mpirun --host $(hostlist --expand -s',' $SLURM_JOB_NODELIST) --bind-to-core -mca coll_tuned_use_dynamic_rules 1 -mca coll_tuned_dynamic_rules_filename ./dyn_rules ./alltoall.ompi15a1r22627 profile.ompibadness
running in profile-from-file mode
bw for 400 x 1 B : 2.1 Mbytes/s  time was: 24.3 ms
bw for 400 x 25 B : 55.1 Mbytes/s  time was: 22.9 ms
bw for 400 x 50 B : 82.6 Mbytes/s  time was: 30.5 ms
bw for 400 x 75 B : 89.4 Mbytes/s  time was: 42.3 ms
bw for 400 x 100 B : 109.9 Mbytes/s  time was: 45.9 ms
bw for 400 x 200 B : 115.1 Mbytes/s  time was: 87.6 ms
bw for 400 x 300 B : 117.8 Mbytes/s  time was: 128.3 ms
bw for 400 x 400 B : 105.4 Mbytes/s  time was: 191.2 ms
bw for 400 x 500 B : 113.4 Mbytes/s  time was: 222.1 ms
bw for 400 x 750 B : 119.3 Mbytes/s  time was: 316.9 ms
bw for 400 x 1000 B : 120.9 Mbytes/s  time was: 416.9 ms
bw for 400 x 1250 B : 121.0 Mbytes/s  time was: 520.6 ms
bw for 400 x 1500 B : 120.3 Mbytes/s  time was: 628.2 ms
bw for 400 x 2000 B : 118.0 Mbytes/s  time was: 854.1 ms
bw for 400 x 2500 B : 96.5 Mbytes/s  time was: 1.3 s
bw for 400 x 3000 B : 107.4 Mbytes/s  time was: 1.4 s
bw for 400 x 3500 B : 109.1 Mbytes/s  time was: 1.6 s
bw for 400 x 4000 B : 109.2 Mbytes/s  time was: 1.8 s
totaltime was: 9.7 s
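For context, the benchmark is essentially just timing batches of MPI_Alltoall calls at increasing message sizes. The code below is a minimal sketch of that pattern, not the actual alltoall.c used for the figures above (buffer handling and the bandwidth bookkeeping are simplified):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Sketch: time a fixed number of MPI_Alltoall calls per message size. */
int main(int argc, char **argv)
{
    const int iters = 400;
    const int sizes[] = { 1, 25, 50, 75, 100, 200, 300, 400, 500, 750,
                          1000, 1250, 1500, 2000, 2500, 3000, 3500, 4000 };
    int nranks, rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (size_t s = 0; s < sizeof(sizes) / sizeof(sizes[0]); s++) {
        int msgsize = sizes[s];
        char *sendbuf = malloc((size_t)msgsize * nranks);
        char *recvbuf = malloc((size_t)msgsize * nranks);
        memset(sendbuf, 0, (size_t)msgsize * nranks);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < iters; i++)
            MPI_Alltoall(sendbuf, msgsize, MPI_BYTE,
                         recvbuf, msgsize, MPI_BYTE, MPI_COMM_WORLD);
        double t1 = MPI_Wtime();

        /* The real benchmark also derives a Mbytes/s figure from the
         * byte counts; here only the elapsed time is reported. */
        if (rank == 0)
            printf("%4d x %5d B : %.3f s\n", iters, msgsize, t1 - t0);

        free(sendbuf);
        free(recvbuf);
    }
    MPI_Finalize();
    return 0;
}

Something like this, built with mpicc and launched with the mpirun lines above, should be enough to reproduce the effect.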
[1] [OMPI users] scaling problem with openmpi
    From: Roman Martonak <r.marto...@gmail.com>
    To: us...@open-mpi.org
    Date: 2009-05-16 00.20

[2] The dynamic rules file (dyn_rules):
1 # num of collectives
3 # ID = 3 Alltoall collective (ID in coll_tuned.h)
1 # number of com sizes
32 # comm size 8
1 # number of msg sizes
0 3 0 0 # for message size 0, bruck 1, topo 0, 0 segmentation
# end of first collective

[3] OpenMPI: Built with intel-11.1.074; the only configure options used were:
      --enable-orterun-prefix-by-default --prefix
    OS: CentOS-5.4 x86_64
    HW: Dual E5520 nodes with IB (ConnectX)
    Size of job: 8 nodes (that is 64 cores/ranks)

/Peter