Ah, this sounds familiar. I believe the issue Dave is seeing is that without patcher/overwrite, the "leave pinned" protocol is OFF by default.
Use of '-mca mpi_leave_pinned 1' may help if my guess is right. HOWEVER, without the memory management hooks provided by patcher/overwrite, leave pinned can give incorrect results.

-Paul

On Tue, Jan 23, 2018 at 9:17 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
> Dave,
>
> here is what I found:
>
> - MPI_THREAD_MULTIPLE is not part of the equation (I just found it is
>   no longer required by IMB by default)
> - patcher/overwrite is not built when Open MPI is configure'd with
>   --disable-dlopen
> - when configure'd without --disable-dlopen, performance is much worse
>   for the IMB (PingPong) benchmark when run with
>   mpirun --mca patcher ^overwrite
> - OSU (osu_bw) performance is not impacted by the patcher/overwrite
>   component being blacklisted
>
> I am afraid that's all I can do ...
>
> Nathan,
>
> could you please shed some light?
>
> Cheers,
>
> Gilles
>
> On Wed, Jan 24, 2018 at 1:29 PM, Gilles Gouaillardet
> <gilles.gouaillar...@gmail.com> wrote:
> > Dave,
> >
> > I can reproduce the issue with btl/openib and the IMB benchmark, which
> > is known to call MPI_Init_thread(MPI_THREAD_MULTIPLE).
> >
> > Note performance is OK with the OSU benchmark, which does not require
> > MPI_THREAD_MULTIPLE.
> >
> > Cheers,
> >
> > Gilles
> >
> > On Wed, Jan 24, 2018 at 1:16 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> >> Dave,
> >>
> >> One more question: are you running the openib btl, or other libraries
> >> such as MXM or UCX?
> >>
> >> Cheers,
> >>
> >> Gilles
> >>
> >> On 1/24/2018 12:55 PM, Dave Turner wrote:
> >>>
> >>> We compiled Open MPI 2.1.1 using the EasyBuild configuration
> >>> for CentOS as below and tested on Mellanox QDR hardware.
> >>>
> >>> ./configure --prefix=/homes/daveturner/libs/openmpi-2.1.1c \
> >>>     --enable-shared \
> >>>     --enable-mpi-thread-multiple \
> >>>     --with-verbs \
> >>>     --enable-mpirun-prefix-by-default \
> >>>     --with-mpi-cxx \
> >>>     --enable-mpi-cxx \
> >>>     --with-hwloc=$EBROOTHWLOC \
> >>>     --disable-dlopen
> >>>
> >>> The red curve in the attached NetPIPE graph shows the poor performance
> >>> above 8 kB for the uni-directional tests, with the bi-directional and
> >>> aggregate tests showing similar problems. When I compile using the same
> >>> configuration but with the --disable-dlopen parameter removed, the
> >>> performance is very good, as the green curve in the graph shows.
> >>>
> >>> We see the same problems with Open MPI 2.0.2.
> >>> Replacing --disable-dlopen with --disable-mca-dso showed good performance.
> >>> Replacing --disable-dlopen with --enable-static showed good performance.
> >>> So it's only --disable-dlopen that leads to poor performance.
> >>>
> >>> http://netpipe.cs.ksu.edu
> >>>
> >>> Dave Turner
> >>>
> >>> --
> >>> Work: davetur...@ksu.edu  (785) 532-7791
> >>>       2219 Engineering Hall, Manhattan KS 66506
> >>> Home: drdavetur...@gmail.com
> >>> cell: (785) 770-5929
> >>>
> >>> _______________________________________________
> >>> devel mailing list
> >>> devel@lists.open-mpi.org
> >>> https://lists.open-mpi.org/mailman/listinfo/devel

--
Paul H. Hargrove <phhargr...@lbl.gov>
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department
Lawrence Berkeley National Laboratory
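For anyone trying to reproduce or work around what is described above, the builds and runs discussed in the thread can be sketched roughly as follows. The install prefix and benchmark binary paths are placeholders, not taken from the thread; the configure and MCA flags are the ones quoted above.

```shell
# Build as Dave did, with the flag under test; dropping --disable-dlopen
# (or replacing it with --disable-mca-dso or --enable-static) is what the
# thread reports as giving good performance.
./configure --prefix=$HOME/libs/openmpi-2.1.1 \
    --enable-shared \
    --enable-mpi-thread-multiple \
    --with-verbs \
    --enable-mpirun-prefix-by-default \
    --disable-dlopen
make -j install

# Gilles's comparison: with dlopen enabled, blacklisting patcher/overwrite
# reproduces the IMB PingPong slowdown.
mpirun -np 2 ./IMB-MPI1 PingPong
mpirun --mca patcher ^overwrite -np 2 ./IMB-MPI1 PingPong

# Paul's suggested experiment: force the leave-pinned protocol on.
# CAUTION: per the thread, without the memory management hooks from
# patcher/overwrite this can give incorrect results.
mpirun --mca mpi_leave_pinned 1 -np 2 ./IMB-MPI1 PingPong
```

These runs require an InfiniBand cluster with the IMB benchmark built, so they are shown here only as a sketch of the commands being compared.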