Gilles,

    I'm using NetPIPE which is available at http://netpipe.cs.ksu.edu
My base test is uni-directional with 1 process on a node communicating with a process on a second node.

make mpi
mpirun -np 2 --hostfile=hf.2p2n NPmpi

cat hf.2p2n
node0 slots=1
node1 slots=1

NetPIPE does not do any MPI_Init_thread(). Tests on the configs below give good performance with and without the --enable-mpi-thread-multiple so I don't think that's the issue.

configure --prefix=/homes/daveturner/libs/openmpi-2.1.1 --enable-mpi-fortran=all --with-verbs --enable-ipv6 --enable-mpi-cxx

configure --prefix=/homes/daveturner/libs/openmpi-2.1.1 --enable-mpi-fortran=all --with-verbs --enable-ipv6 --enable-mpi-cxx --enable-mpi-thread-multiple

    Dave

On Tue, Jan 23, 2018 at 10:03 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:

> Dave,
>
> At first glance, that looks pretty odd, and I'll have a look at it.
>
> Which benchmark are you using to measure the bandwidth ?
> Does your benchmark MPI_Init_thread(MPI_THREAD_MULTIPLE) ?
> Have you tried without --enable-mpi-thread-multiple ?
>
> Cheers,
>
> Gilles
>
> On Wed, Jan 24, 2018 at 12:55 PM, Dave Turner <drdavetur...@gmail.com> wrote:
> >
> > We compiled OpenMPI 2.1.1 using the EasyBuild configuration
> > for CentOS as below and tested on Mellanox QDR hardware.
> >
> > ./configure --prefix=/homes/daveturner/libs/openmpi-2.1.1c
> >             --enable-shared
> >             --enable-mpi-thread-multiple
> >             --with-verbs
> >             --enable-mpirun-prefix-by-default
> >             --with-mpi-cxx
> >             --enable-mpi-cxx
> >             --with-hwloc=$EBROOTHWLOC
> >             --disable-dlopen
> >
> > The red curve in the attached NetPIPE graph shows the poor performance above
> > 8 kB for the uni-directional tests with bi-directional and aggregate
> > tests also showing similar problems. When I compile using the same
> > configuration but with the --disable-dlopen parameter removed then the
> > performance is very good as the green curve in the graph shows.
> >
> > We see the same problems with OpenMPI 2.0.2.
> > Replacing --disable-dlopen with --disable-mca-dso showed good performance.
> > Replacing --disable-dlopen with --enable-static showed good performance.
> > So it's only --disable-dlopen that leads to poor performance.
> >
> > http://netpipe.cs.ksu.edu
> >
> > Dave Turner
> >
> > --
> > Work: davetur...@ksu.edu (785) 532-7791
> > 2219 Engineering Hall, Manhattan KS 66506
> > Home: drdavetur...@gmail.com
> > cell: (785) 770-5929
> >
> > _______________________________________________
> > devel mailing list
> > devel@lists.open-mpi.org
> > https://lists.open-mpi.org/mailman/listinfo/devel
>

--
Work: davetur...@ksu.edu (785) 532-7791
2219 Engineering Hall, Manhattan KS 66506
Home: drdavetur...@gmail.com
cell: (785) 770-5929
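
For anyone trying to reproduce the comparison, the four builds discussed in this thread differ by a single configure flag, so they can be laid out side by side. The following is only a dry-run sketch: it prints the configure and mpirun lines and executes nothing. The variant labels, the helper names variant_flag and configure_line, and the per-variant install prefixes under $HOME/libs are assumptions made for illustration; the flags themselves are the ones quoted in the messages above.

```shell
#!/bin/sh
# Dry-run sketch: print one configure line per build variant from this thread.
# Nothing is configured, built, or run here.

# Flags shared by all four builds (from the EasyBuild configuration quoted
# in the thread). Single quotes keep $EBROOTHWLOC literal in the output.
BASE_FLAGS='--enable-shared --enable-mpi-thread-multiple --with-verbs --enable-mpirun-prefix-by-default --with-mpi-cxx --enable-mpi-cxx --with-hwloc=$EBROOTHWLOC'

# Map a variant label (hypothetical names) to the one flag that differs.
variant_flag() {
    case "$1" in
        disable-dlopen)  echo "--disable-dlopen" ;;   # reported slow above 8 kB
        default)         echo "" ;;                   # flag removed: reported good
        disable-mca-dso) echo "--disable-mca-dso" ;;  # reported good
        enable-static)   echo "--enable-static" ;;    # reported good
        *) echo "unknown variant: $1" >&2; return 1 ;;
    esac
}

# Print the full configure command for one variant.
configure_line() {
    flag=$(variant_flag "$1") || return 1
    # Hypothetical prefix, one per variant, so installs do not overwrite each other.
    echo "./configure --prefix=\$HOME/libs/openmpi-2.1.1-$1 $BASE_FLAGS${flag:+ $flag}"
}

for v in disable-dlopen default disable-mca-dso enable-static; do
    configure_line "$v"
    # Same NetPIPE invocation as in the thread, to be run once per build.
    echo "mpirun -np 2 --hostfile=hf.2p2n NPmpi"
done
```

Keeping a separate prefix per variant means each NPmpi binary can be rebuilt with "make mpi" against exactly one Open MPI install, which makes it harder to mix up which curve came from which build.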