The poor performance when compiling with --disable-dlopen also
occurs in OpenMPI 3.0.0, in addition to the 2.1.1 and 2.0.2 versions I
reported on earlier.  My understanding is that the EasyBuild group is
looking into simply removing --disable-dlopen from their build, which is
what we've done on our system.
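
For reference, here is a sketch of the adjusted build: the same
EasyBuild-style options as in the report below, with only --disable-dlopen
dropped (the install prefix and the $EBROOTHWLOC path are site-specific
placeholders):

./configure --prefix=/path/to/openmpi-install \
            --enable-shared \
            --enable-mpi-thread-multiple \
            --with-verbs \
            --enable-mpirun-prefix-by-default \
            --with-mpi-cxx \
            --enable-mpi-cxx \
            --with-hwloc=$EBROOTHWLOC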

                       Dave Turner

On Wed, Jan 24, 2018 at 1:00 PM, <devel-requ...@lists.open-mpi.org> wrote:

> Send devel mailing list submissions to
>         devel@lists.open-mpi.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         https://lists.open-mpi.org/mailman/listinfo/devel
> or, via email, send a message with subject or body 'help' to
>         devel-requ...@lists.open-mpi.org
>
> You can reach the person managing the list at
>         devel-ow...@lists.open-mpi.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of devel digest..."
>
>
> Today's Topics:
>
>    1. Open MPI 3.0.1rc2 available for testing (Barrett, Brian)
>    2. Open MPI 3.1.0 pre-release available (Barrett, Brian)
>    3. Poor performance when compiling with --disable-dlopen
>       (Dave Turner)
>    4. Re: Poor performance when compiling with --disable-dlopen
>       (Gilles Gouaillardet)
>    5. Re: Poor performance when compiling with --disable-dlopen
>       (Gilles Gouaillardet)
>    6. Re: Poor performance when compiling with --disable-dlopen
>       (Gilles Gouaillardet)
>    7. Re: Poor performance when compiling with --disable-dlopen
>       (Gilles Gouaillardet)
>    8. Re: Poor performance when compiling with --disable-dlopen
>       (Paul Hargrove)
>    9. Re: Poor performance when compiling with --disable-dlopen
>       (Gilles Gouaillardet)
>   10. Re: Poor performance when compiling with --disable-dlopen
>       (Dave Turner)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 24 Jan 2018 01:04:10 +0000
> From: "Barrett, Brian" <bbarr...@amazon.com>
> To: Open MPI Developers <devel@lists.open-mpi.org>
> Subject: [OMPI devel] Open MPI 3.0.1rc2 available for testing
> Message-ID: <216c85e3-7716-4aac-a88f-9cd709a8d...@amazon.com>
> Content-Type: text/plain; charset="utf-8"
>
> I've posted the first public release candidate of Open MPI 3.0.1 this
> evening.  It can be downloaded for testing from:
>
>   https://www.open-mpi.org/software/ompi/v3.0/
>
> We appreciate any testing you can do in preparation for a release in the
> next week or two.
>
>
> Thanks,
>
> Brian & Howard
>
> ------------------------------
>
> Message: 2
> Date: Wed, 24 Jan 2018 01:24:06 +0000
> From: "Barrett, Brian" <bbarr...@amazon.com>
> To: Open MPI Developers <devel@lists.open-mpi.org>
> Subject: [OMPI devel] Open MPI 3.1.0 pre-release available
> Message-ID: <52df0d41-ce3f-4aad-894c-daab76c0f...@amazon.com>
> Content-Type: text/plain; charset="us-ascii"
>
> The Open MPI team is pleased to announce the first pre-release of the Open
> MPI 3.1 series, available at:
>
>   https://www.open-mpi.org/software/ompi/v3.1/
>
> RC1 has two known issues:
>
>   - We did not complete work to support hwloc 2.x, even when hwloc is
> built as an external library.  This may or may not be complete before 3.1.0
> is shipped.
>   - 3.1.0 is shipping with a pre-release version of PMIx 2.1.  We will
> finish the update to PMIx 2.1 before 3.1.0 is released.
>
> We look forward to any other issues you may find in testing.
>
> Thanks,
>
> Brian
>
> ------------------------------
>
> Message: 3
> Date: Tue, 23 Jan 2018 21:55:38 -0600
> From: Dave Turner <drdavetur...@gmail.com>
> To: Open MPI Developers <devel@lists.open-mpi.org>, beo...@cs.ksu.edu
> Subject: [OMPI devel] Poor performance when compiling with
>         --disable-dlopen
> Message-ID:
>         <CAFGXdkyNOWkEDxVnrjg4k24Og5MRnN_9KDLSQ73M3TXH_WtUfQ@mail.
> gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
>    We compiled OpenMPI 2.1.1 using the EasyBuild configuration
> for CentOS as below and tested on Mellanox QDR hardware.
>
> ./configure --prefix=/homes/daveturner/libs/openmpi-2.1.1c \
>                  --enable-shared \
>                  --enable-mpi-thread-multiple \
>                  --with-verbs \
>                  --enable-mpirun-prefix-by-default \
>                  --with-mpi-cxx \
>                  --enable-mpi-cxx \
>                  --with-hwloc=$EBROOTHWLOC \
>                  --disable-dlopen
>
> The red curve in the attached NetPIPE graph shows the poor performance
> above 8 kB for the uni-directional tests; the bi-directional and aggregate
> tests show similar problems.  When I compile with the same configuration
> but with the --disable-dlopen parameter removed, the performance is very
> good, as the green curve in the graph shows.
>
> We see the same problems with OpenMPI 2.0.2.
> Replacing --disable-dlopen with --disable-mca-dso showed good performance.
> Replacing --disable-dlopen with --enable-static showed good performance.
> So it's only --disable-dlopen that leads to poor performance.
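>
> (To confirm which flags a given installation was actually built with,
> ompi_info records the configure command line; the exact label in its
> output may vary by version.)
>
>    ompi_info | grep -i configure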
>
> http://netpipe.cs.ksu.edu
>
>                    Dave Turner
>
> --
> Work:     davetur...@ksu.edu     (785) 532-7791
>              2219 Engineering Hall, Manhattan KS  66506
> Home:    drdavetur...@gmail.com
>               cell: (785) 770-5929
> [Attachment: MPI_on_QDR_dlopen_paramter.pdf (application/pdf, 16813 bytes),
> the NetPIPE graph referenced above.]
>
> ------------------------------
>
> Message: 4
> Date: Wed, 24 Jan 2018 13:03:13 +0900
> From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> To: Dave Turner <drdavetur...@gmail.com>, Open MPI Developers
>         <devel@lists.open-mpi.org>
> Subject: Re: [OMPI devel] Poor performance when compiling with
>         --disable-dlopen
> Message-ID:
>         <CAAkFZ5uXZGsbeg0vVBJ4HiLKmDMVfvUV1tu6HOT-gUaEi5OW8Q@mail.
> gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Dave,
>
> At first glance, that looks pretty odd, and I'll have a look at it.
>
> Which benchmark are you using to measure the bandwidth?
> Does your benchmark call MPI_Init_thread(MPI_THREAD_MULTIPLE)?
> Have you tried without --enable-mpi-thread-multiple?
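>
> (As a quick check of the build itself, ompi_info from the installation in
> question reports whether MPI_THREAD_MULTIPLE support was compiled in; the
> exact wording of that line varies by version.)
>
>    ompi_info | grep -i thread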
>
> Cheers,
>
> Gilles
>
> ------------------------------
>
> Message: 5
> Date: Wed, 24 Jan 2018 13:16:25 +0900
> From: Gilles Gouaillardet <gil...@rist.or.jp>
> To: devel@lists.open-mpi.org
> Subject: Re: [OMPI devel] Poor performance when compiling with
>         --disable-dlopen
> Message-ID: <fa070192-8d4f-531d-05a1-a51dc54f3...@rist.or.jp>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> Dave,
>
>
> One more question: are you running the openib/btl, or other libraries
> such as MXM or UCX?
>
>
> Cheers,
>
>
> Gilles
>
>
> ------------------------------
>
> Message: 6
> Date: Wed, 24 Jan 2018 13:29:09 +0900
> From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> To: Open MPI Developers <devel@lists.open-mpi.org>
> Subject: Re: [OMPI devel] Poor performance when compiling with
>         --disable-dlopen
> Message-ID:
>         <CAAkFZ5vJQ0FvkVtcVmayNsDQpq2oDXpcY2kAMGNUw_pZw3WTCQ@mail.
> gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Dave,
>
> I can reproduce the issue with btl/openib and the IMB benchmark, which
> is known to call MPI_Init_thread(MPI_THREAD_MULTIPLE).
>
> Note that performance is OK with the OSU benchmark, which does not require
> MPI_THREAD_MULTIPLE.
>
> Cheers,
>
> Gilles
>
>
> ------------------------------
>
> Message: 7
> Date: Wed, 24 Jan 2018 14:17:56 +0900
> From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> To: Open MPI Developers <devel@lists.open-mpi.org>
> Subject: Re: [OMPI devel] Poor performance when compiling with
>         --disable-dlopen
> Message-ID:
>         <CAAkFZ5uhz9gaNfqVMieDr4nkkCNJF+cbRS36i2fGC+o25wRJiw@mail.
> gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Dave,
>
> here is what I found
>
>  - MPI_THREAD_MULTIPLE is not part of the equation (I just found it is
> no longer required by IMB by default)
>  - patcher/overwrite is not built when Open MPI is configure'd with
> --disable-dlopen
>  - when configure'd without --disable-dlopen, performance is much worse
> for the IMB (PingPong) benchmark when run with
>    mpirun --mca patcher ^overwrite   (see the sketch below)
>  - OSU (osu_bw) performance is not impacted by the patcher/overwrite
> component being blacklisted
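>
> Roughly, the comparison for anyone who wants to reproduce it (the
> benchmark binary names IMB-MPI1 and osu_bw are the typical installed
> names and may differ on your system):
>
>    mpirun -np 2 IMB-MPI1 PingPong                            # good
>    mpirun -np 2 --mca patcher ^overwrite IMB-MPI1 PingPong   # much worse
>    mpirun -np 2 osu_bw                                       # unaffected
>    mpirun -np 2 --mca patcher ^overwrite osu_bw              # unaffected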
>
> I am afraid that's all I can do ...
>
>
> Nathan,
>
> could you please shed some light?
>
>
> Cheers,
>
> Gilles
>
> On Wed, Jan 24, 2018 at 1:29 PM, Gilles Gouaillardet
> <gilles.gouaillar...@gmail.com> wrote:
> > Dave,
> >
> > i can reproduce the issue with btl/openib and the IMB benchmark, that
> > is known to MPI_Init_thread(MPI_THREAD_MULTIPLE)
> >
> > note performance is ok with OSU benchmark that does not require
> > MPI_THREAD_MULTIPLE
> >
> > Cheers,
> >
> > Gilles
> >
> > On Wed, Jan 24, 2018 at 1:16 PM, Gilles Gouaillardet <gil...@rist.or.jp>
> wrote:
> >> Dave,
> >>
> >>
> >> one more question, are you running the openib/btl ? or other libraries
> such
> >> as MXM or UCX ?
> >>
> >>
> >> Cheers,
> >>
> >>
> >> Gilles
> >>
> >>
> >> On 1/24/2018 12:55 PM, Dave Turner wrote:
> >>>
> >>>
> >>>    We compiled OpenMPI 2.1.1 using the EasyBuild configuration
> >>> for CentOS as below and tested on Mellanox QDR hardware.
> >>>
> >>> ./configure --prefix=/homes/daveturner/libs/openmpi-2.1.1c
> >>>                  --enable-shared
> >>>                  --enable-mpi-thread-multiple
> >>>                  --with-verbs
> >>>                  --enable-mpirun-prefix-by-default
> >>>                  --with-mpi-cxx
> >>>                  --enable-mpi-cxx
> >>>                  --with-hwloc=$EBROOTHWLOC
> >>>                  --disable-dlopen
> >>>
> >>> The red curve in the attached NetPIPE graph shows the poor performance
> >>> above
> >>> 8 kB for the uni-directional tests with bi-directional and aggregate
> >>> tests also showing similar problems.  When I compile using the same
> >>> configuration but with the --disable-dlopen parameter removed then the
> >>> performance is very good as the green curve in the graph shows.
> >>>
> >>> We see the same problems with OpenMPI 2.0.2.
> >>> Replacing --disable-dlopen with --disable-mca-dso showed good
> performance.
> >>> Replacing --disable-dlopen with --enable-static showed good
> performance.
> >>> So it's only --disable-dlopen that leads to poor performance.
> >>>
> >>> http://netpipe.cs.ksu.edu
> >>>
> >>>                    Dave Turner
> >>>
> >>> --
> >>> Work: davetur...@ksu.edu <mailto:davetur...@ksu.edu>     (785)
> 532-7791
> >>>              2219 Engineering Hall, Manhattan KS  66506
> >>> Home: drdavetur...@gmail.com <mailto:drdavetur...@gmail.com>
> >>>               cell: (785) 770-5929
> >>>
> >>>
> >>> _______________________________________________
> >>> devel mailing list
> >>> devel@lists.open-mpi.org
> >>> https://lists.open-mpi.org/mailman/listinfo/devel
> >>
> >>
> >> _______________________________________________
> >> devel mailing list
> >> devel@lists.open-mpi.org
> >> https://lists.open-mpi.org/mailman/listinfo/devel
>
>
> ------------------------------
>
> Message: 8
> Date: Tue, 23 Jan 2018 21:28:32 -0800
> From: Paul Hargrove <phhargr...@lbl.gov>
> To: Open MPI Developers <devel@lists.open-mpi.org>
> Subject: Re: [OMPI devel] Poor performance when compiling with
>         --disable-dlopen
> Message-ID:
>         <CAAvDA177bOSj_5W9oCt_8NTpZsF75V=9Jkpkg1SFhsov_Q=obg
> @mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Ah, this sounds familiar.
>
> I believe that the issue Dave sees is that without patcher/overwrite the
> "leave pinned" protocol is OFF by default.
>
> Use of '-mca mpi_leave_pinned 1' may help if my guess is right.
> HOWEVER, w/o the memory management hooks provided by patcher/overwrite,
> leave pinned can give incorrect results.
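>
> For example (the benchmark binary name here is just illustrative):
>
>    mpirun -np 2 --mca mpi_leave_pinned 1 IMB-MPI1 PingPong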
>
> -Paul
>
> --
> Paul H. Hargrove <phhargr...@lbl.gov>
> Computer Languages & Systems Software (CLaSS) Group
> Computer Science Department
> Lawrence Berkeley National Laboratory
>
> ------------------------------
>
> Message: 9
> Date: Wed, 24 Jan 2018 14:39:28 +0900
> From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> To: Open MPI Developers <devel@lists.open-mpi.org>
> Subject: Re: [OMPI devel] Poor performance when compiling with
>         --disable-dlopen
> Message-ID:
>         <CAAkFZ5tCh0dw6=+RQXeLE_qw06S46qpireSEf5uaLa4=fysQWA@
> mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Thanks Paul,
>
> unfortunately, that did not help :-(
>
> performance is just as bad even with --mca mpi_leave_pinned 1
>
> and, surprisingly, when patcher/overwrite is used, performance is not
> worse with --mca mpi_leave_pinned 0
>
>
> Cheers,
>
> Gilles
>
>
> ------------------------------
>
> Message: 10
> Date: Tue, 23 Jan 2018 23:55:35 -0600
> From: Dave Turner <drdavetur...@gmail.com>
> To: Gilles Gouaillardet <gilles.gouaillar...@gmail.com>,  Open MPI
>         Developers <devel@lists.open-mpi.org>
> Subject: Re: [OMPI devel] Poor performance when compiling with
>         --disable-dlopen
> Message-ID:
>         <CAFGXdkyf5QJ46RrTT1=fXZr0uBg8xAHNV=C0GPAV3OEnzupxx
> q...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Gilles,
>
>    I'm using NetPIPE, which is available at http://netpipe.cs.ksu.edu
> My base test is uni-directional, with one process on a node communicating
> with a process on a second node.
>
> make mpi                                # builds the NPmpi binary
> mpirun -np 2 --hostfile=hf.2p2n NPmpi   # one process on each of two nodes
> cat hf.2p2n                             # the hostfile:
> node0 slots=1
> node1 slots=1
>
> NetPIPE does not do any MPI_Init_thread().
> Tests on the configs below give good performance with and without
> --enable-mpi-thread-multiple, so I don't think that's the issue.
>
> configure --prefix=/homes/daveturner/libs/openmpi-2.1.1
> --enable-mpi-fortran=all --with-verbs --enable-ipv6 --enable-mpi-cxx
> configure --prefix=/homes/daveturner/libs/openmpi-2.1.1
> --enable-mpi-fortran=all --with-verbs --enable-ipv6 --enable-mpi-cxx
> --enable-mpi-thread-multiple
>
>                   Dave
>
>
> --
> Work:     davetur...@ksu.edu     (785) 532-7791
>              2219 Engineering Hall, Manhattan KS  66506
> Home:    drdavetur...@gmail.com
>               cell: (785) 770-5929
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> devel mailing list
> devel@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/devel
>
> ------------------------------
>
> End of devel Digest, Vol 3576, Issue 1
> **************************************
>



-- 
Work:     davetur...@ksu.edu     (785) 532-7791
             2219 Engineering Hall, Manhattan KS  66506
Home:    drdavetur...@gmail.com
              cell: (785) 770-5929
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel
