Well, now it is a multi-line patch, and it is more hacky... but this
works for me.  Suggestions for a better way to upstream this
functionality would be appreciated.

--- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-26 11:13:55.753451526 -0600
+++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-26 11:25:56.738822308 -0600
@@ -26,7 +26,8 @@
           [AC_ARG_WITH([ucx],
         [AC_HELP_STRING([--with-ucx(=DIR)],
         [Build with Unified Communication X library support])])
-    OPAL_CHECK_WITHDIR([ucx], [$with_ucx], [include/ucp/api/ucp.h])
+    AS_IF([test "$with_ucx" != "from_runtime_env"],
+          [OPAL_CHECK_WITHDIR([ucx], [$with_ucx], [include/ucp/api/ucp.h])])
     AC_ARG_WITH([ucx-libdir],
         [AC_HELP_STRING([--with-ucx-libdir=DIR],
         [Search for Unified Communication X libraries in DIR])])
@@ -41,7 +42,7 @@
                                                     [ompi_check_ucx_dir=])],
                                              [true])])
                   ompi_check_ucx_happy="no"
-                  AS_IF([test -z "$ompi_check_ucx_dir"],
+                  AS_IF([test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"],
                         [OPAL_CHECK_PACKAGE([ompi_check_ucx],
                                    [ucp/api/ucp.h],
                                    [ucp],
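
For anyone who wants to see what the changed test in the second hunk
does without rerunning autogen, here is a minimal sketch in plain sh.
The function name ucx_search_mode and the two result strings are made
up for illustration; only the test expression itself is taken from the
patch. It assumes ompi_check_ucx_dir ends up holding the value given
to --with-ucx, which the hunk context does not show.

```shell
#!/bin/sh
# Hypothetical helper: mirrors only the condition used in the patched AS_IF.
# "linker-search" stands for the first AS_IF branch (the OPAL_CHECK_PACKAGE
# call visible in the hunk); "explicit-dir" stands for whatever else
# configure does when a real directory is given.
ucx_search_mode() {
    ompi_check_ucx_dir=$1
    if test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"; then
        echo "linker-search"
    else
        echo "explicit-dir"
    fi
}

ucx_search_mode ""                  # prints linker-search
ucx_search_mode "from_runtime_env"  # prints linker-search
ucx_search_mode "/opt/ucx/1.9.0"    # prints explicit-dir (path is an example)
```

So the sentinel value is treated the same as an empty directory, and any
other value still goes down the path that existed before the patch.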

On Tue, Jan 26, 2021 at 12:04 PM Tim Mattox <tmat...@gmail.com> wrote:
>
> Ugh, apparently my one-line patch to
> openmpi-4.1.0/config/ompi_check_ucx.m4 wasn't sufficient on a fresh
> install... debugging...
>
> On Tue, Jan 26, 2021 at 10:16 AM Tim Mattox <tmat...@gmail.com> wrote:
> >
> > My environment modules were already setting LD_LIBRARY_PATH to point
> > to my UCX lib directory.
> >
> > The real problem was that OMPI's config/ompi_check_ucx.m4 was
> > recording the full path to the UCX library if it wasn't found in a
> > standard system location (e.g. /lib, /lib64, /usr/lib, etc.).
> > That is normally a good thing to do, since the average mpirun user is
> > less likely to have set up their LD_LIBRARY_PATH correctly than the
> > software installer is (I hope).
> > In my case however, I'm setting up the environment modules to enforce
> > the LD_LIBRARY_PATH to have an ABI compatible UCX available.
> >
> > I found two ways to override ompi_check_ucx.m4 to just let the linker
> > find the UCX libraries on its own:
> > 1) Modify UCX's lib/pkgconfig/ucx.pc to have "prefix = /usr" so that
> > ompi_check_ucx.m4 would think UCX was in a standard system location
> > (which is a lie).
> > 2) Patch the OMPI configure system to allow me to force it to not
> > hard-code the path to UCX into the OMPI .so files such as
> > libmca_common_ucx.so
> >
> > I chose the latter, since there might be other users of my UCX
> > installs, and I didn't want to possibly break them by having
> > pkg-config lie.
> > And, I was already having to patch an OMPI config/*.m4 file for a
> > different reason, and was thus already having to run "./autogen.pl -f"
> > anyway.
> > So, with the patch below, I can now do an OMPI configure line with
> > "... --with-ucx=from_runtime_env ..." and the resulting build will use
> > whatever UCX is found at link time.
> > Normally, that would be scary, since who knows what ancient version of
> > UCX might be sitting around from the Linux distro, but my environment
> > modules are enforcing the dependencies so that my UCX libraries will
> > be found, even if a user does a "module swap".  Hey, a cool benefit of
> > lmod or modules v4+, but I digress.
> >
> > Anyway, if there is some other OMPI sanctioned way to do this, please
> > let me know, or suggest a better way so I can upstream a patch.  I
> > have vague memories of being able to force this kind of behavior by
> > doing something like "configure ... --with-ucx=/usr ...", but I
> > couldn't find any documentation for that, and code inspection
> > of the m4 files revealed that such a feature (if it was an intended
> > feature) had bitrotted.
> >
> > --- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-25 18:23:17.112499399 -0600
> > +++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-25 20:25:15.919338784 -0600
> > @@ -41,7 +41,7 @@
> >                                                      [ompi_check_ucx_dir=])],
> >                                               [true])])
> >                    ompi_check_ucx_happy="no"
> > -                  AS_IF([test -z "$ompi_check_ucx_dir"],
> > +                  AS_IF([test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"],
> >                          [OPAL_CHECK_PACKAGE([ompi_check_ucx],
> >                                     [ucp/api/ucp.h],
> >                                     [ucp],
> >
> > Oddly, the Open MPI configure script already has overrides for
> > ucx_CFLAGS, ucx_LIBS, and ucx_STATIC_LIBS, but nothing for something
> > like "ucx_LDFLAGS=''".
> > I didn't see a simple way to add support for such an override without
> > some more extensive changes to multiple m4 files.
> >
> > On Sun, Jan 24, 2021 at 7:08 PM Gilles Gouaillardet via devel
> > <devel@lists.open-mpi.org> wrote:
> > >
> > > Tim,
> > >
> > > Have you tried using LD_LIBRARY_PATH?
> > > I guess "hardcoding the full path" means "link with -rpath", and IIRC,
> > > LD_LIBRARY_PATH
> > > overrides this setting.
> > >
> > >
> > > If this does not work, here is something you can try (disclaimer: I did not)
> > >
> > > export LD_LIBRARY_PATH=/same/install/prefix/ucx/1.9.0/lib
> > > configure ... --with-ucx \
> > >     CPPFLAGS=-I/same/install/prefix/ucx/1.9.0/include \
> > >     LDFLAGS=-L/same/install/prefix/ucx/1.9.0/lib
> > >
> > > I expect the UCX components use libuct.so instead of
> > > /same/install/prefix/ucx/1.9.0/lib/libuct.so.
> > > If your users want the debug version, then you can simply change your
> > > LD_LIBRARY_PATH
> > > (module swap ucx should do the trick)
> > >
> > > Three caveats you should keep in mind:
> > >  - it is your responsibility to ensure the debug and prod versions of
> > > UCX are ABI compatible
> > >  - it will be mandatory to load a ucx module (otherwise Open MPI won't
> > > find UCX libraries)
> > >  - this is a guess and I did not test this.
> > >
> > >
> > > Another option (I did not try it) would be to install UCX on your build
> > > machine in /usr
> > > (since I expect /usr/lib/libuct.so is not hardcoded) and then use
> > > LD_LIBRARY_PATH
> > > (I assume your ucx module sets it) to point to the UCX flavor of your
> > > choice.
> > >
> > > Cheers,
> > >
> > > Gilles
> > >
> > > On Mon, Jan 25, 2021 at 7:43 AM Tim Mattox via devel
> > > <devel@lists.open-mpi.org> wrote:
> > > >
> > > > I'm specifically wanting my users to be able to load a "debug" vs.
> > > > "tuned" UCX module, without me having to make two different Open MPI
> > > > installs... the combinatorics get bad after a few versions.... (I'm
> > > > already having multiple versions of Open MPI to handle the differences
> > > > in Fortran mpi mod files for various compilers.)
> > > > Here are the differences in the configure options between the two UCX 
> > > > modules:
> > > > debug version: --enable-logging --enable-debug --enable-assertions
> > > > --enable-params-check --prefix=/same/install/prefix/ucx/1.9.0/debug
> > > > tuned version: --disable-logging --disable-debug --disable-assertions
> > > > --disable-params-check --prefix=/same/install/prefix/ucx/1.9.0/tuned
> > > >
> > > > We noticed that the --enable-debug option for UCX has a pretty
> > > > dramatic performance hit for one application (so far).
> > > > I've already tested that everything works fine if I replace UCX's .so
> > > > files manually in the filesystem, and the "new/changed" ones get
> > > > loaded, but a user can't make that kind of swap.
> > > > My hope is a user could type "module swap ucx/1.9.0/tuned
> > > > ucx/1.9.0/debug" when they want to enable debugging at the UCX layer.
> > > >
> > > > On Sun, Jan 24, 2021 at 4:43 PM Yossi Itigin <yos...@nvidia.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > > One option is to use LD_PRELOAD to load all UCX libraries from a
> > > > > > specific location.
> > > > > > For example:
> > > > > > mpirun -x LD_PRELOAD=<path-to-libucp.so>:<path-to-libuct.so>:<path-to-libucs.so>:<path-to-libucm.so> ... <exe> <args>
> > > > >
> > > > > BTW, what is different about the other UCX configuration? Maybe this 
> > > > > is something which can be resolved another way.
> > > > >
> > > > > --Yossi
> > > > >
> > > > > -----Original Message-----
> > > > > > From: devel <devel-boun...@lists.open-mpi.org> On Behalf Of Tim Mattox via devel
> > > > > > Sent: Sunday, 24 January 2021 23:18
> > > > > > To: devel@lists.open-mpi.org
> > > > > > Cc: Tim Mattox <tmat...@gmail.com>
> > > > > > Subject: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?
> > > > >
> > > > > Hello,
> > > > > I've run into an application that has its performance dramatically 
> > > > > affected by some configuration options to the underlying UCX library.
> > > > > Is there a way to configure/build Open MPI so that which UCX library 
> > > > > is used is determined at runtime (e.g. by an environment module), 
> > > > > rather than having to configure/build different instances of Open MPI?
> > > > >
> > > > > When I configure Open MPI 4.1.0 with "--with-ucx" it is hardcoding 
> > > > > the full path to the UCX .so library files to the UCX version it 
> > > > > found at configure time.
> > > > > --
> > > > > Tim Mattox, Ph.D. - tmat...@gmail.com
> > > >
> > > >
> > > >
> > > > --
> > > > Tim Mattox, Ph.D. - tmat...@gmail.com
> >
> >
> >
> > --
> > Tim Mattox, Ph.D. - tmat...@gmail.com
>
>
>
> --
> Tim Mattox, Ph.D. - tmat...@gmail.com



-- 
Tim Mattox, Ph.D. - tmat...@gmail.com
