Well, now it is a multi-line patch, and it is more hacky... but this works for me. Suggestions for a better way to upstream this functionality would be appreciated.
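In case the second hunk of the patch is hard to read in m4 form: AS_IF expands to an ordinary shell "if", so the condition it adds can be tried on its own. What follows is only a minimal sketch in POSIX sh. The function name use_default_search is made up for illustration and does not exist in the Open MPI tree, and the example prefix is the placeholder path used further down in this thread.

```shell
#!/bin/sh
# Sketch of the condition added by the second hunk of the patch.
# use_default_search is a hypothetical name, not part of Open MPI.
use_default_search() {
    dir=$1
    # True when no directory was given, or when the sentinel value was passed.
    test -z "$dir" || test "$dir" = "from_runtime_env"
}

# Try the three interesting cases: empty, sentinel, and a real directory.
for value in "" "from_runtime_env" "/same/install/prefix/ucx/1.9.0/tuned"; do
    if use_default_search "$value"; then
        echo "'$value': takes the same branch as an empty value"
    else
        echo "'$value': treated as a UCX install directory"
    fi
done
```

The first hunk is the mirror image: it skips OPAL_CHECK_WITHDIR only when the value is exactly "from_runtime_env", so a real directory is still validated as before.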
--- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-26 11:13:55.753451526 -0600
+++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-26 11:25:56.738822308 -0600
@@ -26,7 +26,8 @@
         [AC_ARG_WITH([ucx],
             [AC_HELP_STRING([--with-ucx(=DIR)],
                 [Build with Unified Communication X library support])])
-        OPAL_CHECK_WITHDIR([ucx], [$with_ucx], [include/ucp/api/ucp.h])
+        AS_IF([test "$with_ucx" != "from_runtime_env"],
+            [OPAL_CHECK_WITHDIR([ucx], [$with_ucx], [include/ucp/api/ucp.h])])
         AC_ARG_WITH([ucx-libdir],
             [AC_HELP_STRING([--with-ucx-libdir=DIR],
                 [Search for Unified Communication X libraries in DIR])])
@@ -41,7 +42,7 @@
                 [ompi_check_ucx_dir=])],
             [true])])
         ompi_check_ucx_happy="no"
-        AS_IF([test -z "$ompi_check_ucx_dir"],
+        AS_IF([test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"],
         [OPAL_CHECK_PACKAGE([ompi_check_ucx],
             [ucp/api/ucp.h],
             [ucp],

On Tue, Jan 26, 2021 at 12:04 PM Tim Mattox <tmat...@gmail.com> wrote:
>
> Ugh, apparently my one-line patch to
> openmpi-4.1.0/config/ompi_check_ucx.m4 wasn't sufficient on a fresh
> install... debugging...
>
> On Tue, Jan 26, 2021 at 10:16 AM Tim Mattox <tmat...@gmail.com> wrote:
> >
> > My environment modules were already setting LD_LIBRARY_PATH to point
> > to my UCX lib directory.
> >
> > The real problem was that OMPI's config/ompi_check_ucx.m4 was
> > recording the full path to the UCX library if it wasn't found in a
> > standard system location (e.g. /lib, /lib64, /usr/lib, etc.).
> > That is normally a good thing to do, since the chances that the
> > average mpirun user will have setup their LD_LIBRARY_PATH is lower
> > than the software installer would have done so correctly (I hope).
> > In my case however, I'm setting up the environment modules to enforce
> > the LD_LIBRARY_PATH to have an ABI compatible UCX available.
> >
> > I found two ways to override ompi_check_ucx.m4 to just let the linker
> > find the UCX libraries on its own:
> > 1) Modify UCX's lib/pkgconfig/ucx.pc to have "prefix = /usr" so that
> > ompi_check_ucx.m4 would think UCX was in a standard system location
> > (which is a lie).
> > 2) Patch the OMPI configure system to allow me to force it to not
> > hard-code the path to UCX into the OMPI .so files such as
> > libmca_common_ucx.so
> >
> > I chose the latter, since there might be other users of my UCX
> > installs, and I didn't want to possibly break them by having
> > pkg_config lie.
> > And, I was already having to patch an OMPI config/*.m4 file for a
> > different reason, and was thus already having to run "./autogen.pl -f"
> > anyway.
> > So, with the patch below, I can now do an OMPI configure line with
> > "... --with-ucx=from_runtime_env ..." and the resulting build will use
> > whatever UCX is found at link time.
> > Normally, that would be scary, since who knows what ancient version of
> > UCX might be sitting around from the Linux distro, but my environment
> > modules are enforcing the dependencies so that my UCX libraries will
> > be found, even if a user does a "module swap". Hey, a cool benefit of
> > lmod or modules v4+, but I digress.
> >
> > Anyway, if there is some other OMPI sanctioned way to do this, please
> > let me know, or suggest a better way so I can upstream a patch. I
> > have vague memories of being able to force this kind of behavior by
> > doing something like "configure ... --with-ucx=/usr ...", but I
> > couldn't find any documentation for that, and code inspection of the
> > m4 files revealed that such a feature (if it was an intended
> > feature) had bitrotted.
> >
> > --- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-25 18:23:17.112499399 -0600
> > +++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-25 20:25:15.919338784 -0600
> > @@ -41,7 +41,7 @@
> >                  [ompi_check_ucx_dir=])],
> >              [true])])
> >          ompi_check_ucx_happy="no"
> > -        AS_IF([test -z "$ompi_check_ucx_dir"],
> > +        AS_IF([test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"],
> >          [OPAL_CHECK_PACKAGE([ompi_check_ucx],
> >              [ucp/api/ucp.h],
> >              [ucp],
> >
> > Oddly, the Open MPI configure script already has overrides for
> > ucx_CFLAGS, ucx_LIBS, and ucx_STATIC_LIBS, but nothing for something
> > like "ucx_LDFLAGS=''".
> > I didn't see a simple way to add support for such an override without
> > some more extensive changes to multiple m4 files.
> >
> > On Sun, Jan 24, 2021 at 7:08 PM Gilles Gouaillardet via devel
> > <devel@lists.open-mpi.org> wrote:
> > >
> > > Tim,
> > >
> > > Have you tried using LD_LIBRARY_PATH?
> > > I guess "hardcoding the full path" means "link with -rpath", and IIRC,
> > > LD_LIBRARY_PATH overrides this setting.
> > >
> > >
> > > If this does not work, here is something you can try (disclaimer: I did not)
> > >
> > > export LD_LIBRARY_PATH=/same/install/prefix/ucx/1.9.0/lib
> > > configure ... --with-ucx
> > > CPPFLAGS=-I/same/install/prefix/ucx/1.9.0/include
> > > LDFLAGS=-L/same/install/prefix/ucx/1.9.0/lib
> > >
> > > I expect the UCX components use libuct.so instead of
> > > /same/install/prefix/ucx/1.9.0/lib/libuct.so.
> > > If your users want the debug version, then you can simply change your
> > > LD_LIBRARY_PATH (module swap ucx should do the trick)
> > >
> > > Three caveats you should keep in mind:
> > > - it is your responsibility to ensure the debug and prod versions of
> > >   UCX are ABI compatible
> > > - it will be mandatory to load a ucx module (otherwise Open MPI won't
> > >   find UCX libraries)
> > > - this is a guess and I did not test this.
> > >
> > >
> > > Another option (I did not try) would be to install UCX on your build
> > > machine in /usr (since I expect /usr/lib/libuct.so is not hardcoded)
> > > and then use LD_LIBRARY_PATH (I assume your ucx module sets it) to
> > > point to the UCX flavor of your choice.
> > >
> > > Cheers,
> > >
> > > Gilles
> > >
> > > On Mon, Jan 25, 2021 at 7:43 AM Tim Mattox via devel
> > > <devel@lists.open-mpi.org> wrote:
> > > >
> > > > I'm specifically wanting my users to be able to load a "debug" vs.
> > > > "tuned" UCX module, without me having to make two different Open MPI
> > > > installs... the combinatorics get bad after a few versions.... (I'm
> > > > already having multiple versions of Open MPI to handle the differences
> > > > in Fortran mpi mod files for various compilers.)
> > > > Here are the differences in the configure options between the two UCX
> > > > modules:
> > > > debug version: --enable-logging --enable-debug --enable-assertions
> > > > --enable-params-check --prefix=/same/install/prefix/ucx/1.9.0/debug
> > > > tuned version: --disable-logging --disable-debug --disable-assertions
> > > > --disable-params-check --prefix=/same/install/prefix/ucx/1.9.0/tuned
> > > >
> > > > We noticed that the --enable-debug option for UCX has a pretty
> > > > dramatic performance hit for one application (so far).
> > > > I've already tested that everything works fine if I replace UCX's .so
> > > > files manually in the filesystem, and the "new/changed" ones get
> > > > loaded, but a user can't make that kind of swap.
> > > > My hope is a user could type "module swap ucx/1.9.0/tuned
> > > > ucx/1.9.0/debug" when they want to enable debugging at the UCX layer.
> > > >
> > > > On Sun, Jan 24, 2021 at 4:43 PM Yossi Itigin <yos...@nvidia.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > One option is to use LD_PRELOAD to load all ucx libraries from a
> > > > > specific location
> > > > > For example: mpirun -x
> > > > > LD_PRELOAD=<path-to-libucp.so>:<path-to-libuct.so>:<path-to-libucs.so>:<path-to-libucm.so>
> > > > > ... <exe> <args>
> > > > >
> > > > > BTW, what is different about the other UCX configuration? Maybe this
> > > > > is something which can be resolved another way.
> > > > >
> > > > > --Yossi
> > > > >
> > > > > -----Original Message-----
> > > > > From: devel <devel-boun...@lists.open-mpi.org> On Behalf Of Tim
> > > > > Mattox via devel
> > > > > Sent: Sunday, 24 January 2021 23:18
> > > > > To: devel@lists.open-mpi.org
> > > > > Cc: Tim Mattox <tmat...@gmail.com>
> > > > > Subject: [OMPI devel] How to build Open MPI so the UCX used can be
> > > > > changed at runtime?
> > > > >
> > > > > Hello,
> > > > > I've run into an application that has its performance dramatically
> > > > > affected by some configuration options to the underlying UCX library.
> > > > > Is there a way to configure/build Open MPI so that which UCX library
> > > > > is used is determined at runtime (e.g. by an environment module),
> > > > > rather than having to configure/build different instances of Open MPI?
> > > > >
> > > > > When I configure Open MPI 4.1.0 with "--with-ucx" it is hardcoding
> > > > > the full path to the UCX .so library files to the UCX version it
> > > > > found at configure time.
> > > > > --
> > > > > Tim Mattox, Ph.D. - tmat...@gmail.com
> > > > >
> > > >
> > > > --
> > > > Tim Mattox, Ph.D. - tmat...@gmail.com
> > > >
> >
> > --
> > Tim Mattox, Ph.D. - tmat...@gmail.com
> >
>
> --
> Tim Mattox, Ph.D. - tmat...@gmail.com

--
Tim Mattox, Ph.D. - tmat...@gmail.com
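P.S. For anyone wanting to check whether a given build still has a UCX path baked in, one possible approach is to look at the dynamic section of the UCX-facing library with readelf. This is a rough sketch rather than a tested recipe: filter_link_entries is a hypothetical helper name, OMPI_PREFIX is an assumed variable standing for your Open MPI install prefix, and the exact location of libmca_common_ucx.so under it may differ on your system.

```shell
#!/bin/sh
# filter_link_entries is a hypothetical helper: it reads "readelf -d" output
# on stdin and keeps only the NEEDED, RPATH and RUNPATH entries, which are
# the places a hard-coded library path or search directory would show up.
filter_link_entries() {
    grep -E '\((NEEDED|RPATH|RUNPATH)\)'
}

# Usage (OMPI_PREFIX and the lib/ location are assumptions, adjust as needed):
#   readelf -d "$OMPI_PREFIX/lib/libmca_common_ucx.so" | filter_link_entries
```

If the output shows no RPATH/RUNPATH entry pointing at a UCX directory and the NEEDED entries are bare library names, the runtime linker is free to pick up whichever UCX the loaded module puts on LD_LIBRARY_PATH.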