My environment modules were already setting LD_LIBRARY_PATH to point to my UCX lib directory.
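As a rough illustration of what that setting does, here is a small sketch that reports which directory in a colon-separated search path would supply a given library first. The helper name first_lib_dir is made up for this example. It only mimics the LD_LIBRARY_PATH part of the lookup and does not consider any path that was embedded in a binary at build time.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI, UCX, or any modules tool):
# print the first directory in a colon-separated search path that contains
# a file whose name starts with the given library name (e.g. libucp.so).
# Assumption: a name-prefix match is "good enough" for a sanity check; the
# real dynamic linker matches the exact SONAME and also honors paths
# embedded in the binary, which this sketch ignores.
first_lib_dir() {
    lib_name=$1
    search_path=$2
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        IFS=$old_ifs
        # Skip empty entries such as a leading or doubled colon.
        [ -n "$dir" ] || continue
        for candidate in "$dir/$lib_name"*; do
            if [ -e "$candidate" ]; then
                printf '%s\n' "$dir"
                return 0
            fi
        done
    done
    IFS=$old_ifs
    return 1
}

# Which directory would supply libucp.so in the current environment?
first_lib_dir libucp.so "${LD_LIBRARY_PATH:-}" \
    || echo "libucp.so not found in LD_LIBRARY_PATH"
```

Running it before and after a "module swap" shows whether the swap actually changed which UCX directory comes first.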
The real problem was that OMPI's config/ompi_check_ucx.m4 was recording the full path to the UCX library if it wasn't found in a standard system location (e.g. /lib, /lib64, /usr/lib, etc.). That is normally a good thing to do, since the average mpirun user is less likely to have set up their LD_LIBRARY_PATH correctly than the software installer is (I hope). In my case, however, I'm setting up the environment modules to enforce that LD_LIBRARY_PATH has an ABI-compatible UCX available.

I found two ways to override ompi_check_ucx.m4 to just let the linker find the UCX libraries on its own:
1) Modify UCX's lib/pkgconfig/ucx.pc to have "prefix = /usr" so that ompi_check_ucx.m4 would think UCX was in a standard system location (which is a lie).
2) Patch the OMPI configure system to allow me to force it to not hard-code the path to UCX into the OMPI .so files such as libmca_common_ucx.so

I chose the latter, since there might be other users of my UCX installs, and I didn't want to possibly break them by having pkg-config lie. Also, I was already having to patch an OMPI config/*.m4 file for a different reason, and was thus already having to run "./autogen.pl -f" anyway.

So, with the patch below, I can now do an OMPI configure line with "... --with-ucx=from_runtime_env ..." and the resulting build will use whatever UCX the dynamic linker finds at run time. Normally, that would be scary, since who knows what ancient version of UCX might be sitting around from the Linux distro, but my environment modules are enforcing the dependencies so that my UCX libraries will be found, even if a user does a "module swap". Hey, a cool benefit of lmod or modules v4+, but I digress.

Anyway, if there is some other OMPI-sanctioned way to do this, please let me know, or suggest a better way so I can upstream a patch. I have vague memories of being able to force this kind of behavior by doing something like "configure ... --with-ucx=/usr ...", but I couldn't find any documentation for that, and code inspection of the m4 files revealed that such a feature (if it was an intended feature) had bitrotted.

--- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-25 18:23:17.112499399 -0600
+++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-25 20:25:15.919338784 -0600
@@ -41,7 +41,7 @@
 [ompi_check_ucx_dir=])],
 [true])])
 ompi_check_ucx_happy="no"
- AS_IF([test -z "$ompi_check_ucx_dir"],
+ AS_IF([test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"],
 [OPAL_CHECK_PACKAGE([ompi_check_ucx],
 [ucp/api/ucp.h],
 [ucp],

Oddly, the Open MPI configure script already has overrides for ucx_CFLAGS, ucx_LIBS, and ucx_STATIC_LIBS, but nothing for something like "ucx_LDFLAGS=''". I didn't see a simple way to add support for such an override without some more extensive changes to multiple m4 files.

On Sun, Jan 24, 2021 at 7:08 PM Gilles Gouaillardet via devel <devel@lists.open-mpi.org> wrote:
>
> Tim,
>
> Have you tried using LD_LIBRARY_PATH?
> I guess "hardcoding the full path" means "link with -rpath", and IIRC,
> LD_LIBRARY_PATH overrides this setting.
>
> If this does not work, here is something you can try (disclaimer: I did not)
>
> export LD_LIBRARY_PATH=/same/install/prefix/ucx/1.9.0/lib
> configure ... --with-ucx CPPFLAGS=-I/same/install/prefix/ucx/1.9.0/include LDFLAGS=-L/same/install/prefix/ucx/1.9.0/lib
>
> I expect the UCX components use libuct.so instead of
> /same/install/prefix/ucx/1.9.0/lib/libuct.so.
> If your users want the debug version, then you can simply change your
> LD_LIBRARY_PATH (module swap ucx should do the trick)
>
> Three caveats you should keep in mind:
> - it is your responsibility to ensure the debug and prod versions of
>   UCX are ABI compatible
> - it will be mandatory to load a ucx module (otherwise Open MPI won't
>   find UCX libraries)
> - this is a guess and I did not test this.
>
> Another option (I did not try) would be to install UCX on your build
> machine in /usr (since I expect /usr/lib/libuct.so is not hardcoded)
> and then use LD_LIBRARY_PATH (I assume your ucx module sets it) to
> point to the UCX flavor of your choice.
>
> Cheers,
>
> Gilles
>
> On Mon, Jan 25, 2021 at 7:43 AM Tim Mattox via devel
> <devel@lists.open-mpi.org> wrote:
> >
> > I'm specifically wanting my users to be able to load a "debug" vs.
> > "tuned" UCX module, without me having to make two different Open MPI
> > installs... the combinatorics get bad after a few versions.... (I'm
> > already having multiple versions of Open MPI to handle the differences
> > in Fortran mpi mod files for various compilers.)
> > Here are the differences in the configure options between the two UCX
> > modules:
> > debug version: --enable-logging --enable-debug --enable-assertions
> >   --enable-params-check --prefix=/same/install/prefix/ucx/1.9.0/debug
> > tuned version: --disable-logging --disable-debug --disable-assertions
> >   --disable-params-check --prefix=/same/install/prefix/ucx/1.9.0/tuned
> >
> > We noticed that the --enable-debug option for UCX has a pretty
> > dramatic performance hit for one application (so far).
> > I've already tested that everything works fine if I replace UCX's .so
> > files manually in the filesystem, and the "new/changed" ones get
> > loaded, but a user can't make that kind of swap.
> > My hope is a user could type "module swap ucx/1.9.0/tuned
> > ucx/1.9.0/debug" when they want to enable debugging at the UCX layer.
> >
> > On Sun, Jan 24, 2021 at 4:43 PM Yossi Itigin <yos...@nvidia.com> wrote:
> > >
> > > Hi,
> > >
> > > One option is to use LD_PRELOAD to load all ucx libraries from a
> > > specific location.
> > > For example: mpirun -x
> > > LD_PRELOAD=<path-to-libucp.so>:<path-to-libuct.so>:<path-to-libucs.so>:<path-to-libucm.so>
> > > ... <exe> <args>
> > >
> > > BTW, what is different about the other UCX configuration? Maybe this
> > > is something which can be resolved another way.
> > >
> > > --Yossi
> > >
> > > -----Original Message-----
> > > From: devel <devel-boun...@lists.open-mpi.org> On Behalf Of Tim Mattox
> > > via devel
> > > Sent: Sunday, 24 January 2021 23:18
> > > To: devel@lists.open-mpi.org
> > > Cc: Tim Mattox <tmat...@gmail.com>
> > > Subject: [OMPI devel] How to build Open MPI so the UCX used can be
> > > changed at runtime?
> > >
> > > Hello,
> > > I've run into an application that has its performance dramatically
> > > affected by some configuration options to the underlying UCX library.
> > > Is there a way to configure/build Open MPI so that which UCX library is
> > > used is determined at runtime (e.g. by an environment module), rather
> > > than having to configure/build different instances of Open MPI?
> > >
> > > When I configure Open MPI 4.1.0 with "--with-ucx" it is hardcoding the
> > > full path to the UCX .so library files to the UCX version it found at
> > > configure time.
> > > --
> > > Tim Mattox, Ph.D. - tmat...@gmail.com
> > >
> >
> > --
> > Tim Mattox, Ph.D. - tmat...@gmail.com

--
Tim Mattox, Ph.D. - tmat...@gmail.com