Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?
Tim,

a simple option is to

configure ... LDFLAGS="-Wl,--enable-new-dtags"

If Open MPI is built with this option, then LD_LIBRARY_PATH takes
precedence over rpath (the default is the opposite, as correctly
pointed out by Yossi in an earlier message).

Cheers,

Gilles

On 1/27/2021 2:48 AM, Tim Mattox via devel wrote:
> Well, now it is a multi-line patch, and it is more hacky... but this
> works for me. Suggestions for a better thing to do to upstream this
> functionality would be appreciated.
>
> --- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-26 11:13:55.753451526 -0600
> +++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-26 11:25:56.738822308 -0600
> @@ -26,7 +26,8 @@
>      [AC_ARG_WITH([ucx],
>          [AC_HELP_STRING([--with-ucx(=DIR)],
>              [Build with Unified Communication X library support])])
> -    OPAL_CHECK_WITHDIR([ucx], [$with_ucx], [include/ucp/api/ucp.h])
> +    AS_IF([test "$with_ucx" != "from_runtime_env"],
> +        [OPAL_CHECK_WITHDIR([ucx], [$with_ucx], [include/ucp/api/ucp.h])])
>      AC_ARG_WITH([ucx-libdir],
>          [AC_HELP_STRING([--with-ucx-libdir=DIR],
>              [Search for Unified Communication X libraries in DIR])])
> @@ -41,7 +42,7 @@
>              [ompi_check_ucx_dir=])],
>          [true])])
>      ompi_check_ucx_happy="no"
> -    AS_IF([test -z "$ompi_check_ucx_dir"],
> +    AS_IF([test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"],
>          [OPAL_CHECK_PACKAGE([ompi_check_ucx],
>              [ucp/api/ucp.h],
>              [ucp],
>
> On Tue, Jan 26, 2021 at 12:04 PM Tim Mattox wrote:
> > Ugh, apparently my one-line patch to
> > openmpi-4.1.0/config/ompi_check_ucx.m4 wasn't sufficient on a fresh
> > install... debugging...
> >
> > On Tue, Jan 26, 2021 at 10:16 AM Tim Mattox wrote:
> > > My environment modules were already setting LD_LIBRARY_PATH to point
> > > to my UCX lib directory.
> > >
> > > The real problem was that OMPI's config/ompi_check_ucx.m4 was
> > > recording the full path to the UCX library if it wasn't found in a
> > > standard system location (e.g. /lib, /lib64, /usr/lib, etc.).
> > > That is normally a good thing to do, since the chances that the
> > > average mpirun user will have set up their LD_LIBRARY_PATH correctly
> > > are lower than the chances that the software installer did (I hope).
> > > In my case however, I'm setting up the environment modules to enforce
> > > the LD_LIBRARY_PATH to have an ABI compatible UCX available.
> > >
> > > I found two ways to override ompi_check_ucx.m4 to just let the linker
> > > find the UCX libraries on its own:
> > > 1) Modify UCX's lib/pkgconfig/ucx.pc to have "prefix = /usr" so that
> > > ompi_check_ucx.m4 would think UCX was in a standard system location
> > > (which is a lie).
> > > 2) Patch the OMPI configure system to allow me to force it to not
> > > hard-code the path to UCX into the OMPI .so files such as
> > > libmca_common_ucx.so
> > >
> > > I chose the latter, since there might be other users of my UCX
> > > installs, and I didn't want to possibly break them by having
> > > pkg-config lie.
> > > And, I was already having to patch an OMPI config/*.m4 file for a
> > > different reason, and was thus already having to run "./autogen.pl -f"
> > > anyway.
> > > So, with the patch below, I can now do an OMPI configure line with
> > > "... --with-ucx=from_runtime_env ..." and the resulting build will use
> > > whatever UCX is found at link time.
> > > Normally, that would be scary, since who knows what ancient version of
> > > UCX might be sitting around from the Linux distro, but my environment
> > > modules are enforcing the dependencies so that my UCX libraries will
> > > be found, even if a user does a "module swap". Hey, a cool benefit of
> > > lmod or modules v4+, but I digress.
> > >
> > > Anyway, if there is some other OMPI sanctioned way to do this, please
> > > let me know, or suggest a better way so I can upstream a patch. I
> > > have vague memories of being able to force this kind of behavior by
> > > doing something like "configure ... --with-ucx=/usr ...", but I
> > > couldn't find any documentation for that, and code inspection
> > > of the m4 files revealed that such a feature (if it was an intended
> > > feature) had bitrotted.
> > >
> > > --- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-25 18:23:17.112499399 -0600
> > > +++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-25 20:25:15.919338784 -0600
> > > @@ -41,7 +41,7 @@
> > >              [ompi_check_ucx_dir=])],
> > >          [true])])
> > >      ompi_check_ucx_happy="no"
> > > -    AS_IF([test -z "$ompi_check_ucx_dir"],
> > > +    AS_IF([test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"],
> > >          [OPAL_CHECK_PACKAGE([ompi_check_ucx],
> > >              [ucp/api/ucp.h],
> > >              [ucp],
> > >
> > > Oddly, the Open MPI configure script already has overrides for
> > > ucx_CFLAGS, ucx_LIBS, and ucx_STATIC_LIBS, but nothing for
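
Whether a given build is affected by Gilles's suggestion can be checked from the dynamic section of the installed libraries. The following is a minimal shell sketch, not something posted in the thread: it reads `readelf -d` output on stdin and reports which search-path tag is present. The function name classify_dtags is hypothetical. It assumes the glibc loader's documented search order, where DT_RPATH is consulted before LD_LIBRARY_PATH, DT_RUNPATH is consulted after it, and DT_RPATH is ignored when DT_RUNPATH is also present.

```shell
# classify_dtags (hypothetical helper): read `readelf -d` output on stdin
# and report which embedded search-path tag the object carries.
classify_dtags() {
    awk '
        /\(RUNPATH\)/ { runpath = 1 }   # emitted when linked with --enable-new-dtags
        /\(RPATH\)/   { rpath = 1 }     # emitted when linked with --disable-new-dtags
        END {
            # DT_RUNPATH wins if both tags are present.
            if (runpath)
                print "RUNPATH: LD_LIBRARY_PATH is searched before the embedded path"
            else if (rpath)
                print "RPATH: the embedded path is searched before LD_LIBRARY_PATH"
            else
                print "none: no embedded search path"
        }'
}
```

Typical use would be along the lines of `readelf -d /path/to/libmca_common_ucx.so | classify_dtags`, with the path replaced by wherever the Open MPI install put that library.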
Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?
Well, now it is a multi-line patch, and it is more hacky... but this
works for me. Suggestions for a better thing to do to upstream this
functionality would be appreciated.

--- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-26 11:13:55.753451526 -0600
+++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-26 11:25:56.738822308 -0600
@@ -26,7 +26,8 @@
     [AC_ARG_WITH([ucx],
         [AC_HELP_STRING([--with-ucx(=DIR)],
             [Build with Unified Communication X library support])])
-    OPAL_CHECK_WITHDIR([ucx], [$with_ucx], [include/ucp/api/ucp.h])
+    AS_IF([test "$with_ucx" != "from_runtime_env"],
+        [OPAL_CHECK_WITHDIR([ucx], [$with_ucx], [include/ucp/api/ucp.h])])
     AC_ARG_WITH([ucx-libdir],
         [AC_HELP_STRING([--with-ucx-libdir=DIR],
             [Search for Unified Communication X libraries in DIR])])
@@ -41,7 +42,7 @@
             [ompi_check_ucx_dir=])],
         [true])])
     ompi_check_ucx_happy="no"
-    AS_IF([test -z "$ompi_check_ucx_dir"],
+    AS_IF([test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"],
         [OPAL_CHECK_PACKAGE([ompi_check_ucx],
             [ucp/api/ucp.h],
             [ucp],

On Tue, Jan 26, 2021 at 12:04 PM Tim Mattox wrote:
>
> Ugh, apparently my one-line patch to
> openmpi-4.1.0/config/ompi_check_ucx.m4 wasn't sufficient on a fresh
> install... debugging...
>
> On Tue, Jan 26, 2021 at 10:16 AM Tim Mattox wrote:
> >
> > My environment modules were already setting LD_LIBRARY_PATH to point
> > to my UCX lib directory.
> >
> > The real problem was that OMPI's config/ompi_check_ucx.m4 was
> > recording the full path to the UCX library if it wasn't found in a
> > standard system location (e.g. /lib, /lib64, /usr/lib, etc.).
> > That is normally a good thing to do, since the chances that the
> > average mpirun user will have set up their LD_LIBRARY_PATH correctly
> > are lower than the chances that the software installer did (I hope).
> > In my case however, I'm setting up the environment modules to enforce
> > the LD_LIBRARY_PATH to have an ABI compatible UCX available.
> >
> > I found two ways to override ompi_check_ucx.m4 to just let the linker
> > find the UCX libraries on its own:
> > 1) Modify UCX's lib/pkgconfig/ucx.pc to have "prefix = /usr" so that
> > ompi_check_ucx.m4 would think UCX was in a standard system location
> > (which is a lie).
> > 2) Patch the OMPI configure system to allow me to force it to not
> > hard-code the path to UCX into the OMPI .so files such as
> > libmca_common_ucx.so
> >
> > I chose the latter, since there might be other users of my UCX
> > installs, and I didn't want to possibly break them by having
> > pkg-config lie.
> > And, I was already having to patch an OMPI config/*.m4 file for a
> > different reason, and was thus already having to run "./autogen.pl -f"
> > anyway.
> > So, with the patch below, I can now do an OMPI configure line with
> > "... --with-ucx=from_runtime_env ..." and the resulting build will use
> > whatever UCX is found at link time.
> > Normally, that would be scary, since who knows what ancient version of
> > UCX might be sitting around from the Linux distro, but my environment
> > modules are enforcing the dependencies so that my UCX libraries will
> > be found, even if a user does a "module swap". Hey, a cool benefit of
> > lmod or modules v4+, but I digress.
> >
> > Anyway, if there is some other OMPI sanctioned way to do this, please
> > let me know, or suggest a better way so I can upstream a patch. I
> > have vague memories of being able to force this kind of behavior by
> > doing something like "configure ... --with-ucx=/usr ...", but I
> > couldn't find any documentation for that, and code inspection
> > of the m4 files revealed that such a feature (if it was an intended
> > feature) had bitrotted.
> >
> > --- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-25 18:23:17.112499399 -0600
> > +++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-25 20:25:15.919338784 -0600
> > @@ -41,7 +41,7 @@
> >              [ompi_check_ucx_dir=])],
> >          [true])])
> >      ompi_check_ucx_happy="no"
> > -    AS_IF([test -z "$ompi_check_ucx_dir"],
> > +    AS_IF([test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"],
> >          [OPAL_CHECK_PACKAGE([ompi_check_ucx],
> >              [ucp/api/ucp.h],
> >              [ucp],
> >
> > Oddly, the Open MPI configure script already has overrides for
> > ucx_CFLAGS, ucx_LIBS, and ucx_STATIC_LIBS, but nothing for something
> > like "ucx_LDFLAGS=''".
> > I didn't see a simple way to add support for such an
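
The condition that the patch adds becomes ordinary shell once m4 expands it, so its effect can be tried outside of configure. Below is a minimal sketch under that assumption. The function name ucx_dir_defers_to_linker is hypothetical and merely stands in for the patched AS_IF test; it succeeds for exactly the two values that take the first AS_IF branch (the OPAL_CHECK_PACKAGE call visible in the patch context), and fails for any real directory.

```shell
# ucx_dir_defers_to_linker (hypothetical name): mirrors the patched test
#   test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"
# The argument plays the role of $ompi_check_ucx_dir.
ucx_dir_defers_to_linker() {
    test -z "$1" || test "$1" = "from_runtime_env"
}

# Show which sample values would take the first AS_IF branch.
for value in "" "from_runtime_env" "/opt/ucx/1.9.0"; do
    if ucx_dir_defers_to_linker "$value"; then
        printf '%-20s -> no UCX directory recorded\n' "'$value'"
    else
        printf '%-20s -> explicit UCX directory\n' "'$value'"
    fi
done
```

The directory /opt/ucx/1.9.0 is only a sample value. Note that "from_runtime_env" is recognized only after applying the patch and rerunning "./autogen.pl -f"; it is not an option of stock Open MPI 4.1.0.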
Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?
Ugh, apparently my one-line patch to
openmpi-4.1.0/config/ompi_check_ucx.m4 wasn't sufficient on a fresh
install... debugging...

On Tue, Jan 26, 2021 at 10:16 AM Tim Mattox wrote:
>
> My environment modules were already setting LD_LIBRARY_PATH to point
> to my UCX lib directory.
>
> The real problem was that OMPI's config/ompi_check_ucx.m4 was
> recording the full path to the UCX library if it wasn't found in a
> standard system location (e.g. /lib, /lib64, /usr/lib, etc.).
> That is normally a good thing to do, since the chances that the
> average mpirun user will have set up their LD_LIBRARY_PATH correctly
> are lower than the chances that the software installer did (I hope).
> In my case however, I'm setting up the environment modules to enforce
> the LD_LIBRARY_PATH to have an ABI compatible UCX available.
>
> I found two ways to override ompi_check_ucx.m4 to just let the linker
> find the UCX libraries on its own:
> 1) Modify UCX's lib/pkgconfig/ucx.pc to have "prefix = /usr" so that
> ompi_check_ucx.m4 would think UCX was in a standard system location
> (which is a lie).
> 2) Patch the OMPI configure system to allow me to force it to not
> hard-code the path to UCX into the OMPI .so files such as
> libmca_common_ucx.so
>
> I chose the latter, since there might be other users of my UCX
> installs, and I didn't want to possibly break them by having
> pkg-config lie.
> And, I was already having to patch an OMPI config/*.m4 file for a
> different reason, and was thus already having to run "./autogen.pl -f"
> anyway.
> So, with the patch below, I can now do an OMPI configure line with
> "... --with-ucx=from_runtime_env ..." and the resulting build will use
> whatever UCX is found at link time.
> Normally, that would be scary, since who knows what ancient version of
> UCX might be sitting around from the Linux distro, but my environment
> modules are enforcing the dependencies so that my UCX libraries will
> be found, even if a user does a "module swap". Hey, a cool benefit of
> lmod or modules v4+, but I digress.
>
> Anyway, if there is some other OMPI sanctioned way to do this, please
> let me know, or suggest a better way so I can upstream a patch. I
> have vague memories of being able to force this kind of behavior by
> doing something like "configure ... --with-ucx=/usr ...", but I
> couldn't find any documentation for that, and code inspection
> of the m4 files revealed that such a feature (if it was an intended
> feature) had bitrotted.
>
> --- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-25 18:23:17.112499399 -0600
> +++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-25 20:25:15.919338784 -0600
> @@ -41,7 +41,7 @@
>              [ompi_check_ucx_dir=])],
>          [true])])
>      ompi_check_ucx_happy="no"
> -    AS_IF([test -z "$ompi_check_ucx_dir"],
> +    AS_IF([test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"],
>          [OPAL_CHECK_PACKAGE([ompi_check_ucx],
>              [ucp/api/ucp.h],
>              [ucp],
>
> Oddly, the Open MPI configure script already has overrides for
> ucx_CFLAGS, ucx_LIBS, and ucx_STATIC_LIBS, but nothing for something
> like "ucx_LDFLAGS=''".
> I didn't see a simple way to add support for such an override without
> some more extensive changes to multiple m4 files.
>
> On Sun, Jan 24, 2021 at 7:08 PM Gilles Gouaillardet via devel wrote:
> >
> > Tim,
> >
> > Have you tried using LD_LIBRARY_PATH?
> > I guess "hardcoding the full path" means "link with -rpath", and IIRC,
> > LD_LIBRARY_PATH overrides this setting.
> >
> >
> > If this does not work, here is something you can try (disclaimer: I
> > did not)
> >
> > export LD_LIBRARY_PATH=/same/install/prefix/ucx/1.9.0/lib
> > configure ... --with-ucx \
> >     CPPFLAGS=-I/same/install/prefix/ucx/1.9.0/include \
> >     LDFLAGS=-L/same/install/prefix/ucx/1.9.0/lib
> >
> > I expect the UCX components use libuct.so instead of
> > /same/install/prefix/ucx/1.9.0/lib/libuct.so.
> > If your users want the debug version, then you can simply change your
> > LD_LIBRARY_PATH
> > (module swap ucx should do the trick)
> >
> > Three caveats you should keep in mind:
> > - it is your responsibility to ensure the debug and prod versions of
> > UCX are ABI compatible
> > - it will be mandatory to load a ucx module (otherwise Open MPI won't
> > find UCX libraries)
> > - this is a guess and I did not test this.
> >
> >
> > Another option (I did not try) would be to install UCX on your build
> > machine in /usr
> > (since I expect /usr/lib/libuct.so is not hardcoded) and then use
> > LD_LIBRARY_PATH
> > (I assume your ucx module set it) to point to the UCX flavor of your
> > choice.
> >
> > Cheers,
> >
> > Gilles
> >
> > On Mon, Jan 25, 2021 at 7:43 AM Tim Mattox via devel wrote:
> > >
> > > I'm specifically
Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?
My environment modules were already setting LD_LIBRARY_PATH to point
to my UCX lib directory.

The real problem was that OMPI's config/ompi_check_ucx.m4 was
recording the full path to the UCX library if it wasn't found in a
standard system location (e.g. /lib, /lib64, /usr/lib, etc.).
That is normally a good thing to do, since the chances that the
average mpirun user will have set up their LD_LIBRARY_PATH correctly
are lower than the chances that the software installer did (I hope).
In my case however, I'm setting up the environment modules to enforce
the LD_LIBRARY_PATH to have an ABI compatible UCX available.

I found two ways to override ompi_check_ucx.m4 to just let the linker
find the UCX libraries on its own:
1) Modify UCX's lib/pkgconfig/ucx.pc to have "prefix = /usr" so that
ompi_check_ucx.m4 would think UCX was in a standard system location
(which is a lie).
2) Patch the OMPI configure system to allow me to force it to not
hard-code the path to UCX into the OMPI .so files such as
libmca_common_ucx.so

I chose the latter, since there might be other users of my UCX
installs, and I didn't want to possibly break them by having
pkg-config lie.
And, I was already having to patch an OMPI config/*.m4 file for a
different reason, and was thus already having to run "./autogen.pl -f"
anyway.
So, with the patch below, I can now do an OMPI configure line with
"... --with-ucx=from_runtime_env ..." and the resulting build will use
whatever UCX is found at link time.
Normally, that would be scary, since who knows what ancient version of
UCX might be sitting around from the Linux distro, but my environment
modules are enforcing the dependencies so that my UCX libraries will
be found, even if a user does a "module swap". Hey, a cool benefit of
lmod or modules v4+, but I digress.

Anyway, if there is some other OMPI sanctioned way to do this, please
let me know, or suggest a better way so I can upstream a patch. I
have vague memories of being able to force this kind of behavior by
doing something like "configure ... --with-ucx=/usr ...", but I
couldn't find any documentation for that, and code inspection
of the m4 files revealed that such a feature (if it was an intended
feature) had bitrotted.

--- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-25 18:23:17.112499399 -0600
+++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-25 20:25:15.919338784 -0600
@@ -41,7 +41,7 @@
             [ompi_check_ucx_dir=])],
         [true])])
     ompi_check_ucx_happy="no"
-    AS_IF([test -z "$ompi_check_ucx_dir"],
+    AS_IF([test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"],
         [OPAL_CHECK_PACKAGE([ompi_check_ucx],
             [ucp/api/ucp.h],
             [ucp],

Oddly, the Open MPI configure script already has overrides for
ucx_CFLAGS, ucx_LIBS, and ucx_STATIC_LIBS, but nothing for something
like "ucx_LDFLAGS=''".
I didn't see a simple way to add support for such an override without
some more extensive changes to multiple m4 files.

On Sun, Jan 24, 2021 at 7:08 PM Gilles Gouaillardet via devel wrote:
>
> Tim,
>
> Have you tried using LD_LIBRARY_PATH?
> I guess "hardcoding the full path" means "link with -rpath", and IIRC,
> LD_LIBRARY_PATH overrides this setting.
>
>
> If this does not work, here is something you can try (disclaimer: I
> did not)
>
> export LD_LIBRARY_PATH=/same/install/prefix/ucx/1.9.0/lib
> configure ... --with-ucx \
>     CPPFLAGS=-I/same/install/prefix/ucx/1.9.0/include \
>     LDFLAGS=-L/same/install/prefix/ucx/1.9.0/lib
>
> I expect the UCX components use libuct.so instead of
> /same/install/prefix/ucx/1.9.0/lib/libuct.so.
> If your users want the debug version, then you can simply change your
> LD_LIBRARY_PATH
> (module swap ucx should do the trick)
>
> Three caveats you should keep in mind:
> - it is your responsibility to ensure the debug and prod versions of
> UCX are ABI compatible
> - it will be mandatory to load a ucx module (otherwise Open MPI won't
> find UCX libraries)
> - this is a guess and I did not test this.
>
>
> Another option (I did not try) would be to install UCX on your build
> machine in /usr
> (since I expect /usr/lib/libuct.so is not hardcoded) and then use
> LD_LIBRARY_PATH
> (I assume your ucx module set it) to point to the UCX flavor of your
> choice.
>
> Cheers,
>
> Gilles
>
> On Mon, Jan 25, 2021 at 7:43 AM Tim Mattox via devel wrote:
> >
> > I'm specifically wanting my users to be able to load a "debug" vs.
> > "tuned" UCX module, without me having to make two different Open MPI
> > installs... the combinatorics get bad after a few versions (I'm
> > already having multiple versions of Open MPI to handle the differences
> > in Fortran mpi mod files for various compilers.)
> > Here are the differences in the configure options between the two UCX
> >
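
The "module swap" approach discussed in this thread depends on directory order in LD_LIBRARY_PATH: once no path is hard-coded, the first directory that contains the library wins. The sketch below models only that one step in plain POSIX shell. It deliberately ignores DT_RPATH/DT_RUNPATH, the ld.so cache, and the default system directories, and it skips empty path entries rather than treating them as the current directory. The function name first_match_in_path and the library name used in the usage line are illustrative assumptions.

```shell
# first_match_in_path LIB PATHLIST (hypothetical helper)
# Print the first "$dir/LIB" that exists, walking the colon-separated
# PATHLIST left to right; return 1 if no directory contains LIB.
first_match_in_path() {
    lib=$1
    rest=$2
    while [ -n "$rest" ]; do
        dir=${rest%%:*}              # text before the first colon
        case $rest in
            *:*) rest=${rest#*:} ;;  # drop that entry and its colon
            *)   rest= ;;            # that was the last entry
        esac
        [ -n "$dir" ] || continue    # skip empty entries (simplification)
        if [ -e "$dir/$lib" ]; then
            printf '%s\n' "$dir/$lib"
            return 0
        fi
    done
    return 1
}
```

For example, `first_match_in_path libucp.so.0 "$LD_LIBRARY_PATH"` run before and after a module swap would show which UCX directory now comes first. For the authoritative answer on a real install, `ldd` on the Open MPI UCX component reports what the dynamic loader actually resolves.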