Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-01-26 Thread Gilles Gouaillardet via devel

Tim,


a simple option is to

configure ... LDFLAGS="-Wl,--enable-new-dtags"


If Open MPI is built with this option, then LD_LIBRARY_PATH takes
precedence over rpath
(the default is the opposite, as correctly pointed out by Yossi in an
earlier message).
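
For background: with GNU ld, --enable-new-dtags makes the linker record the search path as DT_RUNPATH rather than DT_RPATH, and the dynamic loader consults LD_LIBRARY_PATH before DT_RUNPATH but after DT_RPATH. One way to see which tag a build ended up with is to look at the dynamic section of an installed library. The helper below is only a sketch and is not from this thread: the function name classify_search_tag is made up, and it merely classifies text in the layout that `readelf -d` (binutils) prints.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): classify `readelf -d`
# output read from stdin as RUNPATH, RPATH, or none.
classify_search_tag() {
    awk '
        /\(RUNPATH\)/ { runpath = 1 }
        /\(RPATH\)/   { rpath = 1 }
        END {
            # When both tags are present the loader ignores DT_RPATH,
            # so RUNPATH is reported first.
            if (runpath)    print "RUNPATH"   # LD_LIBRARY_PATH is searched first
            else if (rpath) print "RPATH"     # searched before LD_LIBRARY_PATH
            else            print "none"
        }
    '
}

# Usage sketch; the library location is an assumption about the install layout:
#   readelf -d "$PREFIX/lib/libmca_common_ucx.so" | classify_search_tag
```

If this reports RPATH for the UCX components, a `module swap` that only changes LD_LIBRARY_PATH would not be expected to change which UCX gets loaded.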



Cheers,


Gilles


Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-01-26 Thread Tim Mattox via devel
Well, now it is a multi-line patch, and it is more hacky... but this
works for me.  Suggestions for a better way to upstream this
functionality would be appreciated.

--- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-26 11:13:55.753451526 -0600
+++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-26 11:25:56.738822308 -0600
@@ -26,7 +26,8 @@
   [AC_ARG_WITH([ucx],
 [AC_HELP_STRING([--with-ucx(=DIR)],
 [Build with Unified Communication X library support])])
-OPAL_CHECK_WITHDIR([ucx], [$with_ucx], [include/ucp/api/ucp.h])
+AS_IF([test "$with_ucx" != "from_runtime_env"],
+  [OPAL_CHECK_WITHDIR([ucx], [$with_ucx], [include/ucp/api/ucp.h])])
 AC_ARG_WITH([ucx-libdir],
 [AC_HELP_STRING([--with-ucx-libdir=DIR],
 [Search for Unified Communication X libraries in DIR])])
@@ -41,7 +42,7 @@
 [ompi_check_ucx_dir=])],
  [true])])
   ompi_check_ucx_happy="no"
-  AS_IF([test -z "$ompi_check_ucx_dir"],
+  AS_IF([test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"],
 [OPAL_CHECK_PACKAGE([ompi_check_ucx],
[ucp/api/ucp.h],
[ucp],
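
Since AS_IF expands to ordinary shell `if test ...` code in the generated configure script, the two conditions touched by this patch can be tried out in isolation. The following is only a sketch of that logic under stated assumptions: the function names are invented for illustration and do not exist in Open MPI, and /opt/ucx/1.9.0 is a placeholder path.

```shell
#!/bin/sh
# Hypothetical stand-ins for the two patched tests (names are made up).

# First hunk: run the directory sanity check unless the magic value is given.
should_check_withdir() {
    test "$1" != "from_runtime_env"
}

# Second hunk: take the branch that searches for UCX without an explicit
# directory when the directory is empty or is the magic value.
use_default_search() {
    test -z "$1" || test "$1" = "from_runtime_env"
}

# Try an empty value, an explicit (placeholder) prefix, and the magic value.
for value in "" "/opt/ucx/1.9.0" "from_runtime_env"; do
    if should_check_withdir "$value"; then withdir=yes; else withdir=no; fi
    if use_default_search "$value"; then search=default; else search=explicit; fi
    printf 'value=[%s] check_withdir=%s search=%s\n' "$value" "$withdir" "$search"
done
```

Only an explicit directory takes the "explicit" branch, which is the case where, per the discussion in this thread, the full UCX path ends up recorded in the Open MPI libraries.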


Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-01-26 Thread Tim Mattox via devel
Ugh, apparently my one-line patch to
openmpi-4.1.0/config/ompi_check_ucx.m4 wasn't sufficient on a fresh
install... debugging...


Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-01-26 Thread Tim Mattox via devel
My environment modules were already setting LD_LIBRARY_PATH to point
to my UCX lib directory.

The real problem was that OMPI's config/ompi_check_ucx.m4 was
recording the full path to the UCX library if it wasn't found in a
standard system location (e.g. /lib, /lib64, /usr/lib, etc.).
That is normally a good thing to do, since the average mpirun user is
less likely to have set up their LD_LIBRARY_PATH correctly than the
software installer is (I hope).
In my case, however, I'm setting up the environment modules to enforce
that LD_LIBRARY_PATH has an ABI-compatible UCX available.

I found two ways to override ompi_check_ucx.m4 to just let the linker
find the UCX libraries on its own:
1) Modify UCX's lib/pkgconfig/ucx.pc to have "prefix = /usr" so that
ompi_check_ucx.m4 would think UCX was in a standard system location
(which is a lie).
2) Patch the OMPI configure system to allow me to force it to not
hard-code the path to UCX into the OMPI .so files such as
libmca_common_ucx.so

I chose the latter, since there might be other users of my UCX
installs, and I didn't want to possibly break them by having
pkg-config lie.
And, I was already having to patch an OMPI config/*.m4 file for a
different reason, and was thus already having to run "./autogen.pl -f"
anyway.
So, with the patch below, I can now do an OMPI configure line with
"... --with-ucx=from_runtime_env ..." and the resulting build will use
whatever UCX is found at link time.
Normally, that would be scary, since who knows what ancient version of
UCX might be sitting around from the Linux distro, but my environment
modules are enforcing the dependencies so that my UCX libraries will
be found, even if a user does a "module swap".  Hey, a cool benefit of
lmod or modules v4+, but I digress.
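
To confirm which UCX a given module environment actually resolves, the `ldd` output of one of the UCX components can be inspected after each "module swap". The helper below is a sketch with an invented name (resolved_lib); it only parses text in the usual `name => path (address)` layout that glibc's `ldd` prints, and the path in the usage comment is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: print the path that ldd-style output on stdin
# reports for the first library whose name starts with $1.
resolved_lib() {
    awk -v lib="$1" '
        index($1, lib) == 1 && $2 == "=>" {
            # ldd prints "name => not found" for unresolved libraries
            if ($3 == "not") print "not found"; else print $3
            found = 1
            exit
        }
        END { if (!found) print "not found" }
    '
}

# Usage sketch; the library location is an assumption about the install layout:
#   ldd "$PREFIX/lib/libmca_common_ucx.so" | resolved_lib libucp.so
```

Running this once with the "debug" UCX module loaded and once with the "tuned" one should print two different paths if the build really leaves the choice to the runtime environment.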

Anyway, if there is some other OMPI-sanctioned way to do this, please
let me know, or suggest a better way so I can upstream a patch.  I
have vague memories of being able to force this kind of behavior by
doing something like "configure ... --with-ucx=/usr ...", but I
couldn't find any documentation for that, and code inspection of the
m4 files revealed that such a feature (if it was an intended feature)
had bitrotted.

--- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-25 18:23:17.112499399 -0600
+++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-25 20:25:15.919338784 -0600
@@ -41,7 +41,7 @@
 [ompi_check_ucx_dir=])],
  [true])])
   ompi_check_ucx_happy="no"
-  AS_IF([test -z "$ompi_check_ucx_dir"],
+  AS_IF([test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"],
 [OPAL_CHECK_PACKAGE([ompi_check_ucx],
[ucp/api/ucp.h],
[ucp],

Oddly, the Open MPI configure script already has overrides for
ucx_CFLAGS, ucx_LIBS, and ucx_STATIC_LIBS, but nothing for something
like "ucx_LDFLAGS=''".
I didn't see a simple way to add support for such an override without
some more extensive changes to multiple m4 files.

On Sun, Jan 24, 2021 at 7:08 PM Gilles Gouaillardet via devel
 wrote:
>
> Tim,
>
> Have you tried using LD_LIBRARY_PATH?
> I guess "hardcoding the full path" means "link with -rpath", and IIRC,
> LD_LIBRARY_PATH overrides this setting.
>
>
> If this does not work, here is something you can try (disclaimer: I did not):
>
> export LD_LIBRARY_PATH=/same/install/prefix/ucx/1.9.0/lib
> configure ... --with-ucx \
>     CPPFLAGS=-I/same/install/prefix/ucx/1.9.0/include \
>     LDFLAGS=-L/same/install/prefix/ucx/1.9.0/lib
>
> I expect the UCX components use libuct.so instead of
> /same/install/prefix/ucx/1.9.0/lib/libuct.so.
> If your users want the debug version, then you can simply change your
> LD_LIBRARY_PATH (module swap ucx should do the trick).
>
> Three caveats you should keep in mind:
>  - it is your responsibility to ensure the debug and prod versions of
> UCX are ABI compatible
>  - it will be mandatory to load a ucx module (otherwise Open MPI won't
> find UCX libraries)
>  - this is a guess and I did not test this.
>
>
> Another option (I did not try it) would be to install UCX on your build
> machine in /usr (since I expect /usr/lib/libuct.so is not hardcoded)
> and then use LD_LIBRARY_PATH (I assume your ucx module sets it) to
> point to the UCX flavor of your choice.
>
> Cheers,
>
> Gilles
>
> On Mon, Jan 25, 2021 at 7:43 AM Tim Mattox via devel
>  wrote:
> >
> > I'm specifically wanting my users to be able to load a "debug" vs.
> > "tuned" UCX module, without me having to make two different Open MPI
> > installs... the combinatorics get bad after a few versions (I'm
> > already having multiple versions of Open MPI to handle the differences
> > in Fortran mpi mod files for various compilers.)
> > Here are the differences in the configure options between the two UCX 
> >