Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-02-01 Thread Tim Mattox via devel
FYI - I wasn’t bothered by the default behavior... I was just looking for a
sanctioned way for an installer (e.g. a sysadmin) to have UCX loaded
based on LD_LIBRARY_PATH, so that the user could swap
in a debug build of UCX at runtime.

On Mon, Feb 1, 2021 at 10:14 AM Peter Kjellström via devel <
devel@lists.open-mpi.org> wrote:

> On Mon, 1 Feb 2021 14:46:22 +
> "Jeff Squyres (jsquyres) via devel" wrote:
>
> > On Jan 27, 2021, at 7:19 PM, Gilles Gouaillardet 
> > wrote:
> > >
> > > What I meant is the default Linux behavior is to first lookup
> > > dependencies in the rpath, and then fallback to LD_LIBRARY_PATH
> > > *unless* -Wl,--enable-new-dtags was used at link time.
> > >
> > > In the case of Open MPI, -Wl,--enable-new-dtags is added to the MPI
> > > wrappers, but Open MPI is *not* built with this option.
> >
> > Oh, I see where I got confused: Open MPI (core and DSO components) is
> > built with -rpath, but not --enable-new-dtags.
> >
> > Hmm.  ...trying to remember why we would have made that choice...
> >
> > I don't see any obvious reason cited in the git history.  Do you
> > remember?
>
> Many centers deploying OpenMPI on clusters probably DONT want runpath
> instead of rpath (what new-dtags does).
>
> For many "protection" against LD_LIBRARY_PATH in user environments is
> the whole point.
>
> In my opinion -rpath that doesn't win over LD_LIBRARY_PATH is near
> useless.
>
> my .02€
>  Peter
>
-- 
Tim Mattox (Sent from Gmail Mobile)


Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-02-01 Thread Peter Kjellström via devel
On Mon, 1 Feb 2021 14:46:22 +
"Jeff Squyres (jsquyres) via devel" wrote:

> On Jan 27, 2021, at 7:19 PM, Gilles Gouaillardet 
> wrote:
> > 
> > What I meant is the default Linux behavior is to first lookup
> > dependencies in the rpath, and then fallback to LD_LIBRARY_PATH
> > *unless* -Wl,--enable-new-dtags was used at link time.
> > 
> > In the case of Open MPI, -Wl,--enable-new-dtags is added to the MPI
> > wrappers, but Open MPI is *not* built with this option.  
> 
> Oh, I see where I got confused: Open MPI (core and DSO components) is
> built with -rpath, but not --enable-new-dtags.
> 
> Hmm.  ...trying to remember why we would have made that choice...
>
> I don't see any obvious reason cited in the git history.  Do you
> remember?

Many centers deploying Open MPI on clusters probably DON'T want runpath
instead of rpath (which is what new-dtags gives you).

For many, "protection" against LD_LIBRARY_PATH in user environments is
the whole point.

In my opinion, an -rpath that doesn't win over LD_LIBRARY_PATH is near
useless.

my .02€
 Peter


Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-02-01 Thread Jeff Squyres (jsquyres) via devel
On Jan 27, 2021, at 7:19 PM, Gilles Gouaillardet  wrote:
> 
> What I meant is the default Linux behavior is to first lookup dependencies in 
> the rpath, and then fallback to LD_LIBRARY_PATH
> *unless* -Wl,--enable-new-dtags was used at link time.
> 
> In the case of Open MPI, -Wl,--enable-new-dtags is added to the MPI wrappers,
> but Open MPI is *not* built with this option.

Oh, I see where I got confused: Open MPI (core and DSO components) is built 
with -rpath, but not --enable-new-dtags.

Hmm.  ...trying to remember why we would have made that choice...

I don't see any obvious reason cited in the git history.  Do you remember?

> That means, that by default, mca_pml_ucx.so and friends will get libuc?.so 
> libraries at runtime from rpath
> (and that cannot be overridden by LD_LIBRARY_PATH).

Gotcha.

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-01-27 Thread Gilles Gouaillardet via devel

Jeff,


What I meant is that the default Linux behavior is to first look up
dependencies in the rpath, and then fall back to LD_LIBRARY_PATH,
*unless* -Wl,--enable-new-dtags was used at link time.

In the case of Open MPI, -Wl,--enable-new-dtags is added to the MPI
wrappers, but Open MPI itself is *not* built with this option.

That means that, by default, mca_pml_ucx.so and friends will get the
libuc?.so libraries at runtime from the rpath
(and that cannot be overridden by LD_LIBRARY_PATH).
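
Whether a given component carries DT_RPATH or DT_RUNPATH can be checked
from its dynamic section. A minimal sketch, assuming binutils' readelf is
available; dtag_kind is a hypothetical helper name, and the component path
in the usage comment is a placeholder:

```shell
# dtag_kind: read `readelf -d` output on stdin and print which search-path
# tag the object carries: "runpath", "rpath", or "none".
# (dtag_kind is a hypothetical helper, not part of Open MPI or binutils.)
dtag_kind() {
  awk '
    /\(RUNPATH\)/ { runpath = 1 }
    /\(RPATH\)/   { rpath = 1 }
    END {
      # If both tags are present, the loader ignores DT_RPATH,
      # so runpath is reported in that case.
      if (runpath) { print "runpath" }
      else if (rpath) { print "rpath" }
      else { print "none" }
    }'
}

# Example use (OMPI_PREFIX is an assumed variable holding the install prefix):
#   readelf -d "$OMPI_PREFIX/lib/openmpi/mca_pml_ucx.so" | dtag_kind
```

A result of "rpath" means the recorded path is searched before
LD_LIBRARY_PATH; "runpath" means it is searched after it.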


Cheers,


Gilles


On 1/28/2021 12:52 AM, Jeff Squyres (jsquyres) wrote:
On Jan 27, 2021, at 2:00 AM, Gilles Gouaillardet via devel
<devel@lists.open-mpi.org> wrote:


Tim,

a simple option is to

configure ... LDFLAGS="-Wl,--enable-new-dtags"


If Open MPI is built with this option, then LD_LIBRARY_PATH takes 
precedence over rpath


(the default is the opposite as correctly pointed by Yossi in an 
earlier message)


Are you sure about the default?  I just did a default Open MPI v4.1.0 
build on Linux with gcc 8.x:


$ mpicc --showme
gcc -I/home/jsquyres/bogus/include -pthread -Wl,-rpath 
-Wl,/home/jsquyres/bogus/lib -Wl,--enable-new-dtags 
-L/home/jsquyres/bogus/lib -lmpi


--
Jeff Squyres
jsquy...@cisco.com 



Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-01-27 Thread Tim Mattox via devel
Thank you for the suggestion of 'configure ...
LDFLAGS="-Wl,--enable-new-dtags"'.
I'm still reading up on its meaning, but wouldn't that change the
behavior across all dependencies that are dynamically linked when I
build Open MPI?

I was specifically wanting *just* these UCX .so files to be
dynamically linked from generic paths.
libucp.so.0
libuct.so.0
libucm.so.0
libucs.so.0

In my patched Open MPI 4.1.0 install, these files (well, the first is
a symbolic link to the real file) are the ones that needed the change:
lib/libmca_common_ucx.so
lib/openmpi/mca_atomic_ucx.so
lib/openmpi/mca_osc_ucx.so
lib/openmpi/mca_pml_ucx.so
lib/openmpi/mca_spml_ucx.so
lib/openmpi/mca_sshmem_ucx.so

My hacky patch to config/ompi_check_ucx.m4 did the trick for me.
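
To confirm which UCX libraries those files resolve to in the current
environment (for example before and after a "module swap"), the ldd output
can be filtered. A sketch, assuming ldd is available; ucx_resolved is a
hypothetical helper name and OMPI_PREFIX stands for the install prefix:

```shell
# ucx_resolved: read `ldd` output on stdin and print "name path" for each
# UCX library (libucp, libuct, libucm, libucs) in the list, or
# "name not found" when the loader could not resolve it.
# (ucx_resolved is a hypothetical helper name.)
ucx_resolved() {
  awk '$1 ~ /^libuc[ptms]\.so/ && $2 == "=>" {
         print $1, ($3 == "not" ? "not found" : $3)
       }'
}

# Example use (OMPI_PREFIX is an assumed variable holding the install prefix):
#   for f in "$OMPI_PREFIX"/lib/libmca_common_ucx.so \
#            "$OMPI_PREFIX"/lib/openmpi/mca_*_ucx.so; do
#     echo "== $f"
#     ldd "$f" | ucx_resolved
#   done
```

If the printed paths follow LD_LIBRARY_PATH, the swap is taking effect; if
they stay pinned to the configure-time location, a hard-coded path is still
winning.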

On Wed, Jan 27, 2021 at 10:54 AM Jeff Squyres (jsquyres) via devel
 wrote:
>
> On Jan 27, 2021, at 2:00 AM, Gilles Gouaillardet via devel 
>  wrote:
>
> Tim,
>
> a simple option is to
>
> configure ... LDFLAGS="-Wl,--enable-new-dtags"
>
>
> If Open MPI is built with this option, then LD_LIBRARY_PATH takes precedence 
> over rpath
>
> (the default is the opposite as correctly pointed by Yossi in an earlier 
> message)
>
>
> Are you sure about the default?  I just did a default Open MPI v4.1.0 build 
> on Linux with gcc 8.x:
>
> $ mpicc --showme
> gcc -I/home/jsquyres/bogus/include -pthread -Wl,-rpath 
> -Wl,/home/jsquyres/bogus/lib -Wl,--enable-new-dtags 
> -L/home/jsquyres/bogus/lib -lmpi
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>


-- 
Tim Mattox, Ph.D. - tmat...@gmail.com


Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-01-27 Thread Jeff Squyres (jsquyres) via devel
On Jan 27, 2021, at 2:00 AM, Gilles Gouaillardet via devel
<devel@lists.open-mpi.org> wrote:

> Tim,
>
> a simple option is to
>
> configure ... LDFLAGS="-Wl,--enable-new-dtags"
>
> If Open MPI is built with this option, then LD_LIBRARY_PATH takes precedence
> over rpath
> (the default is the opposite, as correctly pointed out by Yossi in an earlier
> message)

Are you sure about the default?  I just did a default Open MPI v4.1.0 build on 
Linux with gcc 8.x:

$ mpicc --showme
gcc -I/home/jsquyres/bogus/include -pthread -Wl,-rpath -Wl,/home/jsquyres/bogus/lib -Wl,--enable-new-dtags -L/home/jsquyres/bogus/lib -lmpi

--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-01-26 Thread Gilles Gouaillardet via devel

Tim,


a simple option is to

configure ... LDFLAGS="-Wl,--enable-new-dtags"


If Open MPI is built with this option, then LD_LIBRARY_PATH takes
precedence over rpath
(the default is the opposite, as correctly pointed out by Yossi in an
earlier message).



Cheers,


Gilles

On 1/27/2021 2:48 AM, Tim Mattox via devel wrote:

Well, now it is a multi-line patch, and it is more hacky... but this
works for me.  Suggestions for a better thing to do to upstream this
functionality would be appreciated.

--- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-26
11:13:55.753451526 -0600
+++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-26 11:25:56.738822308 -0600
@@ -26,7 +26,8 @@
[AC_ARG_WITH([ucx],
  [AC_HELP_STRING([--with-ucx(=DIR)],
  [Build with Unified Communication X library support])])
-OPAL_CHECK_WITHDIR([ucx], [$with_ucx], [include/ucp/api/ucp.h])
+AS_IF([test "$with_ucx" != "from_runtime_env"],
+  [OPAL_CHECK_WITHDIR([ucx], [$with_ucx], [include/ucp/api/ucp.h])])
  AC_ARG_WITH([ucx-libdir],
  [AC_HELP_STRING([--with-ucx-libdir=DIR],
  [Search for Unified Communication X libraries in DIR])])
@@ -41,7 +42,7 @@
  [ompi_check_ucx_dir=])],
   [true])])
ompi_check_ucx_happy="no"
-  AS_IF([test -z "$ompi_check_ucx_dir"],
+  AS_IF([test -z "$ompi_check_ucx_dir" || test
"$ompi_check_ucx_dir" = "from_runtime_env"],
  [OPAL_CHECK_PACKAGE([ompi_check_ucx],
 [ucp/api/ucp.h],
 [ucp],

On Tue, Jan 26, 2021 at 12:04 PM Tim Mattox  wrote:

Ugh, apparently my one-line patch to
openmpi-4.1.0/config/ompi_check_ucx.m4 wasn't sufficient on a fresh
install... debugging...

On Tue, Jan 26, 2021 at 10:16 AM Tim Mattox  wrote:

My environment modules were already setting LD_LIBRARY_PATH to point
to my UCX lib directory.

The real problem was that OMPI's config/ompi_check_ucx.m4 was
recording the full path to the UCX library if it wasn't found in a
standard system location (e.g. /lib, /lib64, /usr/lib, etc.).
That is normally a good thing to do, since the chances that the
average mpirun user will have setup their LD_LIBRARY_PATH is lower
than the software installer would have done so correctly (I hope).
In my case however, I'm setting up the environment modules to enforce
the LD_LIBRARY_PATH to have an ABI compatible UCX available.

I found two ways to override ompi_check_ucx.m4 to just let the linker
find the UCX libraries on its own:
1) Modify UCX's lib/pkgconfig/ucx.pc to have "prefix = /usr" so that
ompi_check_ucx.m4 would think UCX was in a standard system location
(which is a lie).
2) Patch the OMPI configure system to allow me to force it to not
hard-code the path to UCX into the OMPI .so files such as
libmca_common_ucx.so

I chose the latter, since there might be other users of my UCX
installs, and I didn't want to possibly break them by having
pkg_config lie.
And, I was already having to patch an OMPI config/*.m4 file for a
different reason, and was thus already having to run "./autogen.pl -f"
anyway.
So, with the patch below, I can now do an OMPI configure line with
"... --with-ucx=from_runtime_env ..." and the resulting build will use
whatever UCX is found at link time.
Normally, that would be scary, since who knows what ancient version of
UCX might be sitting around from the Linux distro, but my environment
modules are enforcing the dependencies so that my UCX libraries will
be found, even if a user does a "module swap".  Hey, a cool benefit of
lmod or modules v4+, but I digress.

Anyway, if there is some other OMPI sanctioned way to do this, please
let me know, or suggest a better way so I can upstream a patch.  I
have vague memories of being able to force this kind of behavior by
doing  something like "configure ... --with-ucx=/usr ...", but I
couldn't find any documentation for that, and in doing code inspection
of the m4 files revealed that such a feature (if it was an intended
feature) had bitrotted.

--- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-25
18:23:17.112499399 -0600
+++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-25 20:25:15.919338784 -0600
@@ -41,7 +41,7 @@
  [ompi_check_ucx_dir=])],
   [true])])
ompi_check_ucx_happy="no"
-  AS_IF([test -z "$ompi_check_ucx_dir"],
+  AS_IF([test -z "$ompi_check_ucx_dir" || test
"$ompi_check_ucx_dir" = "from_runtime_env"],
  [OPAL_CHECK_PACKAGE([ompi_check_ucx],
 [ucp/api/ucp.h],
 [ucp],

Oddly, the Open MPI configure script already has overrides for
ucx_CFLAGS, ucx_LIBS, and ucx_STATIC_LIBS, but nothing for 

Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-01-26 Thread Tim Mattox via devel
Well, now it is a multi-line patch, and it is more hacky... but this
works for me.  Suggestions for a better thing to do to upstream this
functionality would be appreciated.

--- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-26 11:13:55.753451526 -0600
+++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-26 11:25:56.738822308 -0600
@@ -26,7 +26,8 @@
   [AC_ARG_WITH([ucx],
 [AC_HELP_STRING([--with-ucx(=DIR)],
 [Build with Unified Communication X library support])])
-OPAL_CHECK_WITHDIR([ucx], [$with_ucx], [include/ucp/api/ucp.h])
+AS_IF([test "$with_ucx" != "from_runtime_env"],
+  [OPAL_CHECK_WITHDIR([ucx], [$with_ucx], [include/ucp/api/ucp.h])])
 AC_ARG_WITH([ucx-libdir],
 [AC_HELP_STRING([--with-ucx-libdir=DIR],
 [Search for Unified Communication X libraries in DIR])])
@@ -41,7 +42,7 @@
 [ompi_check_ucx_dir=])],
  [true])])
   ompi_check_ucx_happy="no"
-  AS_IF([test -z "$ompi_check_ucx_dir"],
+  AS_IF([test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"],
 [OPAL_CHECK_PACKAGE([ompi_check_ucx],
[ucp/api/ucp.h],
[ucp],

On Tue, Jan 26, 2021 at 12:04 PM Tim Mattox  wrote:
>
> Ugh, apparently my one-line patch to
> openmpi-4.1.0/config/ompi_check_ucx.m4 wasn't sufficient on a fresh
> install... debugging...
>
> On Tue, Jan 26, 2021 at 10:16 AM Tim Mattox  wrote:
> >
> > My environment modules were already setting LD_LIBRARY_PATH to point
> > to my UCX lib directory.
> >
> > The real problem was that OMPI's config/ompi_check_ucx.m4 was
> > recording the full path to the UCX library if it wasn't found in a
> > standard system location (e.g. /lib, /lib64, /usr/lib, etc.).
> > That is normally a good thing to do, since the chances that the
> > average mpirun user will have setup their LD_LIBRARY_PATH is lower
> > than the software installer would have done so correctly (I hope).
> > In my case however, I'm setting up the environment modules to enforce
> > the LD_LIBRARY_PATH to have an ABI compatible UCX available.
> >
> > I found two ways to override ompi_check_ucx.m4 to just let the linker
> > find the UCX libraries on its own:
> > 1) Modify UCX's lib/pkgconfig/ucx.pc to have "prefix = /usr" so that
> > ompi_check_ucx.m4 would think UCX was in a standard system location
> > (which is a lie).
> > 2) Patch the OMPI configure system to allow me to force it to not
> > hard-code the path to UCX into the OMPI .so files such as
> > libmca_common_ucx.so
> >
> > I chose the latter, since there might be other users of my UCX
> > installs, and I didn't want to possibly break them by having
> > pkg_config lie.
> > And, I was already having to patch an OMPI config/*.m4 file for a
> > different reason, and was thus already having to run "./autogen.pl -f"
> > anyway.
> > So, with the patch below, I can now do an OMPI configure line with
> > "... --with-ucx=from_runtime_env ..." and the resulting build will use
> > whatever UCX is found at link time.
> > Normally, that would be scary, since who knows what ancient version of
> > UCX might be sitting around from the Linux distro, but my environment
> > modules are enforcing the dependencies so that my UCX libraries will
> > be found, even if a user does a "module swap".  Hey, a cool benefit of
> > lmod or modules v4+, but I digress.
> >
> > Anyway, if there is some other OMPI sanctioned way to do this, please
> > let me know, or suggest a better way so I can upstream a patch.  I
> > have vague memories of being able to force this kind of behavior by
> > doing  something like "configure ... --with-ucx=/usr ...", but I
> > couldn't find any documentation for that, and in doing code inspection
> > of the m4 files revealed that such a feature (if it was an intended
> > feature) had bitrotted.
> >
> > --- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-25
> > 18:23:17.112499399 -0600
> > +++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-25 20:25:15.919338784 
> > -0600
> > @@ -41,7 +41,7 @@
> >  
> > [ompi_check_ucx_dir=])],
> >   [true])])
> >ompi_check_ucx_happy="no"
> > -  AS_IF([test -z "$ompi_check_ucx_dir"],
> > +  AS_IF([test -z "$ompi_check_ucx_dir" || test
> > "$ompi_check_ucx_dir" = "from_runtime_env"],
> >  [OPAL_CHECK_PACKAGE([ompi_check_ucx],
> > [ucp/api/ucp.h],
> > [ucp],
> >
> > Oddly, the Open MPI configure script already has overrides for
> > ucx_CFLAGS, ucx_LIBS, and ucx_STATIC_LIBS, but nothing for something
> > like "ucx_LDFLAGS=''".
> > I didn't see a simple way to add support for such an 

Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-01-26 Thread Tim Mattox via devel
Ugh, apparently my one-line patch to
openmpi-4.1.0/config/ompi_check_ucx.m4 wasn't sufficient on a fresh
install... debugging...

On Tue, Jan 26, 2021 at 10:16 AM Tim Mattox  wrote:
>
> My environment modules were already setting LD_LIBRARY_PATH to point
> to my UCX lib directory.
>
> The real problem was that OMPI's config/ompi_check_ucx.m4 was
> recording the full path to the UCX library if it wasn't found in a
> standard system location (e.g. /lib, /lib64, /usr/lib, etc.).
> That is normally a good thing to do, since the chances that the
> average mpirun user will have setup their LD_LIBRARY_PATH is lower
> than the software installer would have done so correctly (I hope).
> In my case however, I'm setting up the environment modules to enforce
> the LD_LIBRARY_PATH to have an ABI compatible UCX available.
>
> I found two ways to override ompi_check_ucx.m4 to just let the linker
> find the UCX libraries on its own:
> 1) Modify UCX's lib/pkgconfig/ucx.pc to have "prefix = /usr" so that
> ompi_check_ucx.m4 would think UCX was in a standard system location
> (which is a lie).
> 2) Patch the OMPI configure system to allow me to force it to not
> hard-code the path to UCX into the OMPI .so files such as
> libmca_common_ucx.so
>
> I chose the latter, since there might be other users of my UCX
> installs, and I didn't want to possibly break them by having
> pkg_config lie.
> And, I was already having to patch an OMPI config/*.m4 file for a
> different reason, and was thus already having to run "./autogen.pl -f"
> anyway.
> So, with the patch below, I can now do an OMPI configure line with
> "... --with-ucx=from_runtime_env ..." and the resulting build will use
> whatever UCX is found at link time.
> Normally, that would be scary, since who knows what ancient version of
> UCX might be sitting around from the Linux distro, but my environment
> modules are enforcing the dependencies so that my UCX libraries will
> be found, even if a user does a "module swap".  Hey, a cool benefit of
> lmod or modules v4+, but I digress.
>
> Anyway, if there is some other OMPI sanctioned way to do this, please
> let me know, or suggest a better way so I can upstream a patch.  I
> have vague memories of being able to force this kind of behavior by
> doing  something like "configure ... --with-ucx=/usr ...", but I
> couldn't find any documentation for that, and in doing code inspection
> of the m4 files revealed that such a feature (if it was an intended
> feature) had bitrotted.
>
> --- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-25
> 18:23:17.112499399 -0600
> +++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-25 20:25:15.919338784 -0600
> @@ -41,7 +41,7 @@
>  [ompi_check_ucx_dir=])],
>   [true])])
>ompi_check_ucx_happy="no"
> -  AS_IF([test -z "$ompi_check_ucx_dir"],
> +  AS_IF([test -z "$ompi_check_ucx_dir" || test
> "$ompi_check_ucx_dir" = "from_runtime_env"],
>  [OPAL_CHECK_PACKAGE([ompi_check_ucx],
> [ucp/api/ucp.h],
> [ucp],
>
> Oddly, the Open MPI configure script already has overrides for
> ucx_CFLAGS, ucx_LIBS, and ucx_STATIC_LIBS, but nothing for something
> like "ucx_LDFLAGS=''".
> I didn't see a simple way to add support for such an override without
> some more extensive changes to multiple m4 files.
>
> On Sun, Jan 24, 2021 at 7:08 PM Gilles Gouaillardet via devel
>  wrote:
> >
> > Tim,
> >
> > Have you tried using LD_LIBRARY_PATH?
> > I guess "hardcoding the full path" means "link with -rpath", and IIRC,
> > LD_LIBRARY_PATH
> > overrides this setting.
> >
> >
> > If this does not work, here something you can try (disclaimer: I did not)
> >
> > export LD_LIBRARY_PATH=/same/install/prefix/ucx/1.9.0/lib
> > configure ... --with-ucx
> > CPPFLAGS=-I/same/install/prefix/ucx/1.9.0/include
> > LDFLAGS=-L/same/install/prefix/ucx/1.9.0/lib
> >
> > I expect the UCX components use libuct.so instead of
> > /same/install/prefix/ucx/1.9.0/lib/libuct.so.
> > If your users want the debug version, then you can simply change your
> > LD_LIBRARY_PATH
> > (module swap ucx should do the trick)
> >
> > Three caveats you should keep in mind:
> >  - it is your responsibility to ensure the debug and prod versions of
> > UCX are ABI compatible
> >  - it will be mandatory to load a ucx module (otherwise Open MPI won't
> > find UCX libraries)
> >  - this is a guess and I did not test this.
> >
> >
> > An other option (I did not try) would be to install UCX on your build
> > machine in /usr
> > (since I expect /usr/lib/libuct.so is not hardcoded) and then use
> > LD_LIBRARY_PATH
> > (I assume your ucx module set it) to point to the UCX flavor of your 
> > choice).
> >
> > Cheers,
> >
> > Gilles
> >
> > On Mon, Jan 25, 2021 at 7:43 AM Tim Mattox via devel
> >  wrote:
> > >
> > > I'm specifically 

Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-01-26 Thread Tim Mattox via devel
My environment modules were already setting LD_LIBRARY_PATH to point
to my UCX lib directory.

The real problem was that OMPI's config/ompi_check_ucx.m4 was
recording the full path to the UCX library if it wasn't found in a
standard system location (e.g. /lib, /lib64, /usr/lib, etc.).
That is normally a good thing to do, since the average mpirun user is
less likely to have set up their LD_LIBRARY_PATH correctly than the
software installer is (I hope).
In my case however, I'm setting up the environment modules to enforce
the LD_LIBRARY_PATH to have an ABI compatible UCX available.

I found two ways to override ompi_check_ucx.m4 to just let the linker
find the UCX libraries on its own:
1) Modify UCX's lib/pkgconfig/ucx.pc to have "prefix = /usr" so that
ompi_check_ucx.m4 would think UCX was in a standard system location
(which is a lie).
2) Patch the OMPI configure system to allow me to force it to not
hard-code the path to UCX into the OMPI .so files such as
libmca_common_ucx.so

I chose the latter, since there might be other users of my UCX
installs, and I didn't want to possibly break them by having
pkg-config lie.
And, I was already having to patch an OMPI config/*.m4 file for a
different reason, and was thus already having to run "./autogen.pl -f"
anyway.
So, with the patch below, I can now do an OMPI configure line with
"... --with-ucx=from_runtime_env ..." and the resulting build will use
whatever UCX the dynamic linker finds at run time.
Normally, that would be scary, since who knows what ancient version of
UCX might be sitting around from the Linux distro, but my environment
modules are enforcing the dependencies so that my UCX libraries will
be found, even if a user does a "module swap".  Hey, a cool benefit of
lmod or modules v4+, but I digress.

Anyway, if there is some other OMPI-sanctioned way to do this, please
let me know, or suggest a better way so I can upstream a patch.  I
have vague memories of being able to force this kind of behavior by
doing something like "configure ... --with-ucx=/usr ...", but I
couldn't find any documentation for that, and code inspection
of the m4 files revealed that such a feature (if it was an intended
feature) had bitrotted.

--- openmpi-4.1.0/config/ompi_check_ucx.m4.orig 2021-01-25 18:23:17.112499399 -0600
+++ openmpi-4.1.0/config/ompi_check_ucx.m4 2021-01-25 20:25:15.919338784 -0600
@@ -41,7 +41,7 @@
 [ompi_check_ucx_dir=])],
  [true])])
   ompi_check_ucx_happy="no"
-  AS_IF([test -z "$ompi_check_ucx_dir"],
+  AS_IF([test -z "$ompi_check_ucx_dir" || test "$ompi_check_ucx_dir" = "from_runtime_env"],
 [OPAL_CHECK_PACKAGE([ompi_check_ucx],
[ucp/api/ucp.h],
[ucp],

Oddly, the Open MPI configure script already has overrides for
ucx_CFLAGS, ucx_LIBS, and ucx_STATIC_LIBS, but nothing for something
like "ucx_LDFLAGS=''".
I didn't see a simple way to add support for such an override without
some more extensive changes to multiple m4 files.

On Sun, Jan 24, 2021 at 7:08 PM Gilles Gouaillardet via devel
 wrote:
>
> Tim,
>
> Have you tried using LD_LIBRARY_PATH?
> I guess "hardcoding the full path" means "link with -rpath", and IIRC,
> LD_LIBRARY_PATH
> overrides this setting.
>
>
> If this does not work, here something you can try (disclaimer: I did not)
>
> export LD_LIBRARY_PATH=/same/install/prefix/ucx/1.9.0/lib
> configure ... --with-ucx
> CPPFLAGS=-I/same/install/prefix/ucx/1.9.0/include
> LDFLAGS=-L/same/install/prefix/ucx/1.9.0/lib
>
> I expect the UCX components use libuct.so instead of
> /same/install/prefix/ucx/1.9.0/lib/libuct.so.
> If your users want the debug version, then you can simply change your
> LD_LIBRARY_PATH
> (module swap ucx should do the trick)
>
> Three caveats you should keep in mind:
>  - it is your responsibility to ensure the debug and prod versions of
> UCX are ABI compatible
>  - it will be mandatory to load a ucx module (otherwise Open MPI won't
> find UCX libraries)
>  - this is a guess and I did not test this.
>
>
> An other option (I did not try) would be to install UCX on your build
> machine in /usr
> (since I expect /usr/lib/libuct.so is not hardcoded) and then use
> LD_LIBRARY_PATH
> (I assume your ucx module set it) to point to the UCX flavor of your choice).
>
> Cheers,
>
> Gilles
>
> On Mon, Jan 25, 2021 at 7:43 AM Tim Mattox via devel
>  wrote:
> >
> > I'm specifically wanting my users to be able to load a "debug" vs.
> > "tuned" UCX module, without me having to make two different Open MPI
> > installs... the combinatorics get bad after a few versions (I'm
> > already having multiple versions of Open MPI to handle the differences
> > in Fortran mpi mod files for various compilers.)
> > Here are the differences in the configure options between the two UCX 
> > 

Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-01-24 Thread Gilles Gouaillardet via devel
Tim,

Have you tried using LD_LIBRARY_PATH?
I guess "hardcoding the full path" means "link with -rpath", and IIRC,
LD_LIBRARY_PATH
overrides this setting.


If this does not work, here is something you can try (disclaimer: I did not)

export LD_LIBRARY_PATH=/same/install/prefix/ucx/1.9.0/lib
configure ... --with-ucx \
    CPPFLAGS=-I/same/install/prefix/ucx/1.9.0/include \
    LDFLAGS=-L/same/install/prefix/ucx/1.9.0/lib

I expect the UCX components use libuct.so instead of
/same/install/prefix/ucx/1.9.0/lib/libuct.so.
If your users want the debug version, then you can simply change your
LD_LIBRARY_PATH
(module swap ucx should do the trick)

Three caveats you should keep in mind:
 - it is your responsibility to ensure the debug and prod versions of
UCX are ABI compatible
 - it will be mandatory to load a ucx module (otherwise Open MPI won't
find UCX libraries)
 - this is a guess and I did not test this.


Another option (I did not try it) would be to install UCX on your build
machine in /usr
(since I expect /usr/lib/libuct.so is not hardcoded) and then use
LD_LIBRARY_PATH
(I assume your ucx module sets it) to point to the UCX flavor of your choice.

Cheers,

Gilles

On Mon, Jan 25, 2021 at 7:43 AM Tim Mattox via devel
 wrote:
>
> I'm specifically wanting my users to be able to load a "debug" vs.
> "tuned" UCX module, without me having to make two different Open MPI
> installs... the combinatorics get bad after a few versions (I'm
> already having multiple versions of Open MPI to handle the differences
> in Fortran mpi mod files for various compilers.)
> Here are the differences in the configure options between the two UCX modules:
> debug version: --enable-logging --enable-debug --enable-assertions
> --enable-params-check --prefix=/same/install/prefix/ucx/1.9.0/debug
> tuned version: --disable-logging --disable-debug --disable-assertions
> --disable-params-check --prefix=/same/install/prefix/ucx/1.9.0/tuned
>
> We noticed that the --enable-debug option for UCX has a pretty
> dramatic performance hit for one application (so far).
> I've already tested that everything works fine if I replace UCX's .so
> files manually in the filesystem, and the "new/changed" ones get
> loaded, but a user can't make that kind of swap.
> My hope is a user could type "module swap ucx/1.9.0/tuned
> ucx/1.9.0/debug" when they want to enable debugging at the UCX layer.
>
> On Sun, Jan 24, 2021 at 4:43 PM Yossi Itigin  wrote:
> >
> > Hi,
> >
> > One option is to use LD_PRELOAD to load all ucx libraries from a specific 
> > location
> > For example: mpirun -x 
> > LD_PRELOAD=:::
> >  ...  
> >
> > BTW, what is different about the other UCX configuration? Maybe this is 
> > something which can be resolved another way.
> >
> > --Yossi
> >
> > -Original Message-
> > From: devel  On Behalf Of Tim Mattox via 
> > devel
> > Sent: Sunday, 24 January 2021 23:18
> > To: devel@lists.open-mpi.org
> > Cc: Tim Mattox 
> > Subject: [OMPI devel] How to build Open MPI so the UCX used can be changed 
> > at runtime?
> >
> > Hello,
> > I've run into an application that has its performance dramatically affected 
> > by some configuration options to the underlying UCX library.
> > Is there a way to configure/build Open MPI so that which UCX library is 
> > used is determined at runtime (e.g. by an environment module), rather than 
> > having to configure/build different instances of Open MPI?
> >
> > When I configure Open MPI 4.1.0 with "--with-ucx" it is hardcoding the full 
> > path to the UCX .so library files to the UCX version it found at configure 
> > time.
> > --
> > Tim Mattox, Ph.D. - tmat...@gmail.com
>
>
>
> --
> Tim Mattox, Ph.D. - tmat...@gmail.com


Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-01-24 Thread Tim Mattox via devel
I'm specifically wanting my users to be able to load a "debug" vs.
"tuned" UCX module, without me having to make two different Open MPI
installs... the combinatorics get bad after a few versions (I'm
already having multiple versions of Open MPI to handle the differences
in Fortran mpi mod files for various compilers.)
Here are the differences in the configure options between the two UCX modules:
debug version: --enable-logging --enable-debug --enable-assertions
--enable-params-check --prefix=/same/install/prefix/ucx/1.9.0/debug
tuned version: --disable-logging --disable-debug --disable-assertions
--disable-params-check --prefix=/same/install/prefix/ucx/1.9.0/tuned

We noticed that the --enable-debug option for UCX has a pretty
dramatic performance hit for one application (so far).
I've already tested that everything works fine if I replace UCX's .so
files manually in the filesystem, and the "new/changed" ones get
loaded, but a user can't make that kind of swap.
My hope is a user could type "module swap ucx/1.9.0/tuned
ucx/1.9.0/debug" when they want to enable debugging at the UCX layer.

On Sun, Jan 24, 2021 at 4:43 PM Yossi Itigin  wrote:
>
> Hi,
>
> One option is to use LD_PRELOAD to load all ucx libraries from a specific 
> location
> For example: mpirun -x 
> LD_PRELOAD=:::
>  ...  
>
> BTW, what is different about the other UCX configuration? Maybe this is 
> something which can be resolved another way.
>
> --Yossi
>
> -Original Message-
> From: devel  On Behalf Of Tim Mattox via 
> devel
> Sent: Sunday, 24 January 2021 23:18
> To: devel@lists.open-mpi.org
> Cc: Tim Mattox 
> Subject: [OMPI devel] How to build Open MPI so the UCX used can be changed at 
> runtime?
>
> Hello,
> I've run into an application that has its performance dramatically affected 
> by some configuration options to the underlying UCX library.
> Is there a way to configure/build Open MPI so that which UCX library is used 
> is determined at runtime (e.g. by an environment module), rather than having 
> to configure/build different instances of Open MPI?
>
> When I configure Open MPI 4.1.0 with "--with-ucx" it is hardcoding the full 
> path to the UCX .so library files to the UCX version it found at configure 
> time.
> --
> Tim Mattox, Ph.D. - tmat...@gmail.com
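
The LD_PRELOAD approach quoted above can be sketched as follows. This is
only an illustration: ucx_preload is a hypothetical helper name, the
library names are the four UCX .so files named elsewhere in this thread,
their ordering is an assumption, and the directory and application name in
the usage comment are placeholders:

```shell
# ucx_preload DIR: print a colon-separated LD_PRELOAD value naming the four
# UCX libraries under DIR.
# (ucx_preload is a hypothetical helper; the library order is an assumption.)
ucx_preload() {
  dir=$1
  list=
  for lib in libucs.so.0 libucm.so.0 libuct.so.0 libucp.so.0; do
    # Append with a ":" separator only when the list is already non-empty.
    list="${list:+$list:}$dir/$lib"
  done
  printf '%s\n' "$list"
}

# Example use (directory and ./app are placeholders):
#   mpirun -x LD_PRELOAD="$(ucx_preload /same/install/prefix/ucx/1.9.0/debug/lib)" ./app
```

Unlike an LD_LIBRARY_PATH swap, this forces the listed files to be loaded
regardless of any rpath recorded in the Open MPI components, so ABI
compatibility between the UCX builds still has to be ensured by hand.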



-- 
Tim Mattox, Ph.D. - tmat...@gmail.com