I had a look at the VT integration branch today. I was surprised by
a few things; I think we can do a little better on separation of VT
from the rest of OMPI. But this also touches on the larger general
concept of how we want to bundle other software packages in Open
MPI. Here are a few recommendations:
- VT is MPI-level tracing and therefore needs to be in the ompi/ tree
(not the top-level tree). That being said, we extensively discussed
libnbc integration here in Paris this week and think that it falls in
the same category as VT: both are 3rd-party, MPI-level software
packages that are being bundled in OMPI. Hence, perhaps we should
really have an ompi/contrib directory that holds both vt and libnbc
(and any other 3rd-party/bundled MPI-related software). More on this
below.
- I notice that the VT configure script calls "wget" to check if
there is a more recent version of VT available, and if so, downloads/
expands/uses it. This is a very, very bad idea for at least the
following reasons:
  - What if my system doesn't have wget? (OS X, Solaris)
  - What if my system doesn't have / has broken internet access?
  - What if I don't want the VT maintainers tracking which systems I
    install OMPI/VT on? (web server accesses generate log entries for
    the originating machines)
  - What if I want to use exactly the version that is bundled in
    Open MPI and not another? (this is very, very important for QA
    certification, for example)
  - What if the VT web site is no longer available?
If the VT developers would like to keep the "call home" functionality
in the production version of VT, fine -- but it needs to be
guaranteed to be completely and totally deactivated when shipping
with OMPI. As you can probably tell, I feel very strongly about
this. :-)
- I confess to having the subject of VT integration swapped out so
many times that I don't remember exactly how we agreed to do it :-(.
I see that the current integration added a .m4 file in config/,
acinclude'd it, and then called a setup macro from the main
configure.ac. I was a little confused as to why some VT things were
added in config/ompi_configure_options.m4 and others were added to
the VT-specific .m4 file -- why not put them all in the same place?
- I also found the mpi*-vt wrapper compilers in ompi/tools. I guess
I was surprised that VT had spread out over so many directories -- I
had really thought it would be entirely self-contained in its own
tree and only have a small top-level stub that tied it into the
overall configure/build system.
- Since we're now bundling two 3rd-party software packages with Open
MPI, I think we need a general solution for how to add/maintain them
(and others) over time. OMPI already has a fairly nice
find-configure-build system for components; it seems natural that
3rd-party packages should use a similar system.
- However, taking a brief look at autogen.sh and config/ompi_mca.m4,
it looks like it would be a *major* undertaking to do this (i.e.,
"discover" packages under ompi/contrib/ and set them up to configure/
build like we do with components). I unfortunately do not have
anything close to the cycles required to do this work. If anyone
else wants to do this work, I would caution you that this is
extremely hairy Bourne shell and m4 code that the whole of OMPI
depends on -- I will be VERY picky about how it is modified :-) (not
trying to be a jerk here, but this code is pretty close to the heart
of OMPI's configure/build system -- breaking it will result in many
unhappy developers!).
- Therefore, I'm leaning towards filing a ticket to someday do this
stuff properly, but in the meantime, have a pseudo-hardcoded setup
for libnbc and VT in configure.ac. That is, I envision (but have not
thought out the details -- all of the following is subject to
change...) that VT and libnbc will each have a configure.m4 script
that is added to acinclude.m4. Each configure.m4 will AC_DEFUN
well-known macros that are called directly from configure.ac. These
macros can do whatever the package wants/needs up in the top-level
OMPI configure script. The top-level OMPI script will then invoke
OMPI_CONFIG_SUBDIR to call the package's configure script.
Specifically, the package will get 2 chances for configuration stuff:
  * its configure.m4 script, for OMPI-bundling-level glue code
    (e.g., AC_CONFIG_FILES for the wrapper compiler data files)
  * its configure script, for configuring/setting up to build the
    real package when recursively invoked by top-level OMPI "make"
    targets
- I imagine that the trees for these packages will look like this:
  ompi/contrib/<pkg>/ - top-level tree of OMPI bundling of the package
  ompi/contrib/<pkg>/configure.m4 - acinclude'd in OMPI's configure.ac
  ompi/contrib/<pkg>/... - any other relevant glue files (README,
      wrapper compiler template files, etc.)
  ompi/contrib/<pkg>/<pkg>/ - expanded tarball of the package
  ompi/contrib/<pkg>/<pkg>/configure - package's distribution
      configure script
  ompi/contrib/<pkg>/<pkg>/... - rest of the files from the package
Note the 2nd subdir -- contrib/<pkg>/<pkg>/ -- this is where the
actual package distribution is placed (NOT in the top-level
contrib/<pkg>/ directory). As I've mentioned before, this is for two
reasons: a) it allows a separate directory [namespace] for the
OMPI-specific package configure.m4 script (and/or other files), and
b) it *greatly* simplifies bringing in new versions if what is put in
OMPI is *exactly* the same as what is distributed independently of
OMPI. There is also precedent for this type of directory layout for a
3rd-party package in OMPI already -- ROMIO (ompi/mca/io/romio).
To be specific, here's an example with libnbc and vt:
  ompi/contrib/vt/
  ompi/contrib/vt/configure.m4
  ompi/contrib/vt/...other files (such as wrapper compiler data
      templates)
  ompi/contrib/vt/vt/
  ompi/contrib/vt/vt/configure
  ompi/contrib/vt/vt/...other files (from VT tarball)

  ompi/contrib/libnbc/
  ompi/contrib/libnbc/configure.m4
  ompi/contrib/libnbc/...other files (such as wrapper compiler data
      templates)
  ompi/contrib/libnbc/libnbc/
  ompi/contrib/libnbc/libnbc/configure
  ompi/contrib/libnbc/libnbc/...other files (from libnbc tarball)
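As a rough illustration of the layout above, here is a small shell
sketch that creates the empty skeleton for one package. The function
name make_contrib_skeleton is made up for this example (nothing in
OMPI provides it), and the empty configure.m4 it creates is only a
placeholder for the real glue script:

```shell
# Hypothetical helper (not part of OMPI): create the proposed two-level
# layout for one bundled package.
# Usage: make_contrib_skeleton <ompi-top-dir> <pkg>
make_contrib_skeleton() {
    top=$1
    pkg=$2

    # Outer directory: OMPI-specific glue (configure.m4, README,
    # wrapper compiler templates, ...).
    mkdir -p "$top/ompi/contrib/$pkg" || return 1

    # Inner directory: where the unmodified distribution tarball is
    # expanded.
    mkdir -p "$top/ompi/contrib/$pkg/$pkg" || return 1

    # Placeholder for the glue script; touch leaves an existing file's
    # contents alone.
    touch "$top/ompi/contrib/$pkg/configure.m4"
}
```

For example, "make_contrib_skeleton /path/to/ompi-trunk vt" would
create ompi/contrib/vt/, an empty ompi/contrib/vt/configure.m4, and
ompi/contrib/vt/vt/ ready to receive the expanded VT tarball.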
Make sense?
I started the infrastructure work in /tmp/htor-nbc; I might be able
to finish it by the end of this week. This should make the VT/libnbc
integration quite simple, I think. The bulk of the work will be two
things:
1. Create a configure.m4 that does whatever the VT integration
needs/wants (e.g., some AC_ARG_WITH / AC_ARG_ENABLE and
AC_CONFIG_FILES statements, perhaps some AC_MSG_CHECKING calls, or
whatever) as part of the main OMPI configure script.
2. Untar the distribution tarball in ompi/contrib/<pkg>/<pkg>.
Again, I want to emphasize the point that it *greatly* simplifies
future version upgrades if what is in ompi/contrib/<pkg>/<pkg> is
*exactly* the distribution tarball with zero modifications. The
best-case scenario/goal is to be able to do the following when
importing a new version of libnbc into OMPI:
shell$ cd ompi/contrib/libnbc
shell$ svn rm libnbc
shell$ svn ci -m "Remove old version of libnbc"
shell$ tar zxf /path/to/libNBC-1.2.3.tar.gz
shell$ mv libNBC-1.2.3 libnbc
shell$ svn add libnbc
shell$ svn ci libnbc -m "Upgrade libnbc to v1.2.3"
That should be *all* that a 3rd-party package (vt, libnbc) needs to
do -- it should not require modifying any other files in the OMPI tree.
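One way to sanity check the "zero modifications" goal is to diff the
imported tree against a fresh expansion of the tarball. The sketch
below is only an illustration: check_pristine is a hypothetical
helper, it assumes the tarball expands to a single top-level
directory, and it assumes a tar that accepts -C and a diff that
accepts -x (the GNU and BSD versions do).

```shell
# Hypothetical helper (not part of OMPI): compare an imported package
# tree against its upstream tarball.
# Usage: check_pristine <tarball.tar.gz> <imported-dir>
# Returns 0 if identical, 1 if they differ, 2 on setup trouble.
check_pristine() {
    tarball=$1
    imported=$2

    scratch=$(mktemp -d) || return 2

    # Expand the upstream tarball into a scratch directory.
    tar zxf "$tarball" -C "$scratch" || { rm -rf "$scratch"; return 2; }

    # Assumption: the tarball expands to exactly one top-level
    # directory (e.g., libNBC-1.2.3).
    expanded=$(ls -d "$scratch"/*)

    # Recursive comparison, ignoring Subversion metadata directories.
    diff -r -x .svn "$expanded" "$imported"
    status=$?

    rm -rf "$scratch"
    return $status
}
```

For example, "check_pristine /path/to/libNBC-1.2.3.tar.gz
ompi/contrib/libnbc/libnbc" prints any differences and returns
non-zero if the imported tree has drifted from the tarball.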
How does this all sound?
I realize that we have iterated on this a few times already and I'm
sorry for the changes. I think that this loose idea has been in my
head all along, but I probably have not properly elucidated it until
now. My apologies. :-\
--
Jeff Squyres
Cisco Systems