On Jul 22, 2009, at 10:55 AM, Brian W. Barrett wrote:

The current autodetect implementation seems like the wrong approach to me. I'm rather unhappy the base functionality was hacked up like it was without any advanced notice or questions about original design intent. We seem to have a set of base functions which are now more unreadable than before, overly complex, and which leak memory.

First, I should apologize for my procedural misstep. I had asked my group here at Sun whether I should do an RFC or something, and I guess I didn't make my question clear enough. I was under the impression that I should check something in and let people comment on it.

That being said, I would argue that there are good reasons for adding some complexity to the base component:

1. The pre-existing implementation of expansion is broken (although the cases for which it is broken are somewhat obscure).

2. The autodetect component cannot set more than one directory without some knowledge of the relationships amongst the various directories. Giving it that knowledge would violate the independence of the components.

You can see #1 by doing "OPAL_PREFIX='${datarootdir}/..' OPAL_DATAROOTDIR='/opt/SUNWhpc/HPC8.2/share' mpirun hostname" (if you have installed in /opt/SUNWhpc/HPC8.2). Yes, I know, "Why would anyone do that?" Nonetheless, I find that a poor excuse for having a bug in the code.

To expand on #2 a little, consider the case where OpenMPI is configured with "--prefix=/path/one --libdir=/path/two". We can tell that libopen-pal is in /path/two, but it is not correct to assume that the prefix is therefore /path. (Unfortunately, there is code in OpenMPI that does make that sort of assumption -- see orterun.c.) We need information from the config component to avoid making incorrect assumptions.

There are, of course, alternate ways of getting to the same point. But it is not feasible simply to leave the design of the base component unchanged. (More on that below.)

As for readability, I am always open to constructive suggestions as to how to make code more readable. I didn't fix the memory leak because I hadn't yet found a way to do that without decreasing readability.

The intent of the installdirs framework was to allow this type of behavior, but without rehacking all this infer crap into base. The autodetect component should just set $prefix in the set of functions it returns (and possibly libdir and bindir if you really want, which might actually make sense if you guess wrong), and let the expansion code take over from there. The general thought on how this would work went something like:

- Run after config
- If determine you have a special $prefix, set the
  opal_instal_dirs.prefix to NULL (yes, it's a bit of a hack) and
  set your special prefix.
- Same with bindir and libdir if needed
- Let expansion (which runs after all components have had the
  chance to fill in their fields) expand out with your special
  data

If we run the autodetect component after config, and allow it to override values that are already in opal_install_dirs, then there will be no way for users to have environment variables take precedence. (Let's say someone runs with OPAL_LIBDIR=/foo. The autodetect component will not know whether opal_install_dirs.libdir has been set by the env component or by the config component.)

Moreover, if the user has used an environment variable to override one of the paths in the config component, then the autodetect component may make the wrong inference. For example, let's say someone runs with OPAL_LIBDIR=/foo. The autodetect component finds libopen-pal in / usr/renamed/lib, and sets opal_install_dirs.libdir to /usr/renamed/ lib. However, it has to use the config component's idea of libdir (e.g., ${exec_prefix}/lib) to correctly infer that prefix should be / usr/renamed. Since it only has /foo from the environment variable, it will have to decide that it cannot infer the prefix.

All of this will lead to behavior that users will have trouble diagnosing. While I appreciate simple code, I think that a simple user interface is more important.

We could add some infrastructure so that the autodetect component can figure out the provenance of each field in opal_install_dirs, but that would make the boundary between the base component and the autodetect component unclear.

And the base stays simple, the components do all the heavy lifting, and life is happy.

Except in the cases where it doesn't work.

I would not be opposed to putting in a "find expaneded part" type function that takes two strings like "${prefix}/lib" and "/usr/local/ lib" and returns "/usr/local" being added to the base so that other autodetect-style components don't need to handle such a case, but that's about the extent of the base changes I think are appropriate.

Finally, a first quick code review reveals a couple of problems:

- We don't AC_SUBST variables adding .lo files to build sources in
  OMPI.  Instead, we use AM_CONDITIONALS to add sources as needed.
- Obviously, there's a problem with the _happy variable name
  consistency in configure.m4
- There's a naming convention issue - files should all start with
  opal_installdirs_autodetect_, and a number of the added files
  do not.

Thanks.  I will fix those.

- From a finding code standpoint, I'd rather walkcontext.c and
  backtrace.c be one file with #ifs - for such short functions,
  it makes it more obvious that it's just portability implementations
  of the same function.

I'm afraid we'll just have to disagree on the aesthetics of this. I find that #ifs almost always make code harder to follow.

However, I'm willing to hold my nose if necessary.

I'd be happy to discuss the issues further or review any code before it gets committed. But I think the changes as they exist today (even with bugs fixed) aren't consistent with what the installdirs framework was trying to accomplish and should be reworked.

Let's discuss this off-alias. I have set the reply-to field accordingly.

Iain

Reply via email to