Re: [OMPI devel] autodetect broken

Iain Bason Wed, 22 Jul 2009 13:46:49 -0400


On Jul 22, 2009, at 10:55 AM, Brian W. Barrett wrote:

The current autodetect implementation seems like the wrong approachto me. I'm rather unhappy the base functionality was hacked up likeit was without any advanced notice or questions about originaldesign intent. We seem to have a set of base functions which are nowmore unreadable than before, overly complex, and which leak memory.

First, I should apologize for my procedural misstep. I had asked mygroup here at Sun whether I should do an RFC or something, and I guessI didn't make my question clear enough. I was under the impressionthat I should check something in and let people comment on it.

That being said, I would argue that there are good reasons for addingsome complexity to the base component:

1. The pre-existing implementation of expansion is broken (althoughthe cases for which it is broken are somewhat obscure).

2. The autodetect component cannot set more than one directory withoutsome knowledge of the relationships amongst the various directories.Giving it that knowledge would violate the independence of thecomponents.

You can see #1 by doing "OPAL_PREFIX='${datarootdir}/..'OPAL_DATAROOTDIR='/opt/SUNWhpc/HPC8.2/share' mpirun hostname" (if youhave installed in /opt/SUNWhpc/HPC8.2). Yes, I know, "Why wouldanyone do that?" Nonetheless, I find that a poor excuse for having abug in the code.

To expand on #2 a little, consider the case where OpenMPI isconfigured with "--prefix=/path/one --libdir=/path/two". We can tellthat libopen-pal is in /path/two, but it is not correct to assume thatthe prefix is therefore /path. (Unfortunately, there is code inOpenMPI that does make that sort of assumption -- see orterun.c.) Weneed information from the config component to avoid making incorrectassumptions.

There are, of course, alternate ways of getting to the same point.But it is not feasible simply to leave the design of the basecomponent unchanged. (More on that below.)

As for readability, I am always open to constructive suggestions as tohow to make code more readable. I didn't fix the memory leak becauseI hadn't yet found a way to do that without decreasing readability.

The intent of the installdirs framework was to allow this type ofbehavior, but without rehacking all this infer crap into base. Theautodetect component should just set $prefix in the set of functionsit returns (and possibly libdir and bindir if you really want, whichmight actually make sense if you guess wrong), and let the expansioncode take over from there. The general thought on how this wouldwork went something like:
- Run after config
- If determine you have a special $prefix, set the
  opal_instal_dirs.prefix to NULL (yes, it's a bit of a hack) and
  set your special prefix.
- Same with bindir and libdir if needed
- Let expansion (which runs after all components have had the
  chance to fill in their fields) expand out with your special
  data

If we run the autodetect component after config, and allow it tooverride values that are already in opal_install_dirs, then there willbe no way for users to have environment variables take precedence.(Let's say someone runs with OPAL_LIBDIR=/foo. The autodetectcomponent will not know whether opal_install_dirs.libdir has been setby the env component or by the config component.)

Moreover, if the user has used an environment variable to override oneof the paths in the config component, then the autodetect componentmay make the wrong inference. For example, let's say someone runswith OPAL_LIBDIR=/foo. The autodetect component finds libopen-pal in /usr/renamed/lib, and sets opal_install_dirs.libdir to /usr/renamed/lib. However, it has to use the config component's idea of libdir(e.g., ${exec_prefix}/lib) to correctly infer that prefix should be /usr/renamed. Since it only has /foo from the environment variable, itwill have to decide that it cannot infer the prefix.

All of this will lead to behavior that users will have troublediagnosing. While I appreciate simple code, I think that a simpleuser interface is more important.

We could add some infrastructure so that the autodetect component canfigure out the provenance of each field in opal_install_dirs, but thatwould make the boundary between the base component and the autodetectcomponent unclear.

And the base stays simple, the components do all the heavy lifting,and life is happy.


Except in the cases where it doesn't work.

I would not be opposed to putting in a "find expaneded part" typefunction that takes two strings like "${prefix}/lib" and "/usr/local/lib" and returns "/usr/local" being added to the base so that otherautodetect-style components don't need to handle such a case, butthat's about the extent of the base changes I think are appropriate.
Finally, a first quick code review reveals a couple of problems:

- We don't AC_SUBST variables adding .lo files to build sources in
  OMPI.  Instead, we use AM_CONDITIONALS to add sources as needed.
- Obviously, there's a problem with the _happy variable name
  consistency in configure.m4
- There's a naming convention issue - files should all start with
  opal_installdirs_autodetect_, and a number of the added files
  do not.


Thanks.  I will fix those.

- From a finding code standpoint, I'd rather walkcontext.c and
  backtrace.c be one file with #ifs - for such short functions,
  it makes it more obvious that it's just portability implementations
  of the same function.

I'm afraid we'll just have to disagree on the aesthetics of this. Ifind that #ifs almost always make code harder to follow.


However, I'm willing to hold my nose if necessary.

I'd be happy to discuss the issues further or review any code beforeit gets committed. But I think the changes as they exist today(even with bugs fixed) aren't consistent with what the installdirsframework was trying to accomplish and should be reworked.

Let's discuss this off-alias. I have set the reply-to fieldaccordingly.


Iain

Re: [OMPI devel] autodetect broken

Reply via email to