Yo all,

I have been working on adding/clarifying support for several environments and have encountered a problem that appears to be fairly common out there: machines that have, over the course of history or for specific reasons, installed libraries to support multiple environments. For example, I can readily find machines that are running TM but also have LSF and SLURM libraries installed. Those environments are not "active"; the libraries are in some cases old and stale, usually present either because someone wanted to look at them or because they are left over from an old installation.

The problem is that our Open MPI build system automatically detects the presence of those libraries, builds the corresponding components, and then links those libraries into our system. Unfortunately, this causes two side effects:

1. We wind up building and loading a bunch of components that we cannot use, which impacts memory footprint.

2. Not every component in every framework runs some library function to determine whether that environment is actually active. Hence, our selection logic can sometimes get confused by conflicting priorities, resulting in the selection of components that cause the system to crash.

A couple of solutions come immediately to mind:

1. The most obvious one (to me, at least) is to require that people provide "--with-xx" when they build the system. Instead of automatically detecting an include file and library, and then deciding that the existence of those files dictates that we build support for that environment, we would only build support for those environments that the builder specifies, and error out of the build process if multiple conflicting environments are specified. This raises the issue of what to do with rsh, but I think we can handle that one by simply building it wherever possible.

2. We could laboriously go through all the components and ensure that they check in their selection logic whether that environment is active. This still causes libraries to be loaded for nothing, but keeps the automatic nature of the build system. We would have to deal with those environments that may not have a "safe" function we can call to see if they are "alive", or that have old/stale libraries whose APIs may behave differently, but perhaps those are few enough to not be a big problem.

Any thoughts on this? It seems like we should solve this, as it is becoming more prevalent (at least on the machines I test on).

Ralph
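
P.S. To make the second solution above (each component checking whether its environment is active) a bit more concrete, here is a rough sketch of the selection idea in Python rather than in our C component code. It is only an illustration under stated assumptions: it uses the job-ID environment variables that the schedulers normally export inside an allocation (SLURM_JOB_ID/SLURM_JOBID, LSB_JOBID, PBS_JOBID) as a stand-in for a "safe" liveness call, and the names ACTIVE_MARKERS, active_environments and select_component are hypothetical, not anything in our tree.

```python
import os

# Assumption: inside an active allocation each resource manager exports a
# job-ID variable. The mapping below is illustrative, not Open MPI code.
ACTIVE_MARKERS = {
    "slurm": ("SLURM_JOB_ID", "SLURM_JOBID"),
    "lsf": ("LSB_JOBID",),
    "tm": ("PBS_JOBID",),
}


def active_environments(environ=None):
    """Return the environments whose marker variable is set and non-empty."""
    env = os.environ if environ is None else environ
    return [
        name
        for name, markers in ACTIVE_MARKERS.items()
        if any(env.get(marker) for marker in markers)
    ]


def select_component(built, environ=None, fallback="rsh"):
    """Pick the one built component whose environment is actually active.

    `built` lists the components that were compiled in, i.e. whose
    libraries happened to be installed on the build machine.
    """
    active = [name for name in active_environments(environ) if name in built]
    if len(active) > 1:
        # Refuse to guess between conflicting environments.
        raise RuntimeError("multiple active environments: " + ", ".join(active))
    # No active resource manager: fall back to rsh, which is always built.
    return active[0] if active else fallback
```

With all of tm, lsf and slurm built, a machine that is really running TM would then select "tm" no matter which stale libraries are lying around, and a machine with no active resource manager would fall back to rsh. It does nothing about the memory footprint of the unused components, which is the part only the first solution addresses.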