On Mon, 9 Jul 2007, Ralph Castain wrote:
For example, I can readily find machines that are running TM, but also have LSF and SLURM libraries installed (although those environments are not "active" - the libraries in some cases are old and stale, usually present because either someone wanted to look at them or represent an old installation).
Whatever the outcome of this discussion is, please keep in mind that this represents an exception rather than the rule. So the common cases of no batch environment or one batch environment installed should work as effortlessly as possible. Furthermore, keep in mind that there are lots of people who don't compile Open MPI themselves, but rely on packages compiled by others (Linux distributions, most likely), so don't make life harder for those who produce these packages.
1. ... we would only build support for those environments that the builder specifies, and error out of the build process if multiple conflicting environments are specified.
I think that Ralf's suggestion (auto unless forced) is better, as it allows:
- a better chance of finding the environments for people who don't have too much experience with building Open MPI or hate to RTFM
- control over what is built or not for people who know what they are doing
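To make "auto unless forced" concrete, here is a rough sketch of the decision in Python rather than in configure's shell. The function name, the yes/no/auto values and the component names are made up for illustration; they are not Open MPI's actual configure options.

```python
def resolve_support(requested, detected):
    """Decide which launcher components to build.

    requested: dict mapping a component name to "yes", "no" or "auto"
               (hypothetical builder choices; anything not listed is "auto")
    detected:  set of component names whose libraries were found at build time
    """
    build = []
    for name in sorted(set(requested) | set(detected)):
        choice = requested.get(name, "auto")
        if choice == "no":
            # Explicitly disabled: skip it even if the library is present.
            continue
        if choice == "yes":
            # Explicitly forced: a missing library is a hard error.
            if name not in detected:
                raise RuntimeError(name + " support was forced but not found")
            build.append(name)
        elif name in detected:
            # "auto": build it only when it was found.
            build.append(name)
    return build
```

With nothing requested, everything that is found gets built, which covers the inexperienced builder. Someone who knows that the LSF and SLURM libraries on the machine are stale can pass "no" for those two and get only TM.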
This raises the issue of what to do with rsh, but I think we can handle that one by simply building it wherever possible.
I've been meaning to ask this for some time: is it possible to get rid of rsh support when building/running in an environment where rsh is not used (like a TM-based one)? I'm not trying to achieve security by doing this (after all, a user can build a separate copy of Open MPI with rsh support), but just to make sure that the programs that I build are either using the "blessed" start-up mechanism or error out.
2. We could laboriously go through all the components and ensure that they check in their selection logic to see if that environment is active.
I might be missing something in the design of batch systems or software in general, but how do you decide whether an environment is active or not? Can a library check if it's being used in a program? Or if that program actually runs? And if a configuration file exists, does it mean that the environment is actually active? How do you deal with the case where there are several versions of the same batch system installed, all using the same configuration files and therefore being ready to run? And what about the case where there is a machine reserved for compilations, where libraries are made available but there is no batch system active?
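One run-time heuristic that comes to mind is to look for the job variables that a batch system exports to processes it starts. The sketch below assumes PBS_JOBID for TM-based systems, SLURM_JOB_ID or SLURM_JOBID for SLURM, and LSB_JOBID for LSF; the table and the function name are illustrative only. It can only tell that the process was started inside a job. It says nothing at build time, on a compile-only machine, or about which of several installed versions is the live one.

```python
import os

# Assumed mapping from environment name to the variables set inside a job.
JOB_VARS = {
    "tm": ("PBS_JOBID",),
    "slurm": ("SLURM_JOB_ID", "SLURM_JOBID"),
    "lsf": ("LSB_JOBID",),
}


def active_environments(environ=None):
    """Return the sorted names of batch environments that look active.

    environ: mapping to inspect; defaults to the current process environment.
    """
    env = os.environ if environ is None else environ
    return sorted(
        name
        for name, variables in JOB_VARS.items()
        if any(env.get(var) for var in variables)  # set and non-empty
    )
```

If this returns more than one name, or none, the selection logic still has to fall back on something else, so the questions above remain open.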
-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: bogdan.coste...@iwr.uni-heidelberg.de