FWIW, Josh implemented "MCA-NULL" in
https://svn.open-mpi.org/trac/ompi/changeset/18364.
I'm not sure how I feel about this solution. On the one hand, it's
kind of a hack-ish way of solving the immediate issue. On the other
hand, it's really a larger issue of explicitly *not* setting an MCA
param (or knowing what source an MCA value originated from, depending
on how you look at it), something that we've never taken the time to
address properly. If we continue to not solve the larger issue, it's
going to come up again someday and someone will add yet another
workaround.
In both dimensions:
- I'm not entirely sure I understand the specific ORTE issue. Is it
that you want one "plm" MCA param value for mpirun and another value
for other processes (i.e., the orteds)? Or, more specifically, you want
plm X in mpirun, and *no* PLMs in the orteds?
- Would adding an enum indicating where an MCA value was retrieved
from help this situation? E.g., MCA_PARAM_ENVIRONMENT,
MCA_PARAM_FILE, MCA_PARAM_DEFAULT?
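To make the enum idea concrete, here is a minimal sketch in C. Only the three enum names come from the suggestion above; the struct, the function names, and the precedence order (environment over file over default) are assumptions for illustration, not the existing MCA parameter API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Where a parameter value came from. The three names are the ones
 * suggested above; everything else in this sketch is hypothetical. */
typedef enum {
    MCA_PARAM_DEFAULT,
    MCA_PARAM_FILE,
    MCA_PARAM_ENVIRONMENT
} mca_param_source_t;

typedef struct {
    const char *value;           /* may be NULL if only a NULL default exists */
    mca_param_source_t source;   /* which layer supplied the value */
} mca_param_lookup_t;

/* Hypothetical resolver. Assumed precedence: environment beats file,
 * file beats default. The caller passes NULL for a layer that did not
 * set the parameter (e.g. the result of getenv() for the env layer). */
static mca_param_lookup_t
resolve_param(const char *env_value, const char *file_value,
              const char *default_value)
{
    mca_param_lookup_t result;

    if (env_value != NULL) {
        result.value = env_value;
        result.source = MCA_PARAM_ENVIRONMENT;
    } else if (file_value != NULL) {
        result.value = file_value;
        result.source = MCA_PARAM_FILE;
    } else {
        result.value = default_value;
        result.source = MCA_PARAM_DEFAULT;
    }
    /* Any non-default source must carry an actual value. */
    assert(result.source == MCA_PARAM_DEFAULT || result.value != NULL);
    return result;
}

/* True if someone actually set the parameter, as opposed to the
 * value merely falling through to its default. */
static bool param_was_explicitly_set(mca_param_lookup_t lookup)
{
    return lookup.source != MCA_PARAM_DEFAULT;
}
```

With something like this, a caller could tell "the user asked for rsh" apart from "nobody said anything" without inventing a special value.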
On May 3, 2008, at 12:02 PM, George Bosilca wrote:
The problem: the orteds open all plm components before discarding most
of them, even though "--mca plm rsh" was present on the mpirun
invocation.
The non-problem: in the context of the mpirun process, only the rsh
plm is opened, as mpirun is the only process that gets the "--mca
plm rsh" information. Since this specific argument is not included in
the list of arguments we forward to the orted processes, there is no
way for the orteds to abide by the imposed restriction. Note that
if the restriction is placed in the config file, then even the
orteds respect it. So far, the only problem I can see here is that
the orteds are opening a framework that they are not supposed to (at
least not in most cases).
When we implemented the MCA filtering stuff, we proposed another
optimization: a default component for all special frameworks (i.e.,
those used or not depending on the type of process) that would be
statically linked inside the library (and therefore would not
generate any NFS traffic). Its only goal was to execute the
selection logic when any of its functions was called; in other
words, an on-demand component loading feature. From that point on, a
real component would be selected, and all further calls to the
default component would be directed to the selected component. I
perfectly remember that Ralph was completely against this feature for
two reasons: 1) all components in the ORTE framework had to be loaded,
and they would do the "if(!hnp) return NULL"; 2) he proposed to
implement the null component.
I was, and still am, against 1), so I guess that any effort toward
implementing a null or none component will have my support.
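For readers who did not follow the earlier discussion, the on-demand default component described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the module struct, the function names, and the stand-in selection routine are all hypothetical and do not correspond to actual ORTE or MCA symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical module interface for a framework such as plm. */
typedef struct {
    const char *name;
    int (*launch)(int num_procs);
} sketch_module_t;

/* Counts how often selection ran; only here to show it runs once. */
static int select_calls = 0;

/* Stand-in for a real component that would normally live in a DSO. */
static int rsh_launch(int num_procs) { return num_procs; }
static const sketch_module_t rsh_module = { "rsh", rsh_launch };

/* Stand-in for the real selection logic, which would open and query
 * the dynamically loaded components (the expensive part). */
static const sketch_module_t *run_selection(void)
{
    select_calls++;
    return &rsh_module;
}

/* The module picked by the first call through the proxy. */
static const sketch_module_t *selected = NULL;

/* Entry point of the statically linked default component: run the
 * selection on first use, then forward every call to the winner. */
static int proxy_launch(int num_procs)
{
    if (selected == NULL) {
        selected = run_selection();
        if (selected == NULL) {
            return -1;   /* nothing could be selected */
        }
    }
    assert(selected->launch != NULL);
    return selected->launch(num_procs);
}

static const sketch_module_t proxy_module = { "proxy", proxy_launch };
```

A process that never calls proxy_module.launch() never triggers run_selection(), so in this sketch nothing is opened on its behalf.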
george.
On May 2, 2008, at 4:40 PM, Josh Hursey wrote:
We could also call it 'null' for the empty set of components? Or
maybe OMPI-NULL.
Outside of the naming, do others think this is a useful feature to
implement?
-- Josh
On May 2, 2008, at 10:51 AM, Ralph Castain wrote:
I would think that adding a special keyword would be the correct
method. I would suggest something with an "ompi" in it, perhaps
capitalized so there is no confusion...something like "OMPI-NONE"?
On 5/2/08 8:37 AM, "Josh Hursey" <jjhur...@open-mpi.org> wrote:
I don't believe we have the logic in place to tell mca_component_open
'do not open anything'. (I could be wrong though).
Adding such an option might be useful, but we would have to consider
how that option should be specified by the user. Currently if you do
not set a value (leave empty space in mca-params.conf) then the MCA
system takes this to indicate that all components are eligible for
selection. If you specify any options then only those options should
be opened. We could add a special keyword (such as 'none') to indicate
'open nothing'.
What do people think about that?
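A rough sketch of the three cases just described (empty means everything, a list means only the listed components, a keyword means nothing) might look like the following. The function name is hypothetical, 'none' is only one of the candidate keywords, and the existing "^" exclusion syntax is deliberately left out; this is not the real component-open code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Candidate keyword from the proposal above; the final name is undecided. */
#define SKETCH_NONE_KEYWORD "none"

/* Returns true if component `name` may be opened, given the raw value
 * the user supplied for the framework (e.g. "rsh" or "rsh,slurm").
 * `requested` is NULL or "" when the user set nothing. */
static bool component_is_eligible(const char *requested, const char *name)
{
    size_t name_len;
    const char *p;

    assert(name != NULL);

    /* Nothing specified: every component is eligible. */
    if (requested == NULL || requested[0] == '\0') {
        return true;
    }
    /* Special keyword: open nothing at all. */
    if (strcmp(requested, SKETCH_NONE_KEYWORD) == 0) {
        return false;
    }
    /* Otherwise only components named in the comma-separated list. */
    name_len = strlen(name);
    p = requested;
    while (*p != '\0') {
        size_t tok_len = strcspn(p, ",");
        if (tok_len == name_len && strncmp(p, name, name_len) == 0) {
            return true;
        }
        p += tok_len;
        if (*p == ',') {
            p++;   /* skip the separator */
        }
    }
    return false;
}
```

One consequence of this design is that no real component could ever be named after the keyword, which is part of why the choice of name matters.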
-- Josh
On May 2, 2008, at 10:22 AM, Ralph Castain wrote:
I see what the problem is. In the case of slurm, I don't want -any-
components to be opened, even though I am going to call plm
open/select. I have to leave that logic in place for those
environments that -do- want to specify some backend secondary launcher.
So the question is: how do I tell mca_component_open "do not open
anything"?
If we don't have a mechanism for doing that, can we create one?
On 5/2/08 8:02 AM, "Ralph Castain" <r...@lanl.gov> wrote:
Well, I have a current version of the trunk. I add an MCA param to the
environment indicating that only rsh is to be used by the orted. Yet I
get an output from every orted indicating that slurm (misspelled!) is
available for selection.
This tells me that the slurm component is being opened, even though
the param is set.
I can check again to ensure that the param is set...
On 5/2/08 7:53 AM, "Jeff Squyres" <jsquy...@cisco.com> wrote:
(moving to devel list for wider audience)
Hmm. I thought the UTK stuff from a while ago supposedly changed this
behavior to only open the components that were specifically requested.
This behavior looks like the *original* MCA behavior -- open them all,
then discard what we don't want (but doesn't necessarily reclaim the
memory because of how dlclose works).
On May 2, 2008, at 9:48 AM, Ralph Castain wrote:
Yo guys
I've noticed something on the trunk that just doesn't strike me as
correct.
If I specify "-mca plm rsh", it is my expectation that (a) only the
rsh component will be opened, and (b) only the rsh module will be
selected, unless that component indicates that it cannot run.
What I am seeing, though, is that -all- the plm components are being
opened.
This is not only unnecessary, but consumes memory and leads to concern
over whether or not some other module could become active.
Is this the intended behavior? If so, may I suggest we change it in
Josh's branch prior to bringing it over?
Ralph
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
--
Jeff Squyres
Cisco Systems