FWIW, Josh implemented "MCA-NULL" in https://svn.open-mpi.org/trac/ompi/changeset/18364 . I'm not sure how I feel about this solution. On the one hand, it's kind of a hack-ish way of solving the immediate issue. On the other hand, it's really a larger issue of explicitly *not* setting an MCA param (or knowing what source an MCA value originated from, depending on how you look at it), something that we've never taken the time to address properly. If we continue to not solve the larger issue, it's going to come up again someday and someone will add yet another workaround.

Two questions, one on each front:

- I'm not entirely sure I understand the specific ORTE issue. Is it that you want one "plm" MCA param value for mpirun and another value for other processes (i.e., the orteds)? Or, more specifically, you want plm X in mpirun, and *no* PLMs in the orteds?

- Would adding an enum indicating where an MCA value was retrieved from help this situation? E.g., MCA_PARAM_ENVIRONMENT, MCA_PARAM_FILE, MCA_PARAM_DEFAULT?
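
A minimal sketch of what such an enum might look like. The three MCA_PARAM_* names come from the question above; MCA_PARAM_UNSET, the sketch_param_t struct, and sketch_param_resolve() are hypothetical and are not part of the existing MCA parameter API. The precedence shown (environment over file over default) is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Where a parameter value was retrieved from. MCA_PARAM_UNSET is a
 * hypothetical extra value meaning "nobody set this parameter". */
typedef enum {
    MCA_PARAM_UNSET = 0,
    MCA_PARAM_DEFAULT,
    MCA_PARAM_FILE,
    MCA_PARAM_ENVIRONMENT
} mca_param_source_t;

/* Hypothetical record pairing a value with its source. */
typedef struct {
    const char *value;
    mca_param_source_t source;
} sketch_param_t;

/* Hypothetical resolver: pick the first value that is present,
 * assuming environment beats file and file beats the default. */
sketch_param_t sketch_param_resolve(const char *env_value,
                                    const char *file_value,
                                    const char *default_value)
{
    sketch_param_t result = { NULL, MCA_PARAM_UNSET };

    if (env_value != NULL) {
        result.value = env_value;
        result.source = MCA_PARAM_ENVIRONMENT;
    } else if (file_value != NULL) {
        result.value = file_value;
        result.source = MCA_PARAM_FILE;
    } else if (default_value != NULL) {
        result.value = default_value;
        result.source = MCA_PARAM_DEFAULT;
    }

    /* Invariant: a recorded source always comes with a value. */
    assert(result.value != NULL || result.source == MCA_PARAM_UNSET);
    return result;
}
```

With something like this, a caller could tell "plm was never set" (MCA_PARAM_UNSET) apart from "plm was set in a file" or "plm was set in the environment", instead of inferring it from the value alone.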


On May 3, 2008, at 12:02 PM, George Bosilca wrote:


The problem: The orteds open all plm components before discarding most of them, even when "--mca plm rsh" was given on the mpirun command line.

The non-problem: In the mpirun process, only the rsh plm is opened, as mpirun is the only process that gets the "--mca plm rsh" information. As this specific argument is not included in the list of arguments we forward to the orted processes, there is no way the orteds can abide by the imposed restriction. Note that if the restriction is put in the config file, then even the orteds respect it. So far the only problem I can see here is that the orteds are opening a framework that they are not supposed to (at least not in most cases).

When we implemented the MCA filtering stuff, we proposed another optimization: a default component for all special frameworks (i.e., those used or not depending on the type of process) that would be statically linked into the library (and therefore would not generate any NFS traffic). Its only job was to run the selection logic the first time any of its functions was called; in other words, an on-demand component loading feature. From then on, a real component would be selected, and all further calls to the default component would be directed to the selected one. I clearly remember that Ralph was completely against this feature for two reasons: 1) all components in the ORTE framework had to be loaded anyway and would do the "if(!hnp) return NULL" check themselves; 2) he proposed implementing the null component instead.
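
A minimal sketch of the on-demand idea described above, under stated assumptions: every name here (sketch_module_t, sketch_select, sketch_default_launch) is hypothetical and none of it is actual ORTE code. The statically linked default entry point runs the selection logic on its first call and forwards that call, and every later one, to whatever was selected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical module interface with a single entry point. */
typedef struct {
    int (*launch)(int nprocs);
} sketch_module_t;

/* Stand-in for a "real" component that would normally be a DSO. */
static int real_launch(int nprocs) { return nprocs; }
static sketch_module_t real_module = { real_launch };

/* Counts how often the selection logic runs (for illustration only). */
static int sketch_select_calls = 0;

/* Hypothetical selection logic; a real one would open and query
 * the eligible components here. */
static sketch_module_t *sketch_select(void)
{
    sketch_select_calls++;
    return &real_module;
}

/* Module chosen so far; NULL until the first call arrives. */
static sketch_module_t *sketch_selected = NULL;

/* Entry point of the statically linked default component: select
 * lazily on first use, then forward to the selected module. */
int sketch_default_launch(int nprocs)
{
    if (sketch_selected == NULL) {
        sketch_selected = sketch_select();
        assert(sketch_selected != NULL);
    }
    return sketch_selected->launch(nprocs);
}
```

A process that never calls sketch_default_launch() never triggers selection, so in this sketch it would never touch the other components at all.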

I was, and still am, against 1), so I guess any effort toward implementing a null or none component will have my support.

 george.

On May 2, 2008, at 4:40 PM, Josh Hursey wrote:

We could also call it 'null' for the empty set of components? Or maybe OMPI-NULL.

Outside of the naming, do others think this is a useful feature to implement?

-- Josh

On May 2, 2008, at 10:51 AM, Ralph Castain wrote:

I would think that adding a special keyword would be the correct method. I would suggest something with an "ompi" in it, perhaps capitalized so there is no confusion...something like "OMPI-NONE"?


On 5/2/08 8:37 AM, "Josh Hursey" <jjhur...@open-mpi.org> wrote:

I don't believe we have the logic in place to tell mca_component_open
'do not open anything'. (I could be wrong though).

Adding such an option might be useful, but we would have to consider how that option should be specified by the user. Currently, if you do not set a value (leave empty space in mca-params.conf) then the MCA system takes this to indicate that all components are eligible for selection. If you specify any options then only those options should be opened. We could add a special keyword (such as 'none') to indicate 'open nothing'.
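
A minimal sketch of the three cases described above, assuming the keyword ends up being the literal string 'none'. The function sketch_should_open() and the SKETCH_OPEN_NOTHING macro are hypothetical, not existing MCA code, and the sketch deliberately ignores the '^' exclusion syntax.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical keyword meaning "open nothing". */
#define SKETCH_OPEN_NOTHING "none"

/* Decide whether `component` should be opened given the framework's
 * parameter value `spec` (a comma-separated list, possibly unset). */
bool sketch_should_open(const char *spec, const char *component)
{
    size_t name_len;
    const char *p;

    assert(component != NULL);

    /* Unset or empty value: every component is eligible. */
    if (spec == NULL || spec[0] == '\0') {
        return true;
    }

    /* Special keyword: open no component at all. */
    if (strcmp(spec, SKETCH_OPEN_NOTHING) == 0) {
        return false;
    }

    /* Otherwise only the components named in the list are opened. */
    name_len = strlen(component);
    p = spec;
    while (*p != '\0') {
        size_t tok_len = strcspn(p, ",");
        if (tok_len == name_len && strncmp(p, component, name_len) == 0) {
            return true;
        }
        p += tok_len;
        if (*p == ',') {
            p++;  /* skip the separator */
        }
    }
    return false;
}
```

One design consequence: a real component could then never be named 'none', which is part of why a more distinctive keyword might be preferable.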

What do people think about that?

-- Josh


On May 2, 2008, at 10:22 AM, Ralph Castain wrote:

I see what the problem is. In the case of slurm, I don't want -any- components to be opened, even though I am going to call plm open/select. I have to leave that logic in place for those environments that -do- want to specify some backend secondary launcher.

So the question is: how do I tell mca_component_open "do not open
anything"?

If we don't have a mechanism for doing that, can we create one?


On 5/2/08 8:02 AM, "Ralph Castain" <r...@lanl.gov> wrote:

Well, I have a current version of the trunk. I add an MCA param to the environment indicating that only rsh is to be used by the orted. Yet I get an output from every orted indicating that slurm (misspelled!) is available for selection.

This tells me that the slurm component is being opened, even though the param is set.

I can check again to ensure that the param is set...


On 5/2/08 7:53 AM, "Jeff Squyres" <jsquy...@cisco.com> wrote:

(moving to devel list for wider audience)

Hmm. I thought the UTK stuff from a while ago supposedly changed this behavior to only open the components that were specifically requested.

This behavior looks like the *original* MCA behavior -- open them all, then discard what we don't want (but doesn't necessarily reclaim the memory because of how dlclose works).


On May 2, 2008, at 9:48 AM, Ralph Castain wrote:

Yo guys

I've noticed something on the trunk that just doesn't strike me as correct. If I specify "-mca plm rsh", it is my expectation that (a) only the rsh component will be opened, and (b) only the rsh module will be selected, unless that component indicates that it cannot run.

What I am seeing, though, is that -all- the plm components are being opened. This is not only unnecessary, but consumes memory and leads to concern over whether or not some other module could become active.

Is this the intended behavior? If so, may I suggest we change it in Josh's branch prior to bringing it over?

Ralph





_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




--
Jeff Squyres
Cisco Systems
