Re: [OMPI devel] [RFC] Runtime Services Layer

Tim Prins Tue, 21 Aug 2007 14:25:33 -0400

Terry,

Thanks for the comments. Responses below.


Terry D. Dontje wrote:

I think the concept is a good idea.  A few questions that come to mind:
1. Do you have a set of APIs you plan on supporting?

Do you mean the RSL API? Or do you mean the APIs of alternative runtime systems?

The rsl API is in https://svn.open-mpi.org/svn/ompi/tmp/rsl/ompi/mca/rsl/rsl.h

As far as other runtime systems go, I have not looked too much at what others support. However, I am trying to make the APIs in the RSL as generic as possible.

2.  Are you planning on adding new APIs (not currently supported by ORTE)?

Not in the sense of new functionality, but some of the APIs are quite different from what ORTE is currently using.

3.  Do any of the ORTE replacement APIs differ in how they work?

Well, every runtime does things differently.

For instance, looking at the MPICH PMI interface (which is sort of their version of the RSL), they make heavy use of a key-value space. For the RSL, I am using process attributes, which are similar in concept but work slightly differently.

Another difference is that the RSL exposes an out-of-band communication interface, which is not provided by the PMI. So if we used a runtime that was based on the PMI, then we would have to do our own out-of-band communication within the RSL component.

4.  Will RSL change how we access information from the GPR?  If not,
     how does this layer really separate us from ORTE?

Yes, although there is already a layer of abstraction here, since the GPR usage in OMPI all goes through the modex code.

So what would happen with the RSL would be that the modex send/recv would be called, which would then call the process attribute send/recv code. Alternatively, the process attribute system could be called directly.

The process attribute system in the RSL would then use whatever implementation-specific system it wants to exchange the data.
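A minimal sketch of that layering, in C. Every name here (the sketch_ prefix, the signatures, the in-memory table) is an assumption made for illustration and is not taken from rsl.h. The modex calls are thin wrappers over a process attribute put/get, and the attribute layer is where an implementation-specific exchange would go; a static array stands in for it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is not the interface in rsl.h. */
#define SKETCH_MAX_ATTRS 16
#define SKETCH_MAX_VALUE 64

struct sketch_attr {
    int    proc;                      /* owning process (plain int for brevity) */
    char   key[32];                   /* attribute name */
    char   value[SKETCH_MAX_VALUE];   /* attribute payload */
    size_t len;                       /* payload length in bytes */
};

/* Stand-in for whatever store or exchange the RSL component would use. */
static struct sketch_attr sketch_store[SKETCH_MAX_ATTRS];
static size_t sketch_count = 0;

/* Process attribute layer: publish one attribute for a process. */
int sketch_attr_put(int proc, const char *key, const void *data, size_t len)
{
    if (sketch_count == SKETCH_MAX_ATTRS || len > SKETCH_MAX_VALUE ||
        strlen(key) >= sizeof(sketch_store[0].key)) {
        return -1;
    }
    struct sketch_attr *a = &sketch_store[sketch_count++];
    a->proc = proc;
    strcpy(a->key, key);
    memcpy(a->value, data, len);
    a->len = len;
    return 0;
}

/* Process attribute layer: look up an attribute (first match wins). */
int sketch_attr_get(int proc, const char *key, void *out, size_t cap,
                    size_t *len)
{
    for (size_t i = 0; i < sketch_count; i++) {
        const struct sketch_attr *a = &sketch_store[i];
        if (a->proc == proc && strcmp(a->key, key) == 0) {
            if (a->len > cap) {
                return -1;
            }
            memcpy(out, a->value, a->len);
            *len = a->len;
            return 0;
        }
    }
    return -1;
}

/* Modex layer: forwards to the attribute layer, keyed by component name. */
int sketch_modex_send(int my_proc, const char *component,
                      const void *data, size_t len)
{
    return sketch_attr_put(my_proc, component, data, len);
}

int sketch_modex_recv(int peer, const char *component,
                      void *out, size_t cap, size_t *len)
{
    return sketch_attr_get(peer, component, out, cap, len);
}
```

Code that wants to skip the modex could call sketch_attr_put and sketch_attr_get directly, which is the alternative mentioned above.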

5.  How will RSL handle OOB functionality (routing of messages)?

That is up to the rsl implementation. An out-of-band interface is provided, and it is the component's job to make sure the message is delivered.

6.  How does making the process names opaque differ from how ORTE
names processes? Do you still need a global namespace for a "universe"?

Again, it is up to the implementation. OMPI assumes that every process name it sees uniquely identifies a remote process. In this sense, a global process namespace would be needed. But if the rsl wanted to do some trickery to avoid the need for a global namespace, it probably could.
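To make "opaque" concrete, here is a small C sketch. The type and function names are hypothetical, and the job/rank layout is only an assumption about what one component might store. The point is that OMPI-side code would hold a pointer and ask the component to compare names, never looking at the fields itself.

```c
#include <stdlib.h>

/* What OMPI-side code would see: an incomplete type and a few operations.
 * All names are hypothetical and not taken from rsl.h. */
typedef struct sketch_proc_name sketch_proc_name_t;

/* What one component might keep privately. A different runtime could
 * store something else entirely (a string, a handle, ...). */
struct sketch_proc_name {
    unsigned job;
    unsigned rank;
};

/* Allocate a name; returns NULL if allocation fails. */
sketch_proc_name_t *sketch_name_create(unsigned job, unsigned rank)
{
    sketch_proc_name_t *n = malloc(sizeof *n);
    if (n != NULL) {
        n->job = job;
        n->rank = rank;
    }
    return n;
}

/* Nonzero if both names identify the same process. */
int sketch_name_equal(const sketch_proc_name_t *a,
                      const sketch_proc_name_t *b)
{
    return a->job == b->job && a->rank == b->rank;
}

/* Release a name obtained from sketch_name_create. */
void sketch_name_free(sketch_proc_name_t *n)
{
    free(n);
}
```

In a real split, the struct definition would live in the component's private source file, so callers could not depend on its layout.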

I like the idea but I really wonder if this will even be half-baked in time for
1.3 (same concern as Jeff's).

Understood.

Tim

--td

Tim Prins wrote:
WHAT: Solicitation of feedback on the possibility of adding a runtime services layer to Open MPI to abstract out the runtime.
WHY: To solidify the interface between OMPI and the runtime environment, and to allow the use of different runtime systems, including different versions of ORTE.
WHERE: Addition of a new framework to OMPI, and changes to many of the files in OMPI to funnel all runtime requests through this framework. Few changes should be required in OPAL and ORTE.
WHEN: Development has started in tmp/rsl, but is still in its infancy. We hope to have a working system in the next month.
TIMEOUT: 8/29/07

------
Short version:
I am working on creating an interface between OMPI and the runtime system. This would create an RSL framework in OMPI through which all runtime services would be accessed. Attached is a graphic depicting this.

This change would be invasive to the OMPI layer. Few (if any) changes will be required of the ORTE and OPAL layers.

At this point I am soliciting feedback as to whether people are supportive or not of this change, both in general and for v1.3.
Long version:
The current model used in Open MPI assumes that one runtime system is the best for all environments. However, in many environments it may be beneficial to have specialized runtime systems. With our current system this is not easy to do.

With this in mind, the idea of creating a 'runtime services layer' was hatched. This would take the form of a framework within OMPI, through which all runtime functionality would be accessed. This would allow new or different runtime systems to be used with Open MPI. Additionally, with such a system it would be possible to have multiple versions of Open RTE coexisting, which may facilitate development and testing. Finally, this would solidify the interface between OMPI and the runtime system, as well as provide documentation of each interface function and its side effects.

However, such a change would be fairly invasive to the OMPI layer, and needs buy-in from everyone for it to be possible.

Here is a summary of the changes required for the RSL (at least how it is currently envisioned):
1. Add a framework to ompi for the rsl, and a component to support orte.
2. Change ompi so that it uses the new interface. This involves:
        a. Moving runtime specific code into the orte rsl component.
        b. Changing the process names in ompi to an opaque object.
        c. Changing all references to orte in ompi to refer to the rsl.
3. Change the configuration code so that open-rte is only linked where needed.
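
As a rough illustration of step 1, a framework of this kind is commonly built as a table of function pointers per component, with one component selected at startup. The C sketch below assumes that shape. The struct, the component names ("orte_like", "other_runtime") and the selection function are all hypothetical and do not reflect the actual framework code in the tmp branch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical component descriptor: one per supported runtime. */
typedef struct {
    const char *name;          /* name used to select the component */
    int (*init)(void);         /* bring up the runtime connection */
    int (*finalize)(void);     /* tear it down */
} sketch_rsl_component_t;

/* Stub standing in for a component that wraps ORTE. */
static int orte_like_init(void)     { return 0; }
static int orte_like_finalize(void) { return 0; }

/* Stub standing in for a component that wraps some other runtime. */
static int other_init(void)     { return 0; }
static int other_finalize(void) { return 0; }

static const sketch_rsl_component_t sketch_components[] = {
    { "orte_like",     orte_like_init, orte_like_finalize },
    { "other_runtime", other_init,     other_finalize     },
};

/* Return the component with the given name, or NULL if there is none. */
const sketch_rsl_component_t *sketch_rsl_select(const char *name)
{
    size_t count = sizeof sketch_components / sizeof sketch_components[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(sketch_components[i].name, name) == 0) {
            return &sketch_components[i];
        }
    }
    return NULL;
}
```

With this shape, OMPI code would call through the selected table only, which is what would let step 3 link open-rte solely into the component that needs it.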

Of course, all this would happen on a tmp branch.
The design of the rsl is not solidified. I have been playing in a tmp branch (located at https://svn.open-mpi.org/svn/ompi/tmp/rsl) which everyone is welcome to look at and comment on, but be advised that things here are subject to change (I don't think it even compiles right now). There are some fairly large open questions on this, including:
1. How to handle mpirun (that is, when a user types 'mpirun', do they always get ORTE, or do they sometimes get a system-specific runtime). Most likely mpirun will always use ORTE, and alternative launching programs would be used for other runtimes.
2. Whether there will be any performance implications. My guess is not, but I am not quite sure of this yet.
Again, I am interested in people's comments on whether they think adding such abstraction is good or not, and whether it is reasonable to do such a thing for v1.3.
Thanks,

Tim Prins

------------------------------------------------------------------------

_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel