Re: [OMPI devel] [devel-core] [RFC] Runtime Services Layer

2007-08-19 Thread Tim Prins
On Friday 17 August 2007 10:53:41 am Richard Graham wrote:
> Tim,
>   This looks like a good idea, and is a good step toward componentizing the
> run-time services that are available from the MPI's perspective.
>   A few comments:
>  - It is a good idea to play around in a sandbox to see what may or may not
> work - otherwise we are just guessing.  However, this is driven by the
> current code structure, and may or may not align with longer term plans.
> What is needed here, I believe, is a deliberate design - i.e. figure out
> where we want to go, and then see if anything in the implementation needs
> to change before it is moved over to the trunk.
This is a good point. I have tried to make the design reflect exactly what we
need from a generic runtime system, rather than just copying how we currently
use orte. However, some things currently in the RSL are somewhat orte specific
(e.g. multiple init 'stages'), but these should not interfere with other
runtimes being used (since the runtimes can just use no-ops for unneeded
stages).
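
To illustrate the no-op idea, here is a minimal sketch. All names in it
(rsl_component_t, init_stage1, rsl_noop, and so on) are hypothetical and are
not taken from the code in tmp/rsl; it only assumes that a component exposes
one function pointer per init stage.

```c
#include <stdio.h>

#define RSL_SUCCESS 0

/* Hypothetical signature shared by every init stage. */
typedef int (*rsl_stage_fn)(void);

/* Hypothetical component table: one entry per init stage. */
typedef struct {
    const char  *name;
    rsl_stage_fn init_stage1;
    rsl_stage_fn init_stage2;
} rsl_component_t;

/* Counts stages that did real work, so the effect is observable. */
static int stages_run = 0;

/* A stage that a runtime does not need: succeed and do nothing. */
static int rsl_noop(void) { return RSL_SUCCESS; }

static int real_stage(void) { ++stages_run; return RSL_SUCCESS; }

/* A runtime that needs both stages (as the orte component might). */
static const rsl_component_t rsl_two_stage = {
    "two_stage", real_stage, real_stage
};

/* A runtime that needs only the first stage; the second is a no-op. */
static const rsl_component_t rsl_one_stage = {
    "one_stage", real_stage, rsl_noop
};

/* The caller runs every stage in order and never needs to know which
 * stages a given runtime actually implements. */
int rsl_init_all(const rsl_component_t *c)
{
    int rc = c->init_stage1();
    if (rc != RSL_SUCCESS) {
        return rc;
    }
    return c->init_stage2();
}
```

Calling rsl_init_all(&rsl_one_stage) goes through the same path as the
two-stage component, so the calling code stays identical for both runtimes.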

>  - We are where we are, and can't just throw it away (it could even be
> exactly what we want), so even if our "ideal" is different than our current
> state, I believe incremental change is the way to go, to preserve an
> operating code base.
Agreed. I think the implementation of the rsl can be done with minimal risk of 
breaking things, since it requires little (if any) change in opal and orte, 
and mostly minor changes in ompi.
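
One of the ompi changes listed in the RFC quoted below is turning process
names into an opaque object. A rough sketch of what that could look like
follows. The type name, the accessor functions and the job/rank fields are all
assumptions made for illustration, not the actual RSL interface.

```c
#include <stdint.h>
#include <stdlib.h>

/* What ompi-level code would see: an incomplete type plus accessors.
 * (Hypothetical names; the real interface may differ.) */
typedef struct rsl_proc_name rsl_proc_name_t;

rsl_proc_name_t *rsl_proc_name_create(uint32_t job, uint32_t rank);
int rsl_proc_name_compare(const rsl_proc_name_t *a,
                          const rsl_proc_name_t *b);
void rsl_proc_name_free(rsl_proc_name_t *name);

/* What a single rsl component would keep private. Another runtime could
 * use a completely different layout without any change in ompi. */
struct rsl_proc_name {
    uint32_t job;   /* assumed field, for illustration only */
    uint32_t rank;  /* assumed field, for illustration only */
};

rsl_proc_name_t *rsl_proc_name_create(uint32_t job, uint32_t rank)
{
    rsl_proc_name_t *name = malloc(sizeof(*name));
    if (name != NULL) {
        name->job = job;
        name->rank = rank;
    }
    return name;
}

/* Returns -1, 0 or 1, ordering by job first and then by rank. */
int rsl_proc_name_compare(const rsl_proc_name_t *a,
                          const rsl_proc_name_t *b)
{
    if (a->job != b->job) {
        return a->job < b->job ? -1 : 1;
    }
    if (a->rank != b->rank) {
        return a->rank < b->rank ? -1 : 1;
    }
    return 0;
}

void rsl_proc_name_free(rsl_proc_name_t *name)
{
    free(name);
}
```

In a real tree the declarations would live in a header and the struct
definition inside the component, so ompi code could only go through the
accessors.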

>   - I think it is way too early to talk about moving things over to the
> trunk in the next month or so, unless sufficient evaluation can be
> done in a month or so.  This is not to discourage you at all, but just to
> caution against moving too fast, and then having to redo things.
>   I am very supportive of this, I do believe this is the right way to go,
> unless someone else can come up with a better idea, and time to implement.

Thanks for the comments,

Tim

>
> Thanks,
> Rich
>
> On 8/16/07 9:47 PM, "Tim Prins" wrote:
> > WHAT: Solicitation of feedback on the possibility of adding a runtime
> > services layer to Open MPI to abstract out the runtime.
> >
> > WHY: To solidify the interface between OMPI and the runtime environment,
> > and to allow the use of different runtime systems, including different
> > versions of ORTE.
> >
> > WHERE: Addition of a new framework to OMPI, and changes to many of the
> > files in OMPI to funnel all runtime requests through this framework. Few
> > changes should be required in OPAL and ORTE.
> >
> > WHEN: Development has started in tmp/rsl, but is still in its infancy. We
> > hope to have a working system in the next month.
> >
> > TIMEOUT: 8/29/07
> >
> > --
> > Short version:
> >
> > I am working on creating an interface between OMPI and the runtime
> > system. This would create an RSL framework in OMPI through which all runtime
> > services would be accessed. Attached is a graphic depicting this.
> >
> > This change would be invasive to the OMPI layer. Few (if any) changes
> > will be required of the ORTE and OPAL layers.
> >
> > At this point I am soliciting feedback as to whether people are
> > supportive or not of this change both in general and for v1.3.
> >
> >
> > Long version:
> >
> > The current model used in Open MPI assumes that one runtime system is
> > the best for all environments. However, in many environments it may be
> > beneficial to have specialized runtime systems. With our current system
> > this is not easy to do.
> >
> > With this in mind, the idea of creating a 'runtime services layer' was
> > hatched. This would take the form of a framework within OMPI, through
> > which all runtime functionality would be accessed. This would allow new
> > or different runtime systems to be used with Open MPI. Additionally, with
> > such a system it would be possible to have multiple versions of open rte
> > coexisting, which may facilitate development and testing. Finally, this
> > would solidify the interface between OMPI and the runtime system, as well
> > as document each interface function and its side effects.
> >
> > However, such a change would be fairly invasive to the OMPI layer, and
> > needs a buy-in from everyone for it to be possible.
> >
> > Here is a summary of the changes required for the RSL (at least how it is
> > currently envisioned):
> >
> > 1. Add a framework to ompi for the rsl, and a component to support orte.
> > 2. Change ompi so that it uses the new interface. This involves:
> >  a. Moving runtime specific code into the orte rsl component.
> >  b. Changing the process names in ompi to an opaque object.
> >  c. Changing all references to orte in ompi to refer to the rsl.
> > 3. Change the configuration code so that open-rte is only linked where
> > needed.
> >
> > Of course, all this would happen on a tmp branch.
> >
> > The design of the rsl is not solidified. I have been playing in a tmp
> > branch (located at https://svn.op

Re: [OMPI devel] [devel-core] [RFC] Runtime Services Layer

2007-08-19 Thread Tim Prins
On Friday 17 August 2007 08:40:01 am Jeff Squyres wrote:
> I am definitely interested to see what the RSL turns out to be; I
> think it has many potential benefits.  There are also some obvious
> issues to be worked out (e.g., mpirun and friends).
Yeah, thinking through this and talking to others, it seems like the best way 
to deal with this is to say that mpirun points to our default runtime (orte), 
and that to use any other rsl component, you have to use that system's 
specific launcher (could be a 'srun', or a 'mpirun-foobar', whatever the 
system wants to do).

>
> As for whether this should go in v1.3, I don't know if it's possible
> to say yet -- it will depend on when RSL becomes [at least close to]
> ready, what the exact schedule for v1.3 is (which we've been skittish
> to define, since we're going for a feature-driven release), etc.

I agree that it is impossible to say right now, but wanted to throw it out 
there for people to consider/think about. 

Tim

>
> On Aug 16, 2007, at 9:47 PM, Tim Prins wrote:
> > WHAT: Solicitation of feedback on the possibility of adding a runtime
> > services layer to Open MPI to abstract out the runtime.
> >
> > WHY: To solidify the interface between OMPI and the runtime environment,
> > and to allow the use of different runtime systems, including different
> > versions of ORTE.
> >
> > WHERE: Addition of a new framework to OMPI, and changes to many of the
> > files in OMPI to funnel all runtime requests through this framework. Few
> > changes should be required in OPAL and ORTE.
> >
> > WHEN: Development has started in tmp/rsl, but is still in its infancy. We
> > hope to have a working system in the next month.
> >
> > TIMEOUT: 8/29/07
> >
> > --
> > Short version:
> >
> > I am working on creating an interface between OMPI and the runtime
> > system. This would create an RSL framework in OMPI through which all runtime
> > services would be accessed. Attached is a graphic depicting this.
> >
> > This change would be invasive to the OMPI layer. Few (if any) changes
> > will be required of the ORTE and OPAL layers.
> >
> > At this point I am soliciting feedback as to whether people are
> > supportive or not of this change both in general and for v1.3.
> >
> >
> > Long version:
> >
> > The current model used in Open MPI assumes that one runtime system is
> > the best for all environments. However, in many environments it may be
> > beneficial to have specialized runtime systems. With our current system
> > this is not easy to do.
> >
> > With this in mind, the idea of creating a 'runtime services layer' was
> > hatched. This would take the form of a framework within OMPI, through
> > which all runtime functionality would be accessed. This would allow new
> > or different runtime systems to be used with Open MPI. Additionally, with
> > such a system it would be possible to have multiple versions of open rte
> > coexisting, which may facilitate development and testing. Finally, this
> > would solidify the interface between OMPI and the runtime system, as well
> > as document each interface function and its side effects.
> >
> > However, such a change would be fairly invasive to the OMPI layer, and
> > needs a buy-in from everyone for it to be possible.
> >
> > Here is a summary of the changes required for the RSL (at least how it is
> > currently envisioned):
> >
> > 1. Add a framework to ompi for the rsl, and a component to support orte.
> > 2. Change ompi so that it uses the new interface. This involves:
> >  a. Moving runtime specific code into the orte rsl component.
> >  b. Changing the process names in ompi to an opaque object.
> >  c. Changing all references to orte in ompi to refer to the rsl.
> > 3. Change the configuration code so that open-rte is only linked where
> > needed.
> >
> > Of course, all this would happen on a tmp branch.
> >
> > The design of the rsl is not solidified. I have been playing in a tmp
> > branch (located at https://svn.open-mpi.org/svn/ompi/tmp/rsl) which
> > everyone is welcome to look at and comment on, but be advised that things
> > here are subject to change (I don't think it even compiles right now).
> > There are some fairly large open questions on this, including:
> >
> > 1. How to handle mpirun (that is, when a user types 'mpirun', do they
> > always get ORTE, or do they sometimes get a system specific runtime). Most
> > likely mpirun will always use ORTE, and alternative launching programs
> > would be used for other runtimes.
> > 2. Whether there will be any performance implications. My guess is not,
> > but I am not quite sure of this yet.
> >
> > Again, I am interested in people's comments on whether they think adding
> > such abstraction is good or not, and whether it is reasonable to do such a
> > thing for v1.3.
> >
> > Thanks,
> >
> > Tim Prins
> > 