Re: [OMPI devel] [devel-core] [RFC] Runtime Services Layer

2007-08-28 Thread Ralph H Castain



On 8/27/07 7:30 AM, "Tim Prins" wrote:

> Ralph,
> 
> Ralph H Castain wrote:
>> Just returned from vacation...sorry for delayed response
> No Problem. Hope you had a good vacation :) And sorry for my super
> delayed response. I have been pondering this a bit.
> 
>> In the past, I have expressed three concerns about the RSL.
>> 
>> 
>> My bottom line recommendation: I have no philosophical issue with the RSL
>> concept. However, I recommend holding off until the next version of ORTE is
>> completed and then re-evaluating to see how valuable the RSL might be, as
>> that next version will include memory footprint reduction and framework
>> consolidation that may yield much of the RSL's value without the extra work.
>> 
>> 
>> Long version:
>> 
>> 1. What problem are we really trying to solve?
>> If the RSL is intended to solve the Cray support problem (where the Cray OS
>> really just wants to see OMPI, not ORTE), then it may have some value. The
>> issue to date has revolved around the difficulty of maintaining the Cray
>> port in the face of changes to ORTE - as new frameworks are added, special
>> components for Cray also need to be created to provide a "do-nothing"
>> capability. In addition, the Cray is memory constrained, and the ORTE
>> library occupies considerable space while providing very little
>> functionality.
> This is definitely a motivation, but not the only one.

So...what are the others?

> 
>> The degree of value provided by the RSL will therefore depend somewhat on the
>> efficacy of the changes in development within ORTE. Those changes will,
>> among other things, significantly consolidate and reduce the number of
>> frameworks, and reduce the memory footprint. The expectation is that the
>> result will require only a single CNOS component in one framework. It isn't
>> clear, therefore, that the RSL will provide a significant value in that
>> environment.
> But won't there still be a lot of orte code linked in that will never be
> used?

Not really. The only thing left would be the stuff in runtime and util.

We have talked for years about creating an ORTE "services" framework -
basically, combining what is now in the runtime and util directories into a
single framework ala "svcs". The notion was that everything OS-specific
would go in there. What has held up implementation is (a) some thought that
maybe those things should go into OPAL instead of ORTE, and (b) low priority
and more important things to do.

However, if someone went ahead and implemented that idea, then you would
have a "NULL" component in the base that basically does a "no-op", and a
"default" component that provides actual services. Thus, for CNOS, you would
take the NULL component (so you don't open the framework's components and
avoid that memory overhead), and away you go.

I don't see how the RSL does anything better. Admittedly, you wouldn't have
to maintain the svcs APIs, but that doesn't seem any more onerous than
maintaining the RSL APIs as we change the MPI/RTE interfaces.
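
The NULL/default component arrangement described above can be sketched in C roughly as follows. This is only an illustration under assumed names: `svcs_module_t`, `svcs_select`, and the `get_hostname` service are hypothetical and are not taken from the ORTE source, and the "default" component returns a placeholder string where a real one would query the OS.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical "svcs" framework interface: one table of function pointers.
 * None of these names exist in ORTE; they only illustrate the idea. */
typedef struct {
    const char *name;
    int (*init)(void);
    int (*get_hostname)(char *buf, size_t len);
    int (*finalize)(void);
} svcs_module_t;

/* NULL component: every entry point succeeds without doing any work. */
static int null_init(void) { return 0; }
static int null_get_hostname(char *buf, size_t len)
{
    assert(buf != NULL);
    if (len > 0) {
        buf[0] = '\0';              /* report an empty name */
    }
    return 0;
}
static int null_finalize(void) { return 0; }

static const svcs_module_t svcs_null = {
    "null", null_init, null_get_hostname, null_finalize
};

/* Default component: provides the actual service. */
static int default_initialized = 0;

static int default_init(void) { default_initialized = 1; return 0; }
static int default_get_hostname(char *buf, size_t len)
{
    /* Placeholder value; a real component would ask the operating system. */
    static const char placeholder[] = "localhost";
    assert(buf != NULL);
    if (!default_initialized || len < sizeof(placeholder)) {
        return -1;
    }
    memcpy(buf, placeholder, sizeof(placeholder));
    return 0;
}
static int default_finalize(void) { default_initialized = 0; return 0; }

static const svcs_module_t svcs_default = {
    "default", default_init, default_get_hostname, default_finalize
};

/* Selection: on CNOS take the NULL component directly, so the framework's
 * other components never need to be opened. */
const svcs_module_t *svcs_select(int on_cnos)
{
    return on_cnos ? &svcs_null : &svcs_default;
}
```

A caller would invoke `svcs_select(1)` on CNOS and `svcs_select(0)` elsewhere, then go through the returned table for every service call, so the calling code is identical on both platforms.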

> 
> Also, an RSL would simplify ORTE in that there would be no need to do
> anything special for CNOS in it.

But if all I do is remove the ORTE cnos component and add an RSL cnos
component...what have I simplified?

> 
>> 
>> If the RSL is intended to aid in ORTE development, as hinted at in the RFC,
>> then I believe that is questionable. Developing ORTE in a tmp branch has
>> proven reasonably effective as changes to the MPI layer are largely
>> invisible to ORTE. Creating another layer to the system that would also have
>> to be maintained seems like a non-productive way of addressing any problems
>> in that area.
> Whether or not it would help in orte development remains to be seen. I
> just say that it might. Although I would argue that developing in tmp
> branches has caused a lot of problems with merging, etc.

Guess I don't see how this would solve the merge problems...but whatever.

> 
>> If the RSL is intended as a means of "freezing" the MPI-RTE interface, then
>> I believe we could better attain that objective by simply defining a set of
>> requirements for the RTE. As I'll note below, freezing the interface at an
>> API level could negatively impact other Open MPI objectives.
> It is intended to easily allow the development and use of other runtime
> systems, so simply defining requirements is not enough.

Could you please give some examples of these other runtimes?? Or is this
just hypothetical at this time?


> 
>> 2. Who is going to maintain old RTE versions, and why?
>> It isn't clear to me why anyone would want to do this - are we seriously
>> proposing that we maintain support for the ORTE layer that shipped with Open
>> MPI 1.0?? Can someone explain why we would want to do that?
> I highly doubt anyone would, and see no reason to include support for
> older runtime versions. Again, the purpose is to be able to run
> different runtimes. The ability to run different versions of the same
> runtime 

Re: [OMPI devel] [devel-core] [RFC] Runtime Services Layer

2007-08-27 Thread Tim Prins

Ralph,

Ralph H Castain wrote:

> Just returned from vacation...sorry for delayed response

No problem. Hope you had a good vacation :) And sorry for my super
delayed response. I have been pondering this a bit.

> In the past, I have expressed three concerns about the RSL.
>
> My bottom line recommendation: I have no philosophical issue with the RSL
> concept. However, I recommend holding off until the next version of ORTE is
> completed and then re-evaluating to see how valuable the RSL might be, as
> that next version will include memory footprint reduction and framework
> consolidation that may yield much of the RSL's value without the extra work.
>
> Long version:
>
> 1. What problem are we really trying to solve?
> If the RSL is intended to solve the Cray support problem (where the Cray OS
> really just wants to see OMPI, not ORTE), then it may have some value. The
> issue to date has revolved around the difficulty of maintaining the Cray
> port in the face of changes to ORTE - as new frameworks are added, special
> components for Cray also need to be created to provide a "do-nothing"
> capability. In addition, the Cray is memory constrained, and the ORTE
> library occupies considerable space while providing very little
> functionality.

This is definitely a motivation, but not the only one.

> The degree of value provided by the RSL will therefore depend somewhat on the
> efficacy of the changes in development within ORTE. Those changes will,
> among other things, significantly consolidate and reduce the number of
> frameworks, and reduce the memory footprint. The expectation is that the
> result will require only a single CNOS component in one framework. It isn't
> clear, therefore, that the RSL will provide a significant value in that
> environment.

But won't there still be a lot of orte code linked in that will never be
used?

Also, an RSL would simplify ORTE in that there would be no need to do
anything special for CNOS in it.

> If the RSL is intended to aid in ORTE development, as hinted at in the RFC,
> then I believe that is questionable. Developing ORTE in a tmp branch has
> proven reasonably effective as changes to the MPI layer are largely
> invisible to ORTE. Creating another layer to the system that would also have
> to be maintained seems like a non-productive way of addressing any problems
> in that area.

Whether or not it would help in orte development remains to be seen. I
just say that it might. Although I would argue that developing in tmp
branches has caused a lot of problems with merging, etc.

> If the RSL is intended as a means of "freezing" the MPI-RTE interface, then
> I believe we could better attain that objective by simply defining a set of
> requirements for the RTE. As I'll note below, freezing the interface at an
> API level could negatively impact other Open MPI objectives.

It is intended to easily allow the development and use of other runtime
systems, so simply defining requirements is not enough.

> 2. Who is going to maintain old RTE versions, and why?
> It isn't clear to me why anyone would want to do this - are we seriously
> proposing that we maintain support for the ORTE layer that shipped with Open
> MPI 1.0?? Can someone explain why we would want to do that?

I highly doubt anyone would, and see no reason to include support for
older runtime versions. Again, the purpose is to be able to run
different runtimes. The ability to run different versions of the same
runtime is just a side-effect.

> 3. Are we constraining ourselves from further improvements in startup
> performance?
> This is my biggest area of concern. The RSL has been proposed as an
> API-level definition. However, the MPI-RTE interaction really is defined in
> terms of a flow-of-control - although each point of interaction is
> instantiated as an API, the fact is that what happens at that point is not
> independent of all prior interactions.
>
> As an example of my concern, consider what we are currently doing with ORTE.
> The latest change in requirements involves the need to significantly improve
> startup time, reduce memory footprint, and reduce ORTE complexity. What we
> are doing to meet that requirement is to review the delineation of
> responsibilities between the MPI and RTE layers. The current delineation
> evolved over time, with many of the decisions made at a very early point in
> the program. For example, we instituted RTE-level stage gates in the MPI
> layer because, at the time they were needed, the MPI developers didn't want
> to deal with them on their side (e.g., ensuring that failure of one proc
> wouldn't hang the system). Given today's level of maturity in the MPI layer,
> we are now planning on moving the stage gates to the MPI layer, implemented
> as an "all-to-all" - this will remove several thousand lines of code from
> ORTE and make it easier for the MPI layer to operate on non-ORTE
> environments.
>
> Similar efforts are underway to reduce ORTE involvement in the modex
> operation and other parts of the MPI application lifecycle. We are able to
> do these things because we are

Re: [OMPI devel] [devel-core] [RFC] Runtime Services Layer

2007-08-24 Thread Doug Tody
On Fri, 24 Aug 2007, George Bosilca wrote:

> On Aug 24, 2007, at 9:50 AM, Tim Prins wrote:
> > I do not understand why a user should have to use a RTE which supports
> > every system ever imagined, and provides every possible fault-tolerant
> > feature, when all they want is a thin RTE.
> 
> We have all the ingredients to make a thin RTE layer, i.e. loadable
> modules. The approach we proposed a few months ago, to load a component
> only when we know it will be needed, gives us a very slim RTE (once
> applied everywhere it makes sense). The biggest problem I see here is
> that we will start scattering our efforts on multiple things instead
> of working together to make what we have right now the best it can be.

I'm all for focusing effort on ORTE and making it the best it can
be, but it would seem that a more formalized component-framework
interface between the MPI layer and all of ORTE could potentially
help to achieve this.

What would be ideal would be if the OpenMPI project could define
such an interface, and also provide and support a standard reference
version of ORTE which implements this functionality.  This could
provide the OpenMPI project with the minimal/stable run time layer it
needs, but at the same time make it much easier for outside projects
with other requirements to experiment with enhanced versions of ORTE,
without having to worry about the impact on core OpenMPI development.
This need not splinter the effort, rather it might make it possible for
others outside the core OpenMPI development team to more effectively
contribute to and use OpenMPI and ORTE, in particular when it comes
to integration of the software into new environments.

- Doug

National Radio Astronomy Observatory (NRAO)
US National Virtual Observatory (NVO)


Re: [OMPI devel] [devel-core] [RFC] Runtime Services Layer

2007-08-24 Thread George Bosilca


On Aug 24, 2007, at 9:50 AM, Tim Prins wrote:


>> Again, my main concern is about fault tolerance. There is nothing in
>> PMI (and nothing in RSL so far) that allows any kind of fault
>> tolerance [And believe me re-writing the MPICH mpirun to allow
>> checkpoint/restart is a hassle].
>
> I am open to any extensions that are needed. Again, the current version
> is designed as a starting point. Also, I have been talking a lot with
> Josh and the current RSL is more than enough to support
> checkpoint/restart as currently implemented. I would be interested in
> talking about any additions that are needed.

Right, but that's a side effect. The coordinated checkpoint is not
very intrusive, it only requires a limited set of capabilities, which
are usually delivered by all RTEs. However, if you look just a little
bit further, at uncoordinated checkpoint (where only one of the
processes has to be restarted and join the others in their old
"world"), you will notice that the current interface (RSL or PMI)
will not support this.

>> Moreover, your approach seems to
>> open the possibility of having heterogeneous RTE (in terms of
>> features) which in my view is definitively the wrong approach.
>
> Do you mean having different RTEs that support different features?
> Personally I do not see this as a horrible thing. In fact, we already
> deal with this problem, since different systems support different
> things. For instance, we support comm_spawn on most systems, but not all.

This is again a side effect of the inability of the underlying
systems to provide the most elementary features we need. But, with
ORTE at least we have the potential to overcome these limitations.

> I do not understand why a user should have to use a RTE which supports
> every system ever imagined, and provides every possible fault-tolerant
> feature, when all they want is a thin RTE.

We have all the ingredients to make a thin RTE layer, i.e. loadable
modules. The approach we proposed a few months ago, to load a component
only when we know it will be needed, gives us a very slim RTE (once
applied everywhere it makes sense). The biggest problem I see here is
that we will start scattering our efforts on multiple things instead
of working together to make what we have right now the best it can be.

  george.




Re: [OMPI devel] [devel-core] [RFC] Runtime Services Layer

2007-08-24 Thread Brian Barrett

On Aug 24, 2007, at 9:08 AM, George Bosilca wrote:


> By heterogeneous RTE I was talking about what will happen once we
> have the RSL. Different back-ends will support different features, so
> from the user perspective we will not provide a homogeneous execution
> environment in all situations. On the other hand, focusing our
> efforts in ORTE will guarantee this homogeneity in all cases.

Is this a good thing?  I think no, and we already don't have it.  On
Cray, we don't use mpirun but yod.  Livermore wants us to use SLURM
directly instead of our mpirun kludge.  Those are heterogeneous from
the user perspective.  But they are also what the user expects on those
platforms.


Brian


Re: [OMPI devel] [devel-core] [RFC] Runtime Services Layer

2007-08-24 Thread Tim Prins

George Bosilca wrote:
> Looks like I'm the only one barely excited about this idea. The
> system that you described is well known. It has been around for around
> 10 years, and it's called PMI. The interface you have in the tmp
> branch as well as the description you gave in your email are more
> than similar to what they sketch in the following two documents:
>
> http://www-unix.mcs.anl.gov/mpi/mpich/developer/design/pmiv2draft.htm
> http://www-unix.mcs.anl.gov/mpi/mpich/developer/design/pmiv2.htm

Yes, I am well acquainted with these documents, and the PMI did provide
a lot of inspiration for the RSL.

> Now, there is something wrong with reinventing the wheel if there are
> no improvements. And so far I'm unable to notice any major
> improvement either compared with PMI or with what we have today
> (except maybe being able to use PMI inside Open MPI).

This is true. The RSL is designed to handle exactly what we need right
now. This does not mean that the interface cannot be extended later. The
current RSL is a starting point.

> Again, my main concern is about fault tolerance. There is nothing in
> PMI (and nothing in RSL so far) that allows any kind of fault
> tolerance [And believe me re-writing the MPICH mpirun to allow
> checkpoint/restart is a hassle].

I am open to any extensions that are needed. Again, the current version
is designed as a starting point. Also, I have been talking a lot with
Josh and the current RSL is more than enough to support
checkpoint/restart as currently implemented. I would be interested in
talking about any additions that are needed.

> Moreover, your approach seems to
> open the possibility of having heterogeneous RTE (in terms of
> features) which in my view is definitively the wrong approach.

Do you mean having different RTEs that support different features?
Personally I do not see this as a horrible thing. In fact, we already
deal with this problem, since different systems support different
things. For instance, we support comm_spawn on most systems, but not all.

I do not understand why a user should have to use a RTE which supports
every system ever imagined, and provides every possible fault-tolerant
feature, when all they want is a thin RTE.


Tim




Re: [OMPI devel] [devel-core] [RFC] Runtime Services Layer

2007-08-24 Thread Terry D. Dontje

George Bosilca wrote:

> Looks like I'm the only one barely excited about this idea. The
> system that you described is well known. It has been around for around
> 10 years, and it's called PMI. The interface you have in the tmp
> branch as well as the description you gave in your email are more
> than similar to what they sketch in the following two documents:
>
> http://www-unix.mcs.anl.gov/mpi/mpich/developer/design/pmiv2draft.htm
> http://www-unix.mcs.anl.gov/mpi/mpich/developer/design/pmiv2.htm
>
> Now, there is something wrong with reinventing the wheel if there are
> no improvements. And so far I'm unable to notice any major
> improvement either compared with PMI or with what we have today
> (except maybe being able to use PMI inside Open MPI).

I agree with the first sentence above.  I think this goes along
the line of Ralph's comment of "what are we trying to solve here?"
When this all started about 6 months ago I think the main concern
was finding what interfaces existed between ORTE and OMPI.  Though
I am not sure how that blossomed into redesigning the interface.
Not saying there isn't a reason to, just that we should step back
and make sure we know why we are.

> Again, my main concern is about fault tolerance. There is nothing in
> PMI (and nothing in RSL so far) that allows any kind of fault
> tolerance [And believe me re-writing the MPICH mpirun to allow
> checkpoint/restart is a hassle]. Moreover, your approach seems to
> open the possibility of having heterogeneous RTE (in terms of
> features) which in my view is definitively the wrong approach.

I am curious about this last paragraph.  Is it your belief that the current
ORTE does lend itself to being extended to incorporate fault tolerance?

Also, by heterogeneous RTE do you mean an RTE running on a cluster
of a heterogeneous set of platforms?  If so, I would like to understand why
you think that is the "wrong" approach.


--td



Re: [OMPI devel] [devel-core] [RFC] Runtime Services Layer

2007-08-24 Thread George Bosilca
Looks like I'm the only one barely excited about this idea. The
system that you described is well known. It has been around for around
10 years, and it's called PMI. The interface you have in the tmp
branch as well as the description you gave in your email are more
than similar to what they sketch in the following two documents:

http://www-unix.mcs.anl.gov/mpi/mpich/developer/design/pmiv2draft.htm
http://www-unix.mcs.anl.gov/mpi/mpich/developer/design/pmiv2.htm

Now, there is something wrong with reinventing the wheel if there are
no improvements. And so far I'm unable to notice any major
improvement either compared with PMI or with what we have today
(except maybe being able to use PMI inside Open MPI).

Again, my main concern is about fault tolerance. There is nothing in
PMI (and nothing in RSL so far) that allows any kind of fault
tolerance [And believe me re-writing the MPICH mpirun to allow
checkpoint/restart is a hassle]. Moreover, your approach seems to
open the possibility of having heterogeneous RTE (in terms of
features) which in my view is definitively the wrong approach.


  george.

On Aug 16, 2007, at 9:47 PM, Tim Prins wrote:


> WHAT: Solicitation of feedback on the possibility of adding a runtime
> services layer to Open MPI to abstract out the runtime.
>
> WHY: To solidify the interface between OMPI and the runtime environment,
> and to allow the use of different runtime systems, including different
> versions of ORTE.
>
> WHERE: Addition of a new framework to OMPI, and changes to many of the
> files in OMPI to funnel all runtime requests through this framework. Few
> changes should be required in OPAL and ORTE.
>
> WHEN: Development has started in tmp/rsl, but is still in its infancy. We
> hope to have a working system in the next month.
>
> TIMEOUT: 8/29/07
>
> --
> Short version:
>
> I am working on creating an interface between OMPI and the runtime system.
> This would make an RSL framework in OMPI which all runtime services would be
> accessed from. Attached is a graphic depicting this.
>
> This change would be invasive to the OMPI layer. Few (if any) changes
> will be required of the ORTE and OPAL layers.
>
> At this point I am soliciting feedback as to whether people are
> supportive or not of this change both in general and for v1.3.
>
> Long version:
>
> The current model used in Open MPI assumes that one runtime system is
> the best for all environments. However, in many environments it may be
> beneficial to have specialized runtime systems. With our current system this
> is not easy to do.
>
> With this in mind, the idea of creating a 'runtime services layer' was
> hatched. This would take the form of a framework within OMPI, through which
> all runtime functionality would be accessed. This would allow new or
> different runtime systems to be used with Open MPI. Additionally, with such a
> system it would be possible to have multiple versions of open rte coexisting,
> which may facilitate development and testing. Finally, this would solidify the
> interface between OMPI and the runtime system, as well as provide
> documentation of each interface function and its side effects.
>
> However, such a change would be fairly invasive to the OMPI layer, and
> needs buy-in from everyone for it to be possible.
>
> Here is a summary of the changes required for the RSL (at least how it is
> currently envisioned):
>
> 1. Add a framework to ompi for the rsl, and a component to support orte.
> 2. Change ompi so that it uses the new interface. This involves:
>  a. Moving runtime specific code into the orte rsl component.
>  b. Changing the process names in ompi to an opaque object.
>  c. Changing all references to orte in ompi to be to the rsl.
> 3. Change the configuration code so that open-rte is only linked where
> needed.
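
The list of changes above (a framework with an orte component, and process names that are opaque to the MPI layer) might look something like the following in C. This is a sketch under stated assumptions, not the interface in tmp/rsl: `rsl_module_t`, `rsl_process_name_t`, the `rsl_orte_*` functions, and `ompi_same_process` are all hypothetical names, and the two fields inside the name are illustrative only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Opaque to the MPI layer: only an RSL component knows the layout.
 * (Hypothetical type; not the actual tmp/rsl definition.) */
typedef struct rsl_process_name rsl_process_name_t;

/* Hypothetical RSL module: the only way the MPI layer reaches the runtime. */
typedef struct {
    const char *name;
    rsl_process_name_t *(*name_create)(uint32_t job, uint32_t rank);
    int (*name_compare)(const rsl_process_name_t *a,
                        const rsl_process_name_t *b);
    void (*name_free)(rsl_process_name_t *n);
} rsl_module_t;

/* ---- hypothetical "orte" component; in a real tree this part would live
 * in the component's own source file, hidden from the MPI layer ---- */
struct rsl_process_name {
    uint32_t jobid;   /* illustrative fields only */
    uint32_t vpid;
};

static rsl_process_name_t *rsl_orte_name_create(uint32_t job, uint32_t rank)
{
    rsl_process_name_t *n = malloc(sizeof(*n));
    if (n != NULL) {
        n->jobid = job;
        n->vpid = rank;
    }
    return n;
}

/* Orders names by job first, then by rank within the job. */
static int rsl_orte_name_compare(const rsl_process_name_t *a,
                                 const rsl_process_name_t *b)
{
    assert(a != NULL && b != NULL);
    if (a->jobid != b->jobid) {
        return a->jobid < b->jobid ? -1 : 1;
    }
    if (a->vpid != b->vpid) {
        return a->vpid < b->vpid ? -1 : 1;
    }
    return 0;
}

static void rsl_orte_name_free(rsl_process_name_t *n) { free(n); }

const rsl_module_t rsl_orte_module = {
    "orte", rsl_orte_name_create, rsl_orte_name_compare, rsl_orte_name_free
};

/* ---- MPI-layer code: never touches the fields, only the module table ---- */
int ompi_same_process(const rsl_module_t *rsl,
                      const rsl_process_name_t *a,
                      const rsl_process_name_t *b)
{
    return rsl->name_compare(a, b) == 0;
}
```

Because the MPI-layer function only ever calls through the table, a different runtime could supply its own `rsl_module_t` with a completely different name layout and `ompi_same_process` would not need to change.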


Of course, all this would happen on a tmp branch.

The design of the rsl is not solidified. I have been playing in a tmp
branch (located at https://svn.open-mpi.org/svn/ompi/tmp/rsl) which
everyone is welcome to look at and comment on, but be advised that things
here are subject to change (I don't think it even compiles right now).
There are some fairly large open questions on this, including:

1. How to handle mpirun (that is, when a user types 'mpirun', do they
always get ORTE, or do they sometimes get a system specific runtime).
Most likely mpirun will always use ORTE, and alternative launching
programs would be used for other runtimes.
2. Whether there will be any performance implications. My guess is not,
but am not quite sure of this yet.

Again, I am interested in people's comments on whether they think adding
such abstraction is good or not, and whether it is reasonable to do such
a thing for v1.3.

Thanks,

Tim Prins

<Diagram.pdf>
___

devel-core mailing list
devel-c...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel-core




Re: [OMPI devel] [devel-core] [RFC] Runtime Services Layer

2007-08-19 Thread Tim Prins
On Friday 17 August 2007 10:53:41 am Richard Graham wrote:
> Tim,
>   This looks like a good idea, and is a good step toward componentizing the
> run-time services that are available from the MPI's perspective.
>   A few comments:
>  - It is a good idea to play around in a sandbox to see what may or may not
> work - otherwise we are just guessing.  However, this is driven by the
> current code structure, and may or may not align with longer term plans.
> What is needed here, I believe, is a deliberate design - i.e. figure out
> where we want to go, and then see if anything in the implementation needs
> to change before it is moved over to the trunk.
This is a good point. I have tried to make the design reflect exactly what we
need from a generic runtime system, and not just copy how we currently use
orte. However, there are some things currently in the RSL which are somewhat
orte specific (i.e. multiple init 'stages'), but these should not interfere
with other runtimes being used (since the runtimes can just use no-ops for
unneeded stages).
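
To illustrate the no-op idea, here is a minimal C sketch, assuming a hypothetical interface with two init stages. None of these names (rsl_init_module_t, RSL_STAGE_1, rsl_run_init, and so on) come from the tmp/rsl branch, and the actual number and meaning of the stages may well differ.

```c
#include <assert.h>
#include <stddef.h>

enum { RSL_SUCCESS = 0 };

/* Hypothetical init stages; the real RSL may define different ones. */
typedef enum {
    RSL_STAGE_1 = 0,
    RSL_STAGE_2,
    RSL_STAGE_COUNT
} rsl_init_stage_t;

/* Hypothetical module: one init function per stage. */
typedef struct {
    const char *name;
    int (*init_stage[RSL_STAGE_COUNT])(void);
} rsl_init_module_t;

/* Shared no-op for runtimes that have nothing to do in a given stage. */
static int rsl_noop(void) { return RSL_SUCCESS; }

/* A runtime that needs both stages (in the spirit of orte). */
static int staged_calls = 0;
static int staged_stage1(void) { staged_calls++; return RSL_SUCCESS; }
static int staged_stage2(void) { staged_calls++; return RSL_SUCCESS; }

static const rsl_init_module_t staged_runtime = {
    "staged", { staged_stage1, staged_stage2 }
};

/* A simpler runtime that does everything up front and skips stage 2. */
static int simple_calls = 0;
static int simple_init(void) { simple_calls++; return RSL_SUCCESS; }

static const rsl_init_module_t simple_runtime = {
    "simple", { simple_init, rsl_noop }
};

/* The caller walks every stage in order and stops at the first failure.
 * It does not need to know which stages are no-ops. */
int rsl_run_init(const rsl_init_module_t *m)
{
    assert(m != NULL);
    for (int s = RSL_STAGE_1; s < RSL_STAGE_COUNT; s++) {
        int rc = m->init_stage[s]();
        if (rc != RSL_SUCCESS) {
            return rc;
        }
    }
    return RSL_SUCCESS;
}
```

The point of the sketch is that the calling code is identical for both runtimes. The extra stage costs the simpler runtime one indirect call that returns immediately.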

>  - We are where we are, and can't just throw it away (it could even be
> exactly what we want), so even if our "ideal" is different than our current
> state, I believe incremental change is the way to go, to preserve an
> operating code base.
Agreed. I think the implementation of the rsl can be done with minimal risk of 
breaking things, since it requires little (if any) change in opal and orte, 
and mostly minor changes in ompi.

>   - I think it is way too early to talk about moving things over to the
> trunk in the next month or so, unless sufficient evaluation can be
> done in a month or so.  This is not to discourage you at all, but just to
> caution against moving too fast, and then having to redo things.
>   I am very supportive of this; I do believe this is the right way to go,
> unless someone else can come up with a better idea, and time to implement.

Thanks for the comments,

Tim

>
> Thanks,
> Rich
>
> On 8/16/07 9:47 PM, "Tim Prins"  wrote:
> > WHAT: Solicitation of feedback on the possibility of adding a runtime
> > services layer to Open MPI to abstract out the runtime.
> >
> > WHY: To solidify the interface between OMPI and the runtime environment,
> > and to allow the use of different runtime systems, including different
> > versions of ORTE.
> >
> > WHERE: Addition of a new framework to OMPI, and changes to many of the
> > files in OMPI to funnel all runtime requests through this framework. Few
> > changes should be required in OPAL and ORTE.
> >
> > WHEN: Development has started in tmp/rsl, but is still in its infancy. We
> > hope to have a working system in the next month.
> >
> > TIMEOUT: 8/29/07
> >
> > --
> > Short version:
> >
> > I am working on creating an interface between OMPI and the runtime
> > system. This would make an RSL framework in OMPI which all runtime
> > services would be accessed from. Attached is a graphic depicting this.
> >
> > This change would be invasive to the OMPI layer. Few (if any) changes
> > will be required of the ORTE and OPAL layers.
> >
> > At this point I am soliciting feedback as to whether people are
> > supportive or not of this change both in general and for v1.3.
> >
> >
> > Long version:
> >
> > The current model used in Open MPI assumes that one runtime system is
> > the best for all environments. However, in many environments it may be
> > beneficial to have specialized runtime systems. With our current system
> > this is not easy to do.
> >
> > With this in mind, the idea of creating a 'runtime services layer' was
> > hatched. This would take the form of a framework within OMPI, through
> > which all runtime functionality would be accessed. This would allow new
> > or different runtime systems to be used with Open MPI. Additionally, with
> > such a system it would be possible to have multiple versions of open rte
> > coexisting, which may facilitate development and testing. Finally, this
> > would solidify the interface between OMPI and the runtime system, as well
> > as provide documentation and side effects of each interface function.
> >
> > However, such a change would be fairly invasive to the OMPI layer, and
> > needs a buy-in from everyone for it to be possible.
> >
> > Here is a summary of the changes required for the RSL (at least how it is
> > currently envisioned):
> >
> > 1. Add a framework to ompi for the rsl, and a component to support orte.
> > 2. Change ompi so that it uses the new interface. This involves:
> >  a. Moving runtime specific code into the orte rsl component.
> >  b. Changing the process names in ompi to an opaque object.
> >  c. change all references to orte in ompi to be to the rsl.
> > 3. Change the configuration code so that open-rte is only linked where
> > needed.
> >
> > Of course, all this would happen on a tmp branch.
> >
> > The design of the rsl is not solidified. I have been playing in a tmp
> > branch 

Re: [OMPI devel] [devel-core] [RFC] Runtime Services Layer

2007-08-19 Thread Tim Prins
On Friday 17 August 2007 08:40:01 am Jeff Squyres wrote:
> I am definitely interested to see what the RSL turns out to be; I
> think it has many potential benefits.  There are also some obvious
> issues to be worked out (e.g., mpirun and friends).
Yeah, thinking through this and talking to others, it seems like the best way 
to deal with this is to say that mpirun points to our default runtime (orte), 
and that to use any other rsl component, you have to use that system's 
specific launcher (could be a 'srun', or a 'mpirun-foobar', whatever the 
system wants to do).

>
> As for whether this should go in v1.3, I don't know if it's possible
> to say yet -- it will depend on when RSL becomes [at least close to]
> ready, what the exact schedule for v1.3 is (which we've been skittish
> to define, since we're going for a feature-driven release), etc.

I agree that it is impossible to say right now, but I wanted to throw it out
there for people to consider/think about.

Tim

>
> On Aug 16, 2007, at 9:47 PM, Tim Prins wrote:
> > WHAT: Solicitation of feedback on the possibility of adding a runtime
> > services layer to Open MPI to abstract out the runtime.
> >
> > WHY: To solidify the interface between OMPI and the runtime environment,
> > and to allow the use of different runtime systems, including different
> > versions of ORTE.
> >
> > WHERE: Addition of a new framework to OMPI, and changes to many of the
> > files in OMPI to funnel all runtime requests through this framework. Few
> > changes should be required in OPAL and ORTE.
> >
> > WHEN: Development has started in tmp/rsl, but is still in its infancy. We
> > hope to have a working system in the next month.
> >
> > TIMEOUT: 8/29/07
> >
> > --
> > Short version:
> >
> > I am working on creating an interface between OMPI and the runtime
> > system. This would make an RSL framework in OMPI which all runtime
> > services would be accessed from. Attached is a graphic depicting this.
> >
> > This change would be invasive to the OMPI layer. Few (if any) changes
> > will be required of the ORTE and OPAL layers.
> >
> > At this point I am soliciting feedback as to whether people are
> > supportive or not of this change both in general and for v1.3.
> >
> >
> > Long version:
> >
> > The current model used in Open MPI assumes that one runtime system is
> > the best for all environments. However, in many environments it may be
> > beneficial to have specialized runtime systems. With our current system
> > this is not easy to do.
> >
> > With this in mind, the idea of creating a 'runtime services layer' was
> > hatched. This would take the form of a framework within OMPI, through
> > which all runtime functionality would be accessed. This would allow new
> > or different runtime systems to be used with Open MPI. Additionally, with
> > such a system it would be possible to have multiple versions of open rte
> > coexisting, which may facilitate development and testing. Finally, this
> > would solidify the interface between OMPI and the runtime system, as well
> > as provide documentation and side effects of each interface function.
> >
> > However, such a change would be fairly invasive to the OMPI layer, and
> > needs a buy-in from everyone for it to be possible.
> >
> > Here is a summary of the changes required for the RSL (at least how it is
> > currently envisioned):
> >
> > 1. Add a framework to ompi for the rsl, and a component to support orte.
> > 2. Change ompi so that it uses the new interface. This involves:
> >  a. Moving runtime specific code into the orte rsl component.
> >  b. Changing the process names in ompi to an opaque object.
> >  c. change all references to orte in ompi to be to the rsl.
> > 3. Change the configuration code so that open-rte is only linked where
> > needed.
> >
> > Of course, all this would happen on a tmp branch.
> >
> > The design of the rsl is not solidified. I have been playing in a tmp
> > branch (located at https://svn.open-mpi.org/svn/ompi/tmp/rsl) which
> > everyone is welcome to look at and comment on, but be advised that things
> > here are subject to change (I don't think it even compiles right now).
> > There are some fairly large open questions on this, including:
> >
> > 1. How to handle mpirun (that is, when a user types 'mpirun', do they
> > always get ORTE, or do they sometimes get a system specific runtime).
> > Most likely mpirun will always use ORTE, and alternative launching
> > programs would be used for other runtimes.
> > 2. Whether there will be any performance implications. My guess is not,
> > but am not quite sure of this yet.
> >
> > Again, I am interested in people's comments on whether they think adding
> > such abstraction is good or not, and whether it is reasonable to do such
> > a thing for v1.3.
> >
> > Thanks,
> >
> > Tim Prins
> > 
> > ___
> >