On 4/24/07, Simon Laws <[EMAIL PROTECTED]> wrote:
Following on from the release content thread [1] I'd like to kick off a
discussion on how we resurrect support for a distributed runtime. We had
this feature before the core modularization and I think it would be good to
bring it back again. For me this is about working out how the Tuscany
runtime can be successfully hosted in a distributed environment without
having to recreate what is done very well by existing distributed computing
technologies.
The assembly model specification [3] deals briefly with this in its
discussion of SCA Domains:
"An SCA Domain represents a complete runtime configuration, potentially
distributed over a series of interconnected runtime nodes."
Here is my quick sketch of the main structures described by the SCA Domain
section of the assembly spec.
SCA Domain
  Logical Services - service to control the domain
  InstalledContribution (1 or more)
    Base URI
    Installed artifact URIs
    Contribution - SCA and non-SCA artefacts contributed to the runtime
      /META-INF/sca-contribution.xml
        deployable (composites)
        import
        export
    Dependent Contributions - references to installed contributions on
      which this one depends
    Deployment Time Composites - composites added into an installed
      contribution
  Virtual Domain Level Composite - the running set of SCA artefacts. Top
    level services are visible outside the domain
    Component, Service, Reference
      Derived from notionally included installed composites
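To make the outline above a little more concrete, here is a minimal Java sketch of the same structures as plain data types. All type, field and file names are hypothetical and chosen for illustration only; they are not Tuscany classes. It also assumes that the composites eligible for inclusion in the domain-level composite are the declared deployables plus any deployment-time composites.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical data model for the SCA Domain outline; these are not Tuscany classes. */
final class DomainModelSketch {

    /** What sca-contribution.xml declares for one contribution. */
    record ContributionMetadata(List<String> deployables, List<String> imports, List<String> exports) {}

    /** A contribution after installation into the domain. */
    record InstalledContribution(
            URI baseUri,
            List<URI> artifactUris,
            ContributionMetadata metadata,
            List<URI> dependentContributions,
            List<String> deploymentTimeComposites) {}

    /** The running set of SCA artefacts, reduced here to component names. */
    record VirtualDomainComposite(List<String> components) {}

    record ScaDomain(List<InstalledContribution> contributions, VirtualDomainComposite domainComposite) {

        /** Assumption: declared deployables plus deployment-time composites are the candidates. */
        List<String> candidateComposites() {
            List<String> candidates = new ArrayList<>();
            for (InstalledContribution contribution : contributions) {
                candidates.addAll(contribution.metadata().deployables());
                candidates.addAll(contribution.deploymentTimeComposites());
            }
            return candidates;
        }
    }

    /** Builds a one-contribution domain with made-up URIs and composite names. */
    static ScaDomain example() {
        ContributionMetadata metadata =
                new ContributionMetadata(List.of("bigbank.composite"), List.of(), List.of());
        InstalledContribution contribution = new InstalledContribution(
                URI.create("file:/shared/domain/bigbank/"),
                List.of(URI.create("file:/shared/domain/bigbank/bigbank.composite")),
                metadata,
                List.of(),
                List.of("extra.composite"));
        return new ScaDomain(List.of(contribution), new VirtualDomainComposite(List.of()));
    }

    public static void main(String[] args) {
        System.out.println(example().candidateComposites());
    }
}
```

Keeping the installed contributions separate from the virtual domain composite mirrors the spec's split between what is installed and what is actually running.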
The assembly spec says that "A goal of SCA's approach to deployment is
that the contents of a contribution should not need to be modified in order
to install and use the contents of the contribution in a domain." This
seems sensible in that we don't want to have to rewrite composite files
every time we alter the physical set of nodes on which they are to run.
Typically in a distributed environment there is some automation of the
assignment of applications to nodes to cater for resilience, load balancing,
throughput etc.
The assembly spec is not prescriptive about how an SCA Domain should be
mapped and supported across multiple runtime nodes but I guess the starting
point is to consider the set of components a system has, i.e. the set of
(top level?) components that populate the Virtual Domain Composite and
consider them as likely candidates for distributing across runtimes.
So I would expect a manager of a distributed SCA runtime to go through a
number of stages in getting the system up and running.
Define an SCA Domain (looking at the mailing list, Luciano is thinking
along these lines too)
- domain definition
  - as simple as a file structure (based on the hierarchy from the
    assembly spec) in a shared file system.
  - could implement a more complex registry-based system
- allocate nodes to the domain
  - as simple as running up an SCA runtime on each node in the domain.
  - for more complex scenarios we might want to use a
    scheduling/management system
Add contributions to the domain
- Identify contributions required to form running service network
(developers will build/define contributions)
- Contribute these contributions to the domain
  - in the simple shared file system scenario I would imagine they just
    end up on this file system, available to all nodes in the domain.
Add contributions to the Virtual Domain Level Composite
- At this point I think we have to know where artifacts are physically
  going to run
  - It could be that all runtimes load all contributions and only expose
    those destined for them, i.e. each node has the full model loaded but
    knows which bits it's running.
  - Alternatively we could target each node specifically and ask it to
    load a particular installed contribution and define
    a distributed model.
Manage the Domain
- Need to be able to target the logical service provided by the domain
at the appropriate runtime node
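As a rough illustration of the first option above (every node loads the full model but only runs its own part), here is a minimal Java sketch. The component names, node names and the assignment map are all hypothetical; how the assignment is produced (by hand, or by a scheduling/management system) is left open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: each node sees every component but starts only those assigned to it. */
final class NodeAssignmentSketch {

    /**
     * Returns the components that the given node should start.
     * The assignment map (component name -> node name) is an assumed input.
     */
    static List<String> componentsToStart(
            String nodeName, List<String> domainComponents, Map<String, String> assignment) {
        List<String> local = new ArrayList<>();
        for (String component : domainComponents) {
            // Components without an assignment are not started anywhere.
            if (nodeName.equals(assignment.get(component))) {
                local.add(component);
            }
        }
        return local;
    }

    public static void main(String[] args) {
        List<String> components = List.of("ComponentA", "ComponentB");
        Map<String, String> assignment = Map.of("ComponentA", "nodeA", "ComponentB", "nodeB");
        for (String node : List.of("nodeA", "nodeB")) {
            System.out.println(node + " starts " + componentsToStart(node, components, assignment));
        }
    }
}
```

Because the assignment lives outside the composite files, moving a component to another node only changes the map, which fits the goal of not modifying contributions when the set of nodes changes.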
In order to make this work the sca default binding has to be able to work
remotely across distributed runtimes so we need to select an existing
binding, or create a new one, to do this.
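One way to picture what the default binding would have to decide is sketched below in Java: a wire to a component on the same node stays local, while a wire to a component on another node needs a remote endpoint. The types, the node URIs and the "node base URI plus component name" endpoint scheme are assumptions made for this sketch only; which real binding would carry the remote call is exactly the open question.

```java
import java.net.URI;
import java.util.Map;

/** Hypothetical sketch of how a default binding might choose between local and remote wiring. */
final class DefaultBindingSketch {

    enum WireKind { LOCAL, REMOTE }

    /** For LOCAL wires the endpoint is null, since no network address is needed. */
    record Wire(WireKind kind, URI endpoint) {}

    /**
     * Resolves a reference target. nodeOf maps component name -> node name and
     * nodeUris maps node name -> base URI; both are assumed inputs.
     */
    static Wire resolve(
            String thisNode, String targetComponent,
            Map<String, String> nodeOf, Map<String, URI> nodeUris) {
        String targetNode = nodeOf.get(targetComponent);
        if (targetNode == null) {
            throw new IllegalArgumentException("Unknown component: " + targetComponent);
        }
        if (targetNode.equals(thisNode)) {
            return new Wire(WireKind.LOCAL, null);
        }
        // Assumed endpoint scheme: the node's base URI followed by the component name.
        return new Wire(WireKind.REMOTE, nodeUris.get(targetNode).resolve(targetComponent));
    }

    public static void main(String[] args) {
        Map<String, String> nodeOf = Map.of("ComponentA", "nodeA", "ComponentB", "nodeB");
        Map<String, URI> nodeUris = Map.of(
                "nodeA", URI.create("http://node-a.example:8080/"),
                "nodeB", URI.create("http://node-b.example:8080/"));
        System.out.println(resolve("nodeA", "ComponentA", nodeOf, nodeUris));
        System.out.println(resolve("nodeA", "ComponentB", nodeOf, nodeUris));
    }
}
```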
I think in the first instance we should set the bar fairly low, i.e. have
the target be running a sample application across two SCA runtimes
supporting Java component implementations. This pretty much picks up where
we were with the distribution support before the core modularization effort.
I'm not sure what the target scenario should be but we could take one of
the samples we already have, e.g. SimpleBigBank, which happens to have two
simple Java components in its implementation, but we could go with any of
them.
Thoughts?
Simon
[1] http://www.mail-archive.com/tuscany-dev%40ws.apache.org/msg16898.html
[2] http://www.mail-archive.com/tuscany-dev%40ws.apache.org/msg16831.html
[3] http://www.osoa.org/display/Main/Service+Component+Architecture+Specifications
Hi
I've transferred this information over to the project wiki (
http://cwiki.apache.org/confluence/display/TUSCANY/Distributed+Runtime) and
I've added a few more thoughts about how we can start off with quite a
simplistic approach. I'm keen to get some input on this so feel free to
comment here and add notes to the wiki. I hope that we can start to cut some
code asap (well at least once the release activity has died down this week)
to help us understand the problem, so if any of this takes your fancy feel
free to roll up your sleeves etc.
Regards
Simon