I have a lot of comments inline but want to summarize overall by
saying I think we should address all of your concerns by
incrementally improving core2. As you said below, you are not arguing
for a rewrite, and I think incremental improvement would be the best
way to accommodate the wide variety of things the community is
interested in working on (not everyone wants to work on the "baby
steps", as valuable as they are).
Having worked with you for over a year, I'm absolutely sure you can
make significant contributions to improving what we have in core2.
How about it?
Jim
On Jul 7, 2006, at 10:17 AM, Jean-Sebastien Delfino wrote:
More comments inline.
Jim Marino wrote:
Comments inline
On Jul 6, 2006, at 6:17 PM, Jean-Sebastien Delfino wrote:
Jeremy,
I won't comment on your attacks at the bottom of this email. I
was hoping for a more constructive technical discussion. I added
my answers and comments on the specific technical issues inline.
Jeremy Boynes wrote:
On Jul 5, 2006, at 12:43 PM, Jean-Sebastien Delfino wrote:
My proposal is not to merge M1 and the core2 sandbox. I am
proposing to start a fresh new code stream and build the
runtime through baby steps. We may be able to reuse some pieces
of existing code, but what is more important is to engage our
community in this exercise and integrate the new ideas that will
emerge from it.
I don't believe the two issues are necessarily coupled. Quite a
few members of the community are engaged on the sandbox code
already, and we could work with you to improve that rather than
having to throw everything out and start over with all new ideas.
Here's an example where I'm struggling with both M1 and the
core2 sandbox and thinking that we can do better if we start
with a new fresh stream: our (recursive) assembly metadata model.
- M1 does not implement the recursive composition model and
would require significant changes to support it. Core2 is an
attempt to implement it, but I'm not sure it's quite right, and I
also think it can be simplified.
It would really help if you could come up with concrete areas
where it is not right or where it could be simplified - for
example, end user scenarios that are not supported.
- M1 used Lists to represent relationships; Core2 uses Maps. I
think M1 was better since it allowed us to keep the ordering of
the relationships.
There's nothing I remember in the assembly spec where order
matters. On the other hand there are many areas where things are
keyed by a name which has to be unique. This seems like a
natural mapping (sorry) to a Map. In M1 I started to move toward
simple map structures but you replaced it with what seemed like
a fairly complicated specialized List implementation that sent
notifications that updated a Map anyway. Given the desire for
simplification, are there any end-user scenarios that require
ordering to be preserved and that can't be supported with a
basic HashMap or LinkedHashMap?
As an administrator I'll want my administration tool to display
components in the order I declared them in the SCDL.
SCDL isn't the only form assembly can be serialized to/from. Also,
if I were an admin, I'd probably want to sort the components
according to some useful criteria, not how they are listed in the
SCDL, as most admins will never look at XML. One could always use
LinkedHashMap though.
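For example, a minimal sketch (hypothetical names; "Component" just
stands in for whatever the real model class is):

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Minimal sketch: keyed access plus preserved declaration order.
    class OrderedAssemblySketch {
        static class Component {}

        void load() {
            Map<String, Component> components =
                new LinkedHashMap<String, Component>();
            components.put("catalog", new Component()); // first in the SCDL
            components.put("store", new Component());   // second in the SCDL
            for (Component c : components.values()) {
                // iterates in insertion order: catalog, then store
            }
        }
    }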
Maybe SCDL isn't the only form, but that is beside the point: we
need to support SCDL, don't we? As soon as you put assembly elements
in a document that a user/developer can edit, the order is relevant.
I disagree with your statement about administrators. They often
look at and work with XML configuration files. If you want to
support other sort criteria in addition that's fine, but admin,
config and editing tools need to at least support the order from
the XML document.
Actually that is generally not the case regarding XML in a
datacenter. In my experience, most admins in datacenters change
things through admin consoles or scripts so that there is an audit
history (in U.S. financial services institutions it is obligatory).
Of course admins in less bureaucratic environments may tweak XML
configuration files (I should have worded my previous response
better), but my point was very few admins crack deployment archives
and mess with application artifacts. That is extremely bad practice
and something we should never encourage. This was a fundamental
design principle we had in the spec group (i.e. deployment units
should not be cracked) and hence we included an override mechanism in
the Assembly Specification. For dynamic wiring, assemblers (a
different role than admins) would use some kind of tool; having them
edit an XML file would not be the best approach to this problem.
So, I just don't see the value in this use-case, although it would be
trivial for us to implement, even though it promotes an anti-pattern
and unnecessarily complicates what we currently have.
I'll also want a configuration or admin tool that loads/saves
modified SCDL to write things in the order they were in
initially, not in a random order. As an application developer I'd
like an SCA debugging tool to show me my components in a
list in the right order as well. Also, if I want to implement the
model defined by the XML schemas in the spec using any of the
DataBinding technologies out there, I'll end up with Lists, not
Maps.
We have been using StAX just fine for this and it accommodates a
number of databinding solutions for extensions. Are you proposing
we revisit the decision, made back before the M1 release, to go
with StAX loading? If so, for what reasons? BTW, not all
databinding solutions will have problems - XStream will work just
fine with what we have. Also, are you sure about XMLBeans and JAXB
or are you just speaking about a current implementation of SDO?
Not quite correct: the decision we made back before M1 was to go
with StAX loading, write the loaders by hand for now, and see how
the SDO team could generate this code after M1. Independent of
that, I don't want to tie us to any specific data binding, so we
had better pick representations for model relationships that are
commonly used by most databindings to represent XSD <element...
maxOccurs="unbounded"/>, i.e. Lists, not Maps.
JAXB, Castor, XStream, and JiBX (through a shipped extension)
support Maps, and probably a number of other databinding solutions do
as well. But let's set that aside since there is a more important
issue...
The decision as I recall was twofold. First, go with a runtime model
that was natural for Java developers and supported Java idioms, not
the constraints of a particular databinding solution. In my book, if
a databinding solution cannot accommodate the requirements of the
runtime, it is not the right tool for the job, in this case, loading
of core configuration data (not extension configuration). Jeremy
chose to implement this requirement with StAX. It worked, was simple,
and provided the ability to have extension developers use their
databinding framework of choice to load required configuration
information (which may involve evaluating artifacts other than XML).
This last feature was important, as the runtime must be able to
handle multiple databinding technologies simultaneously.
The second part of the decision was to decouple the runtime from SDO,
not because people don't like SDO, but because doing so promotes
choice, modularity, and simplicity. This is entirely consistent with
SCA, which does not mandate SDO and which I imagine will be used with
a variety of databinding technologies (e.g. JAXB). It also allows
people to come and work on (or extend) the runtime without having to
learn SDO (or any other particular databinding technology).
Also, just to beat a dead horse further (how long have we been having
this debate ;-) ), to me, and probably a lot of other Java
developers, StAX is a pervasive and simple way of dealing with XML -
it's in the javax namespace, pervasive in open source, and will be in
the JDK. Given that we can use SDO, JAXB, etc. to handle extensions,
what's the problem with using what we have? What benefit do we gain
by constraining the runtime model's use of very common (and in my
opinion effective) Java idioms?
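To make that concrete, here is roughly what a hand-written StAX loader
looks like (a simplified sketch only, not the actual loader SPI; the
Component holder and the dispatch comment are stand-ins):

    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamException;
    import javax.xml.stream.XMLStreamReader;

    // Sketch of a hand-written StAX loader for a <component> element.
    // Assumes the reader is positioned on the START_ELEMENT.
    public class ComponentLoaderSketch {
        static class Component {
            String name;
        }

        public Component load(XMLStreamReader reader) throws XMLStreamException {
            Component component = new Component();
            component.name = reader.getAttributeValue(null, "name");
            // A real loader would dispatch each child element
            // (implementation, reference, ...) to the loader
            // registered for its QName; here we just skip them.
            while (reader.next() != XMLStreamConstants.END_ELEMENT) {
                // skip or delegate child content here
            }
            return component;
        }
    }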
Finally, even if we decided to use Maps in some cases to provide
keyed access to some elements of the model, we'd have to do it
differently. For example, a single Map containing all components,
references and services in a composite (according to the spec
they cannot have the same names) instead of the three Maps you
have in Core2.
And this is why LinkedHashMap will not help you here.
Again, this is trivial to implement and LinkedHashMap will do just
fine with many of the databinding solutions available today.
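Something along these lines would do (names illustrative only, not
code from either tree):

    import java.util.LinkedHashMap;
    import java.util.Map;

    // One ordered map for every named part of a composite, since the
    // spec says components, services and references share a namespace.
    // "ModelObject" is an illustrative common supertype.
    class CompositeSketch {
        static class ModelObject {}

        private final Map<String, ModelObject> parts =
            new LinkedHashMap<String, ModelObject>();

        void addPart(String name, ModelObject part) {
            if (parts.containsKey(name)) {
                throw new IllegalArgumentException("duplicate name: " + name);
            }
            parts.put(name, part); // declaration order is preserved
        }
    }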
- Core2 only defines implementation classes for the model. I
think we should have interfaces plus default implementation
classes instead, like we had in M1, to allow for alternate
implementations of the model.
One of the most complex things with the M1 model was all the
interfaces involved, the need to pass factory implementations
around, the number of different factories involved (one per
extension implementation) and the potential issues with code
assuming its implementation of the factory was the one used.
The core2 model uses concrete classes which are really just data
holders - there's no behaviour in them to be abstracted through
the interface. This gives a much simpler programming model for
extensions using the model.
Do you have any scenarios that would require different
implementations of the model? Are they so different that they
might as well just use different classes?
I don't think that having just implementation classes is much
simpler. If you interact with the model SPI, reading interfaces
is simpler IMO and more suitable for inclusion in a specification
document... allowing multiple implementations of these
interfaces. Also, we have to support the whole lifecycle of an SCA
application (development, deploy/install, runtime, admin, etc.)
and I'd like to allow some flexibility for different tools,
running at different times, to use different implementations of
the assembly model interfaces.
Oisin from the STP project said the POJO-based approach would suit
them just fine. I don't see the complexity. On the contrary, all
of the AssemblyFactories we had in M1 led IMO to a massive
antipattern where they were passed throughout the core. I'm happy
to walk through the relevant code if people are interested. All
the factories did was new up a POJO. Not worth the complexity in
my opinion, but I'm happy to compare the work in the sandbox with
your proposal if you'd like to walk us through it.
When the runtime depends on too many factories, that is the
manifestation of bigger coupling problems. The factories for all
the extensions should not be visible at all from the core runtime,
and if we externalize the WSDL and Java interface support and the
Java implementation support out of the core like I'm proposing,
you're not dealing with many factories.
But we will be. Factories will be proliferated through the entire
loader infrastructure unless the loaders new the factories up, and
that seems a bit superfluous given the factories themselves just new
up POJOs. Just new the POJOs up and things are simple. Also,
factories will be proliferated across our testcases, which was also
a cause of needless complexity in M1.
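To spell out the contrast, here are hypothetical minimal versions of
both styles side by side (illustrative names, not the M1 or core2
classes):

    // M1 style: the factory is threaded through every loader and
    // testcase, yet all it ever does is new up a POJO.
    interface AssemblyFactory {
        Component createComponent();
    }

    // core2 style: Component is a plain data holder...
    class Component {}

    class LoaderSketch {
        Component withFactory(AssemblyFactory factory) {
            return factory.createComponent(); // indirection buys nothing
        }

        Component withoutFactory() {
            return new Component(); // ...so just new it up
        }
    }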
I am sure that tooling projects will need to add much to this
model: support for events, change tracking, tracking between XML
elements and model objects to provide proper feedback to
application developers, integration with modeling technologies used
in the tooling world, support for cloning maybe... tons of things.
I spent several years developing tools so I think I know what I'm
talking about here. The first thing I'll ask as a tooling developer
is: please give me interfaces so I can hook what I need into the
implementations.
And the tooling people should go do all of that (if they want to) but
keep it out of the runtime (and vice versa) ;-) ! We are writing code
for a runtime, not a tooling environment. Use change notification,
interfaces, round-tripping support, cloning, whatever when writing a
tool. In other words, tooling people should build the technology that
is right for them and the runtime people likewise. Sharing a discrete
(and relatively small) number of classes really doesn't buy that much
given the divergence in use cases between tooling and runtime. If we
can do it, great, but we should not compromise the runtime or tooling
to do so.
Also, I'm not sure your requirement for interfaces is shared by
everyone on the tooling side. Oisin (Eclipse STP) indicated he would
be fine with the POJO approach.
I'm happy to walk people through the interfaces or answer any
questions on the list,
Great, how about doing a diff between core2 and your proposed
approach and how core2 could be improved to accommodate your issues?
- Overuse of Java Generics breaks flexibility in some cases.
For example, Component<I extends Implementation> will force you
to recreate an instance of Component to swap its implementation
for an implementation of a different type (and lose all the
wires going in/out of the component).
There may be cases where generics are overkill but I don't
think that really requires us to throw out the model. There are
other cases where the use of wildcards would be appropriate; for
example, in the scenario you give here you could just create a
Component<Implementation> to allow different types of
implementation to be used.
Then instead of

    Component<Implementation> {
        Implementation getImplementation();
    }

I think we can just do

    Component {
        Implementation getImplementation();
    }
What we have now in core2 is overkill IMO.
Then do we need to cast to the right impl type?
The core runtime should not have to cast, simply because it should
not depend on any component implementation type (not even the Java
or System implementation types).
A loader will need to cast the above.
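Something like this (hypothetical names, just to show where the cast
would land with the plain, non-generic Component):

    class Implementation {}

    class JavaImplementation extends Implementation {
        Class<?> implementationClass;
    }

    // The proposed non-generic Component data holder.
    class Component {
        Implementation implementation;
        Implementation getImplementation() { return implementation; }
    }

    class JavaLoaderSketch {
        void configure(Component component) {
            // Only the Java-specific loader narrows the type; the core
            // never needs to know about JavaImplementation at all.
            JavaImplementation impl =
                (JavaImplementation) component.getImplementation();
        }
    }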
- Core2 defines ReferenceDefinitions (without bindings) and
BoundReferenceDefinitions (with bindings). IMO there are
Reference types and Reference instances and both can have
bindings.
or Reference.
I'm with you here - we need to refactor the way bindings are
handled for both Service and Reference. One thing the sandbox
model is missing is the ability to associate multiple bindings
with a single Service/Reference.
My main point is not about supporting multiple bindings on a
Service or Reference. I think this is secondary and the
interfaces I put in my sandbox to support a design discussion
don't even have that either. My point is that Services,
References, and their instantiation by Components are at the
foundation of the SCA assembly model... and therefore need to be
modeled correctly. I'm proposing a different design, illustrated
by the interfaces I checked in.
Could you elaborate?
I think it should be very simple:
- Component types have service and reference types
- Components are instances of component types and have services and
references, which are instances of the service and reference types
- Service and reference types can have bindings
- Bindings can be overridden in service and reference instances
This is clear when you look at a Composite. A composite is a
Component Type, has service and reference types (aka composite
services and references) which can have bindings.
A component can be implemented by a Composite; it has services and
references, which can use the (default) bindings from their
respective service and reference types, or specify (override)
bindings.
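In interface form, the relationship I have in mind looks roughly like
this (illustrative only, simpler than the interfaces I checked in):

    import java.util.List;

    interface Binding {}

    // The type declares default bindings...
    interface ServiceType {
        List<Binding> getBindings();
    }

    // ...the instance can override them.
    interface Service {
        ServiceType getServiceType();
        List<Binding> getBindings(); // empty means "use the type's defaults"
    }

    class BindingSketch {
        static List<Binding> effectiveBindings(Service service) {
            List<Binding> overrides = service.getBindings();
            return overrides.isEmpty()
                ? service.getServiceType().getBindings()
                : overrides;
        }
    }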
I would find it extremely useful if you could perhaps compare this
with what we have in core2 and point to how core2 could be improved
to accommodate some of your concerns in this area.
- I think that Remotable should be on Interface and not Service.
I agree Service is wrong and that it should be on
ServiceContract. Thanks for catching it.
- Scope should be defined in the Java component implementation,
separate from the core model.
Scope is not a Java specific concept.
Interaction scope (stateless vs. stateful) can apply to any
ServiceContract.
Container scope is the contract between an implementation and a
ScopeContainer and applies to any implementation type that can
support stateful interactions. This would include JavaScript,
Groovy, C++, ... I think that means that support for state
management (which is what scope is configuring) belongs in the
core with the configuration metadata supplied by the
implementation type.
I don't think that's quite right. First, interaction scopes are
defined on interfaces and not service contracts. Also, they
control whether an interface is conversational or not,
independent of any state management.
Anyway I was talking about a different scope, the implementation
scope defined in the Java C&I spec, which governs the lifecycle
of Java component implementation instances. I think the
definition and implementation of lifecycle management will vary
greatly depending on the component implementation type, for
example Java component implementations and BPEL component
implementations typically deal with this in a very different way.
Well, I don't think that's the case at all and actually there is a
concept of implementation scope in assembly - it just varies by
implementation type, which is entirely consistent with our design.
BPEL is the odd case, and this came up as we wrote the scope
changes into the spec (Ken did a lot of the work here). Across
many implementation types, e.g. Groovy, JavaScript, Ruby, Jython,
etc. (maybe even C++) I see use for the same scopes as in Java. Do
you disagree?
Also, I'm curious why you think the scope containers complicate
the core and need to be moved out? Or are you saying this based on
your reading of the spec? They seem quite simple to me.
I'm saying that scope management is specific to the implementation
type and therefore needs to be made pluggable, i.e. moved out of
core. The Java scope management is just one example of scope
management.
We have it partly pluggable. There are just a few more things to do.
Would you care to help out on this?
I think it makes sense to keep commonly used scopes in core and have
implementation specific ones as plugins, not necessarily tied to an
implementation (outside of BPEL, most scopes are probably applicable
to a wide variety of types).
Therefore, in my view state/lifecycle management should be left
to the component implementation contributions and not belong to
core.
I think this would lead to over-complication, particularly for the
extension developer. Right now, scope containers can be reused. In
particular, how would conversational services be handled? If I
want to use module or session scope containers for my Groovy
script, would I have to write those rather than just reuse what
the core gives me? Also, by reusing, we allow an additional
type of extension in terms of scope. For example, someone could add a
distributed cache scope and have that shared by Groovy, Java,
whatever.
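And the contract is small, which is exactly why reuse works. Roughly
(a sketch, not the actual SPI):

    // Sketch: one scope-container contract shared across implementation
    // types, so a Groovy or Java component reuses the same MODULE or
    // SESSION container instead of reimplementing instance tracking.
    interface ScopeContainerSketch {
        Object getInstance(Object componentId); // create or return per scope
        void onEvent(Object event);             // e.g. session end: destroy instances
    }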
I'll also note two things. Getting scope containers to work
properly with instance tracking is not trivial. I'd hate to push
that on extension developers. Second, this basic design has been
there since before M1. Why wasn't this brought up before since it
is such a significant issue?
- Java and WSDL interfaces should be defined separately from the
core model; we need to support multiple interface definition
languages through plugins, not in the core.
The model supports generic IDL through the use of
ServiceContract. Java and WSDL are two forms of IDL that are
mandated by the specification. This is really just a question of
where those implementations are packaged and again I don't think
this warrants a rewrite.
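Schematically (illustrative classes, not the sandbox code):

    import javax.xml.namespace.QName;

    // The core only sees the abstraction; Java and WSDL are two
    // pluggable forms of IDL layered on top of it.
    abstract class ServiceContract {}

    class JavaServiceContract extends ServiceContract {
        Class<?> interfaceClass; // a java.lang.Class carries the contract
    }

    class WSDLServiceContract extends ServiceContract {
        QName portTypeName; // resolved against a WSDL definition elsewhere
    }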
Packaging issues are important and often hide bigger
dependency/coupling problems. I think we should package the support
for Java and WSDL interfaces separately from the core to prevent any
coupling between the two, and also give people who will have to
support new interface definition languages a better template to
follow.
Individual issues do not warrant a rewrite. What about the sum of
many issues?
None of the issues warrant a rewrite, not even the sum. Most of
your criticisms seem centered around the model which is fairly
decoupled from the bulk of the core2 runtime. Even if we adopted
your changes wholesale, I'd doubt they would change the core2
runtime significantly. Even the scope containers could be moved
out without breaking anything and very little code changes,
although that would be a mistake IMO. I'm sorry but I fail to see
the need for a rewrite.
I am not proposing a rewrite of the whole runtime (see my original
email; it's not a whole rewrite). I'm proposing a staged, baby-step
approach, integrating the good work from M1 and the sandbox, with new
discussions where I think what we have is not right or where new
ideas from the group come up.
Why not just do this starting from core2? If it's not a rewrite, then
it sounds like incremental improvements. We could start from
scenarios and go through doing this. I think this approach would
accommodate those that wish to focus on end-user scenarios and those
that are focused on more technology scenarios such as conversations.
I believe we need to be inclusive of both, and cannot force one
approach to doing scenarios on the entire community. Also, I think
this is the correct *long-term* way to getting more people involved.
We will always have newcomers and we need to ensure the runtime is
modular enough so they can work their way in as deep as they are
interested. Doing a "baby-step" rebuild just for the people that are
part of the community now doesn't really teach us how to continuously
grow the community.
I'm starting with the model SPI because I think that having the
assembly model right is critical for an assembly runtime. Most of
the ideas here have an impact on the architecture of the runtime,
so I thought this was a good starting point, and also a good base
of discussion to help all in our community discuss and understand
better the new recursive composition model.
And here's where I think the crux of the disagreement lies...and it's
the same debate that started over a year ago and I thought we had
gotten past. The *configuration* model should be decoupled from the
rest of the runtime, not determine its architecture. Also, the
configuration model is only one small part of the SPI/API (I think we
need to begin to make this distinction as suggested by Jeremy).
Similarly, the SCA specifications are not blueprints for a runtime
design; they describe a wiring and programming model for service-
based applications. If multiple runtimes with divergent architectures
cannot implement SCA, then it will have failed as a set of
specifications.
Of course, that is not to say we should not have SCA concepts
reflected in the runtime architecture. Along these lines, one of the
key changes we made in core2 was to do this better with the actual
runtime structures. Namely, we reserved the "Component" naming scheme
for runtime artifacts as opposed to the configuration model (they
used to be called "Context"), as those will be dealt with by
developers working on core and extenders much more than the
configuration model will be.
Also, the model is just an in-memory representation of configuration
data, nothing more and nothing less. One of the key culprits in the
M1 architecture was the fact that we did not have this clean
distinction. We did agree to have it, we just did not evolve the code
enough in that direction, and that was one of the key driving factors
for creating core2.
- Implementation should extend ComponentType IMO instead of
pointing to it, and we may even be able to simplify and just
remove Implementation. Also I am not sure why we need to
distinguish between AtomicImplementation and
CompositeImplementation.
One of the problems the assembly spec has is that it is
difficult to do top-down design because you cannot link a
component to a componentType without having an implementation. I
agree this is an area that we (and the spec) need to sort out.
IMO a component is associated with one componentType but may
have multiple implementations so I don't think they are quite
the same thing or that either can be removed.
AtomicImplementation is a marker for implementations that cannot
have children.
In my view a component has a type. The ComponentType is either
abstract (just defining the types of services offered, references
used, and properties that can be configured), or concrete. A POJO
component implementation is a concrete ComponentType.
Perhaps we could walk through your model?
Yes, the interfaces I put under m2-design are there to illustrate
ideas and support a design discussion. I'm working on some UML
diagrams that I think will help too.
Great, perhaps another area we could discuss in the context of
improvements to core2?
- Support for Composite Includes is missing; this is a
significant part of the recursive composition model (half of
it, one of the two ways to nest composites).
It's not really half - it's really just a very small part of the
model, comparable to the <import> element we used to support in
M1. Again, I don't see why we need to rewrite the model to add
this in. Quite the opposite: you've said you've been looking for
a way to engage and this could be it.
I disagree. Includes are a very significant part of the assembly
model (the other part is the ability to use a composite as a
component implementation). Two examples:
- An included composite is the equivalent of a module fragment in
the 0.9 spec. This concept is key to allowing a team to work on
various pieces of an application, split in multiple composites,
included in a composite representing the application.
- When composites (formerly subsystems) get deployed to an SCA
system, they are actually included in that system, rather than
being used as component implementations.
It's not "half" of the recursive model. In fact, most of the time
we spent in the spec group was grappling with other issues related
to recursion.
I don't see an immediate relation between the time spent by the
spec group on a specific item and its importance for application
developers. I am looking at this from an application developer's
point of view and saying that I'll use Includes as much as
composition through (composite) components. Includes will allow a
team to distribute work on an SCA application and also represent a
key concept for system composition. I am starting to look at
scenarios and can actually see the usage of Includes in almost all
of them, but am having trouble finding good use cases for the other
form of composition (nested component implementations). So I stand
by my statement that understanding how includes work is key here.
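To illustrate the difference in semantics, a hypothetical sketch (not
code from either tree):

    import java.util.List;

    // Include merges the included composite's parts into the including
    // composite's own namespace; using a composite as a component
    // implementation instead keeps it a black box behind its services
    // and references.
    class IncludeSketch {
        static void include(Composite target, Composite included) {
            for (Component c : included.getComponents()) {
                target.addComponent(c); // names must stay unique after merging
            }
            // services, references and wires would be merged the same way
        }
    }

    interface Composite {
        List<Component> getComponents();
        void addComponent(Component component);
    }

    interface Component {}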
Yep, and that's why I originally pushed the include mechanism in the
spec group over a year ago (I didn't push the fragment classpath
approach though, so don't blame me for that one). However, to say
that it is "half" of implementing the recursive model leaves the
detail out. In this context Jeremy's metaphor of the paddling duck is
apropos. The runtime should be like a duck in that on the surface it
just merrily moves along the water but under the surface it is
paddling away. The same goes for the runtime: from the app
developer's perspective things just work and they are simple, but
under the covers the runtime is managing all of the complexity. In my
opinion, there is a lot more complexity to recursion than what the
app developer sees. Hopefully in the end, the runtime architecture is
graceful and we don't wind up with an ugly duckling as we did in M1 ;-)
This list is not exhaustive... Another idea would be to
externalize support for Composites in a separate plugin not
part of the core service model (since there may be other ways
to compose services in addition to an SCA composite, with
Spring or other similar programming models). I'd like to know
what people think about that.
Having the composite implementation type in the core does not
preclude that - again, it's just packaging for ease-of-use.
I think it's more significant than packaging. Are you saying that
we could move the code supporting composites out of core2 without
breaking the code in core2?
Why would we do this? We can already support multiple composite
implementation types - have a look at the Spring extension. That
just sounds like unnecessary complication.
Why? To avoid unnecessary and dangerous coupling that will hurt us
when we try to evolve this runtime. How to illustrate that? How
about trying to move the code supporting composites out of core2?
I'm realizing I'm asking the same question again... but I think it's
an important question, still unanswered.
I don't think moving packages around prevents dangerous coupling;
pluralism, vigilance, and good design do. It may, though,
unnecessarily compound complexity. We have a Spring extension. How
about adding more composite types to core2? This way, we can expand
the number of containers, achieve a level of pluralism, watch that we
don't over-couple, and derive a good extension design.
You seem to have the impression that the core is sealed and that
we only support things that are included within it. That is not
the case. The only place we need things bundled with the core is
in the bootstrap phase - specifically, we need them bundled with
the primordial deployer. The actual runtime is created from the
SCDL passed to that primordial deployer, can contain any mix of
components and need not contain any of the infrastructure used
to boot it.
I just checked in, under sandbox/sebastien/m2-design/model.spi, a set
of new interfaces. This is just an initial strawman to trigger
a constructive discussion and ideas on how to best represent
the recursive model. I also need help to define a scenario (not
unit test cases, but an end-to-end sample application) to help
put the recursive composition model in perspective and make
sure we all understand it the same way.
I am troubled that you have chosen to start on your own codebase
at a time when most of us have been trying to have constructive
discussion on this list. Based on the approach you proposed in
your original email I would have hoped that we could have
started with your end-user scenarios and had a chance to explore
how they could be supported by M1, the sandbox, or some other
code before starting another codebase. I'm disappointed that,
having started this very thread nearly a week ago with the
premise of community, your first response on it was to commit a
large chunk of independent code rather than follow up with any
of the other people who have already contributed to the discussion.
I think discussion led to compromise and consensus on the
scenario-driven approach that you proposed. As shown above and
in other recent threads, there's plenty of room for improvements
and/or new features in our current code and a willingness to
discuss them, albeit in terms of technical merit rather than
personal opinion. I hope you can find a way to join in rather
than forge your own path.
--Jeremy
--Jean-Sebastien
--
Jean-Sebastien