I have a lot of comments inline but want to overall summarize by saying I think we should address all of your concerns by incrementally improving core2. As you said below, you are not arguing for a rewrite and I think that would be the best way to accommodate the wide variety of things the community is interested in working on (not everyone wants to work on the "baby steps", as valuable as they are).

Having worked with you for over a year, I'm absolutely sure you can make significant contributions to improving what we have in core2. How about it?

Jim


On Jul 7, 2006, at 10:17 AM, Jean-Sebastien Delfino wrote:

More comments inline.

Jim Marino wrote:
Comments inline
On Jul 6, 2006, at 6:17 PM, Jean-Sebastien Delfino wrote:

Jeremy,

I won't comment on your attacks at the bottom of this email. I was hoping for a more constructive technical discussion. I added my answers and comments on the specific technical issues inline.

Jeremy Boynes wrote:
On Jul 5, 2006, at 12:43 PM, Jean-Sebastien Delfino wrote:

My proposal is not to merge M1 and the core2 sandbox. I am proposing to start a new fresh code stream and build the runtime through baby steps. We may be able to reuse some pieces of existing code, but more important is to engage our community in this exercise and integrate the new ideas that will emerge from this.


I don't believe the two issues are necessarily coupled. Quite a few members of the community are engaged on the sandbox code already and we could work with you to improve that rather than having to throw everything out and start over with all new ideas.


Here's an example where I'm struggling with both M1 and the core2 sandbox and thinking that we can do better if we start with a new fresh stream: our (recursive) assembly metadata model.

- M1 does not implement the recursive composition model and would require significant changes to support it. Core2 is an attempt to implement it but I'm not sure it's quite right, and also think that it can be simplified.

It would really help if you could come up with concrete areas where it is not right or where it could be simplified - for example, end user scenarios that are not supported.


- M1 used Lists to represent relationships; Core2 uses Maps. I think M1 was better since it preserved the order of the relationships.

There's nothing I remember in the assembly spec where order matters. On the other hand there are many areas where things are keyed by a name which has to be unique. This seems like a natural mapping (sorry) to a Map. In M1 I started to move toward simple map structures but you replaced it with what seemed like a fairly complicated specialized List implementation that sent notifications that updated a Map anyway. Given the desire for simplification, are there any end-user scenarios that require ordering to be preserved and that can't be supported with a basic HashMap or LinkedHashMap?

As an administrator I'll want my administration tool to display components in the order I declared them in SCDL.
SCDL isn't the only form assembly can be serialized to/from. Also, if I were an admin, I'd probably want to sort the components according to some useful criteria, not how they are listed in a SCDL as most admins will never look at XML. One could always use LinkedHashMap though.
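To make the LinkedHashMap suggestion concrete, here is a minimal sketch. It assumes hypothetical component names and a plain String as the value type; it is not taken from M1 or core2. It shows that a LinkedHashMap gives keyed lookup while iterating in declaration order, and that a sorted view can still be derived when a tool wants one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical example: component names and implementation labels are made up.
class ComponentOrderSketch {

    // Registers components keyed by unique name, in the order they are declared.
    static Map<String, String> declare() {
        Map<String, String> components = new LinkedHashMap<>();
        components.put("OrderService", "implementation.java");
        components.put("AccountService", "implementation.java");
        components.put("BillingService", "implementation.script");
        return components;
    }

    // Names in declaration order (LinkedHashMap iterates in insertion order).
    static List<String> declarationOrder(Map<String, String> components) {
        return new ArrayList<>(components.keySet());
    }

    // Names sorted alphabetically, as an admin console might prefer.
    static List<String> sortedOrder(Map<String, String> components) {
        return new ArrayList<>(new TreeMap<>(components).keySet());
    }

    public static void main(String[] args) {
        Map<String, String> components = declare();
        System.out.println(declarationOrder(components));
        System.out.println(sortedOrder(components));
        System.out.println(components.get("AccountService"));
    }
}
```

A plain HashMap would give the same keyed lookup but makes no promise about iteration order, which is the difference being debated here.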

Maybe SCDL isn't the only form, but that is not relevant: we need to support SCDL, don't we? As soon as you put assembly elements in a document that a user/developer can edit, the order is relevant.

I disagree with your statement about administrators. They often look at and work with XML configuration files. If you want to support other sort criteria in addition that's fine, but admin, config and editing tools need to at least support the order from the XML document.
Actually that is generally not the case regarding XML in a datacenter. In my experience, most admins in datacenters change things through admin consoles or scripts so that there is an audit history (in U.S. financial services institutions it is obligatory).

Of course admins in less bureaucratic environments may tweak XML configuration files (I should have worded my previous response better), but my point was very few admins crack deployment archives and mess with application artifacts. That is extremely bad practice and something we should never encourage. This was a fundamental design principle we had in the spec group (i.e. deployment units should not be cracked) and hence we included an override mechanism in the Assembly Specification. For dynamic wiring, assemblers (a different role than admins) would use some kind of tool - having them edit an XML file would not be the best approach to this problem.

So, I just don't see the value in this use-case, although it would be trivial for us to implement even if it promotes an anti-pattern and unnecessarily complicates what we currently have.


I'll also want a configuration or admin tool loading/saving modified SCDL to write things in the order that they were initially, not in a random order. As an application developer I'd like to have an SCA debugging tool showing me my components in a list in the right order as well. Also if I want to implement the model defined by the XML schemas in the spec using any of the DataBinding technologies out there, I'll end up with Lists, not Maps.

We have been using StAX just fine for this and it accommodates a number of databinding solutions for extensions. Are you proposing we revisit this decision made back before the M1 release to go with StAX loading? If so, for what reasons? BTW, not all databinding solutions will have problems - XStream will work just fine with what we have. Also, are you sure about XMLBeans and JAXB or are you just speaking about a current implementation of SDO?

Not quite correct, the decision we made back before M1 was to go with StAX loading, write the loaders by hand for now, and see how the SDO team could generate this code after M1. Independent of that, I don't want to tie us to any specific data binding, so we better pick representations for model relationships that are commonly used by most databindings to represent XSD <element... maxOccurs="unbounded"/>, i.e. Lists, not Maps.

JAXB, Castor, XStream, and JiBX (through a shipped extension) support Maps, and probably a number of other databinding solutions do as well. But let's set that aside since there is a more important issue...

The decision as I recall was twofold. First, go with a runtime model that was natural for Java developers and supported Java idioms, not the constraints of a particular databinding solution. In my book, if a databinding solution cannot accommodate the requirements of the runtime, it is not the right tool for the job, in this case, loading of core configuration data (not extension configuration). Jeremy chose to implement this requirement with StAX. It worked, was simple, and provided the ability to have extension developers use their databinding framework of choice to load required configuration information (which may involve evaluating artifacts other than XML). This last feature was important, as the runtime must be able to handle multiple databinding technologies simultaneously.

The second part of the decision was to decouple the runtime from SDO, not because people don't like SDO, but because it promotes choice, modularity, and simplicity. This is entirely consistent with SCA, which does not mandate SDO and which I imagine will be used with a variety of databinding technologies (e.g. JAXB). This also promotes modularity and simplicity as it allows people to come and work on (or extend) the runtime without having to learn SDO (or another particular databinding technology).

Also, just to beat a dead horse further (how long have we been having this debate ;-) ), to me, and probably a lot of other Java developers, StAX is a pervasive and simple way of dealing with XML - it's in the javax namespace, pervasive in open source, and will be in the JDK. Given that we can use SDO, JAXB, etc. to handle extensions, what's the problem with using what we have? What benefit do we gain by constraining the runtime model's use of very common (and in my opinion effective) Java idioms?
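For readers unfamiliar with hand-written StAX loading, the following is a minimal sketch of the style being discussed. It assumes a simplified, namespace-free document with hypothetical `composite` and `component` elements; it is not real SCDL and not the actual Tuscany loader code. It uses only the standard `javax.xml.stream` API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical loader: element and attribute names are assumptions for illustration.
class StaxLoaderSketch {

    // Pulls events from the stream and collects the name of every <component> element,
    // in the order the elements appear in the document.
    static List<String> loadComponentNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT
                            && "component".equals(reader.getLocalName())) {
                        names.add(reader.getAttributeValue(null, "name"));
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse document", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<composite name=\"app\">"
                + "<component name=\"OrderService\"/>"
                + "<component name=\"AccountService\"/>"
                + "</composite>";
        System.out.println(loadComponentNames(xml));
    }
}
```

Because the loader decides what to build from each event, the in-memory model is free to use a List, a Map, or anything else; the parser imposes no shape on it.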



Finally even if we decided to use Maps in some cases to provide keyed access to some elements of the model, we'd have to do it differently. For example a single Map containing all components, references and services in a composite (according to the spec they cannot have the same names) instead of three Maps like you have in Core2.

And this is why LinkedHashMap will not help you here.
Again, this is trivial to implement and LinkedHashMap will do just fine with many of the databinding solutions available today.
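A minimal sketch of the single-map idea, assuming hypothetical model classes (`NamedElement`, `ComponentDef`, `ServiceDef`, `ReferenceDef`, `CompositeDef`) that do not come from either codebase. One LinkedHashMap holds every named child of a composite, so a name clash is caught regardless of element kind, and declaration order is still preserved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model classes, named for illustration only.
abstract class NamedElement {
    final String name;
    NamedElement(String name) { this.name = name; }
}
class ComponentDef extends NamedElement { ComponentDef(String name) { super(name); } }
class ServiceDef extends NamedElement { ServiceDef(String name) { super(name); } }
class ReferenceDef extends NamedElement { ReferenceDef(String name) { super(name); } }

class CompositeDef {
    // One map for all named children: components, services and references share a namespace.
    private final Map<String, NamedElement> children = new LinkedHashMap<>();

    void add(NamedElement element) {
        if (children.containsKey(element.name)) {
            throw new IllegalArgumentException("Duplicate name in composite: " + element.name);
        }
        children.put(element.name, element);
    }

    NamedElement get(String name) { return children.get(name); }

    // Typed view over the shared map, in declaration order.
    <T extends NamedElement> List<T> all(Class<T> kind) {
        List<T> result = new ArrayList<>();
        for (NamedElement element : children.values()) {
            if (kind.isInstance(element)) {
                result.add(kind.cast(element));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        CompositeDef composite = new CompositeDef();
        composite.add(new ComponentDef("Calculator"));
        composite.add(new ServiceDef("CalculatorService"));
        try {
            composite.add(new ReferenceDef("Calculator")); // clashes with the component
        } catch (IllegalArgumentException expected) {
            System.out.println(expected.getMessage());
        }
        System.out.println(composite.all(ComponentDef.class).size());
    }
}
```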


- Core2 only defines implementation classes for the model, I think we should have interfaces + default implementation classes instead, like we had in M1, to allow for alternate implementations of the model.

One of the most complex things with the M1 model was all the interfaces involved, the need to pass factory implementations around, the number of different factories involved (one per extension implementation) and the potential issues with code assuming its implementation of the factory was the one used.

The core2 model uses concrete classes which are really just data holders - there's no behaviour in them to be abstracted through the interface. This gives a much simpler programming model for extensions using the model.

Do you have any scenarios that would require different implementations of the model? Are they so different that they might as well just use different classes?


I don't think that having just implementation classes is much simpler. If you interact with the model SPI, reading interfaces is simpler IMO and more suitable for inclusion in a specification document... allowing multiple implementations of these interfaces. Also we have to support the whole lifecycle of an SCA application (development, deploy/install, runtime, admin etc.) and I'd like to allow some flexibility for different tools, running at different times to use different implementations of the assembly model interfaces.

Oisin from the STP project said the POJO-based approach would suit them just fine. I don't see the complexity. On the contrary, all of the AssemblyFactories we had in M1 led IMO to a massive antipattern where they were passed throughout the core. I'm happy to walk through the relevant code if people are interested. All the factories did was new up a POJO. Not worth the complexity in my opinion but I'm happy to compare the work in the sandbox with your proposal if you'd like to walk us through it.

When the runtime depends on too many factories, this is the manifestation of bigger coupling problems. The factories for all the extensions should not be visible at all from the core runtime, and if we externalize the WSDL and Java interface support and the Java implementation support out of core like I'm proposing, you're not dealing with many factories.

But we will be. Factories will be proliferated through the entire loader infrastructure unless the loaders new the factories up and that seems a bit superfluous given the factories themselves just new up POJOs. Just new the POJOs up and things are simple. Also, factories will be proliferated across our test cases, which was also a cause of needless complexity in M1.
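The trade-off in this exchange can be sketched as follows. All names here (`ComponentModel`, `ModelFactory`, `DefaultModelFactory`, the two loaders) are hypothetical stand-ins, not the M1 AssemblyFactory API or the core2 classes. The sketch only shows the shape of the two approaches: a factory that must be passed to every loader versus a loader that instantiates the data holder directly.

```java
// Hypothetical data holder with no behaviour to abstract.
class ComponentModel {
    final String name;
    ComponentModel(String name) { this.name = name; }
}

// Factory-based style: an interface plus a default implementation that only calls "new".
interface ModelFactory {
    ComponentModel createComponent(String name);
}

class DefaultModelFactory implements ModelFactory {
    public ComponentModel createComponent(String name) {
        return new ComponentModel(name);
    }
}

// Every loader (and every test that builds a loader) needs a factory handed to it.
class FactoryModelLoader {
    private final ModelFactory factory;
    FactoryModelLoader(ModelFactory factory) { this.factory = factory; }
    ComponentModel load(String name) { return factory.createComponent(name); }
}

// POJO style: the loader creates the data holder itself, with nothing to pass around.
class PojoModelLoader {
    ComponentModel load(String name) { return new ComponentModel(name); }
}

class FactoryVersusPojoSketch {
    public static void main(String[] args) {
        ComponentModel viaFactory = new FactoryModelLoader(new DefaultModelFactory()).load("Calculator");
        ComponentModel viaPojo = new PojoModelLoader().load("Calculator");
        System.out.println(viaFactory.name.equals(viaPojo.name));
    }
}
```

The factory style earns its extra wiring only if some caller actually needs to substitute a different model implementation, which is the open question in this part of the thread.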

I am sure that tooling projects will need to add much to this model, support for events, change tracking, tracking between XML elements and model objects to provide proper feedback to application developers, integration with modeling technologies used in the tooling world, support for cloning maybe... tons of things. I spent several years developing tools so I think I know what I'm talking about here. The first thing I'll ask as a tooling developer is: please give me interfaces so I can hook what I need in the implementations.

And the tooling people should go do all of that (if they want to) but keep it out of the runtime (and vice versa) ;-) ! We are writing code for a runtime, not a tooling environment. Use change notification, interfaces, round-tripping support, cloning, whatever when writing a tool. In other words, tooling people should build the technology that is right for them and the runtime people likewise. Sharing a discrete (and relatively small) number of classes really doesn't buy that much given the divergence in use cases between tooling and runtime. If we can do it, great, but we should not compromise the runtime or tooling to do so.

Also, I'm not sure your requirement for interfaces is shared by everyone on the tooling side. Oisin (Eclipse STP) indicated he would be fine with the POJO approach.

I'm happy to walk people through the interfaces or answer any questions on the list.

Great, how about doing a diff between core2 and your proposed approach and how core2 could be improved to accommodate your issues?


- Overuse of Java generics breaks flexibility in some cases, for example Component<I extends Implementation> will force you to recreate an instance of Component to swap its implementation with an implementation of a different type (and lose all the wires going in/out of the component).

There may be cases where generics may be overkill but I don't think that really requires us to throw out the model. There are other cases where the use of wildcards would be appropriate; for example, in the scenario you give here you could just create a Component<Implementation> to allow different types of implementation to be used.

Then instead of
Component<Implementation> {
  Implementation getImplementation();
}
I think we can just do
Component {
  Implementation getImplementation();
}
What we have now in core2 is overkill IMO.

Then do we need to cast to the right impl type?

The core runtime should not have to cast, simply because it should not depend on any component implementation type (not even the Java or System implementation types).
A loader will need to cast the above.
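The generics question can be illustrated with a small sketch. The classes below (`TypedComponent`, `PlainComponent`, `JavaImplementation`, `ScriptImplementation`) are hypothetical and simplified; they are not the core2 model classes. The sketch shows that a narrowly typed component avoids casts but cannot swap implementation kinds, while `TypedComponent<Implementation>` and a non-generic component both allow swapping and both need a cast in code that cares about the concrete type.

```java
// Hypothetical, simplified model types.
interface Implementation { }

class JavaImplementation implements Implementation {
    final String className;
    JavaImplementation(String className) { this.className = className; }
}

class ScriptImplementation implements Implementation {
    final String scriptPath;
    ScriptImplementation(String scriptPath) { this.scriptPath = scriptPath; }
}

// Generic form, in the style of Component<I extends Implementation>.
class TypedComponent<I extends Implementation> {
    private I implementation;
    TypedComponent(I implementation) { this.implementation = implementation; }
    I getImplementation() { return implementation; }
    void setImplementation(I implementation) { this.implementation = implementation; }
}

// Non-generic form.
class PlainComponent {
    private Implementation implementation;
    PlainComponent(Implementation implementation) { this.implementation = implementation; }
    Implementation getImplementation() { return implementation; }
    void setImplementation(Implementation implementation) { this.implementation = implementation; }
}

class GenericsSketch {
    public static void main(String[] args) {
        // Narrow type parameter: no cast needed, but the implementation kind is fixed.
        TypedComponent<JavaImplementation> typed =
                new TypedComponent<JavaImplementation>(new JavaImplementation("example.Calculator"));
        System.out.println(typed.getImplementation().className);
        // typed.setImplementation(new ScriptImplementation("calc.groovy")); // does not compile

        // Widest type parameter: swapping works, but typed access now needs a cast.
        TypedComponent<Implementation> open =
                new TypedComponent<Implementation>(new JavaImplementation("example.Calculator"));
        open.setImplementation(new ScriptImplementation("calc.groovy"));
        System.out.println(((ScriptImplementation) open.getImplementation()).scriptPath);

        // Non-generic component: same flexibility and the same cast.
        PlainComponent plain = new PlainComponent(new JavaImplementation("example.Calculator"));
        plain.setImplementation(new ScriptImplementation("calc.groovy"));
        System.out.println(((ScriptImplementation) plain.getImplementation()).scriptPath);
    }
}
```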



- Core2 defines ReferenceDefinitions (without bindings) and BoundReferenceDefinitions (with bindings). IMO there are Reference types and Reference instances and both can have bindings.
or Reference.

I'm with you here - we need to refactor the way bindings are handled for both Service and Reference. One thing the sandbox model is missing is the ability to associate multiple bindings with a single Service/Reference.

My main point is not about supporting multiple bindings on a Service or Reference. I think this is secondary and the interfaces I put in my sandbox to support a design discussion don't even have that either. My point is that Services, References, and their instantiation by Components are at the foundation of the SCA assembly model... and therefore need to be modeled correctly. I'm proposing a different design, illustrated by the interfaces I checked in.

Could you elaborate?

I think it should be very simple:
- Component types have service and reference types
- Components are instances of component types and have services and references, which are instances of the service and reference types
- Service and reference types can have bindings
- Bindings can be overridden in service and reference instances
This is clear when you look at a Composite. A composite is a Component Type, has service and reference types (aka composite services and references) which can have bindings. A component can be implemented by a Composite, has services and references, which can use the (default) bindings from their respective service and reference types, or specify (override) bindings.
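The type/instance relationship described in the list above can be sketched roughly as follows. The class names (`Binding`, `ServiceType`, `ServiceInstance`) and the binding labels are hypothetical; this is neither the m2-design interfaces nor the core2 model, only one possible reading of "types carry default bindings, instances may override them".

```java
// Hypothetical classes illustrating default bindings on types and overrides on instances.
class Binding {
    final String kind;
    Binding(String kind) { this.kind = kind; }
}

// A service type, e.g. a composite service, which may declare a default binding (or none).
class ServiceType {
    final String name;
    final Binding defaultBinding; // may be null when the type declares no binding
    ServiceType(String name, Binding defaultBinding) {
        this.name = name;
        this.defaultBinding = defaultBinding;
    }
}

// A service on a component: an instance of a service type that may override the binding.
class ServiceInstance {
    final ServiceType type;
    private Binding override; // null means "use the type's default"

    ServiceInstance(ServiceType type) { this.type = type; }

    void overrideBinding(Binding binding) { this.override = binding; }

    // The instance-level binding wins; otherwise fall back to the type's default.
    Binding effectiveBinding() {
        return override != null ? override : type.defaultBinding;
    }
}

class BindingOverrideSketch {
    public static void main(String[] args) {
        ServiceType type = new ServiceType("CalculatorService", new Binding("binding.ws"));
        ServiceInstance usesDefault = new ServiceInstance(type);
        ServiceInstance overridden = new ServiceInstance(type);
        overridden.overrideBinding(new Binding("binding.jms"));
        System.out.println(usesDefault.effectiveBinding().kind);
        System.out.println(overridden.effectiveBinding().kind);
    }
}
```

The same shape would apply to reference types and reference instances.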

I would find it extremely useful if you could perhaps compare this with what we have in core2 and point to how core2 could be improved to accommodate some of your concerns in this area.


- I think that Remotable should be on Interface and not Service.

I agree Service is wrong and that it should be on ServiceContract. Thanks for catching it.


- Scope should be defined in the Java component implementation, separate from the core model.

Scope is not a Java specific concept.
Interaction scope (stateless vs. stateful) can apply to any ServiceContract. Container scope is the contract between an implementation and a ScopeContainer and applies to any implementation type that can support stateful interactions. This would include JavaScript, Groovy, C++, ... I think that means that support for state management (which is what scope is configuring) belongs in the core with the configuration metadata supplied by the implementation type.


I don't think it's quite right. First interaction scopes are defined on interfaces and not service contracts. Also they control whether an interface is conversational or not, independent from any state management. Anyway I was talking about a different scope, the implementation scope defined in the Java C&I spec, which governs the lifecycle of Java component implementation instances. I think the definition and implementation of lifecycle management will vary greatly depending on the component implementation type, for example Java component implementations and BPEL component implementations typically deal with this in a very different way.
Well, I don't think that's the case at all and actually there is a concept of implementation scope in assembly - it just varies by implementation type, which is entirely consistent with our design. BPEL is the odd case, and this came up as we wrote the scope changes into the spec (Ken did a lot of the work here). Across many implementation types, e.g. Groovy, JavaScript, Ruby, Jython, etc. (maybe even C++) I see use for the same scopes as in Java. Do you disagree?

Also, I'm curious why you think the scope containers complicate the core and need to be moved out? Or are you saying this based on your reading of the spec? They seem quite simple to me.

I'm saying that scope management is specific to the implementation type and therefore needs to be made pluggable, i.e. moved out of core. The Java scope management is just one example of scope management.


We have it partly pluggable. There are just a few more things to do. Would you care to help out on this?

I think it makes sense to keep commonly used scopes in core and have implementation specific ones as plugins, not necessarily tied to an implementation (outside of BPEL, most scopes are probably applicable to a wide variety of types).


Therefore, in my view state/lifecycle management should be left to the component implementation contributions and not belong to core.

I think this would lead to over-complication, particularly for the extension developer. Right now, scope containers can be reused. In particular, how would conversational services be handled? If I want to use module or session scope containers for my Groovy script, then I'd have to write those rather than just reuse what the core gives me? Also, by reusing, we also allow an additional extension type in terms of scope. For example, someone could add a distributed cache scope and have that shared by Groovy, Java, whatever.
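The reuse argument can be sketched as below. The `ScopeContainer` interface shown here is a deliberately simplified, hypothetical one (a single method taking a component name and an instance creator); it is not the actual core2 SPI. The point is only that a scope container written once can serve components of any implementation type that can supply an instance creator.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical, simplified contract between an implementation type and a scope.
interface ScopeContainer {
    Object getInstance(String componentName, Supplier<Object> creator);
}

// Module scope: one instance per component for the lifetime of the container.
class ModuleScopeContainer implements ScopeContainer {
    private final Map<String, Object> instances = new HashMap<>();

    public Object getInstance(String componentName, Supplier<Object> creator) {
        return instances.computeIfAbsent(componentName, key -> creator.get());
    }
}

// Stateless scope: a fresh instance for every request.
class StatelessScopeContainer implements ScopeContainer {
    public Object getInstance(String componentName, Supplier<Object> creator) {
        return creator.get();
    }
}

class ScopeReuseSketch {
    public static void main(String[] args) {
        ScopeContainer module = new ModuleScopeContainer();

        // A "Java" component and a "Groovy" component share the same container;
        // the suppliers stand in for whatever each implementation type does to create an instance.
        Object javaFirst = module.getInstance("JavaCalculator", () -> new StringBuilder("java"));
        Object javaSecond = module.getInstance("JavaCalculator", () -> new StringBuilder("java"));
        Object groovy = module.getInstance("GroovyCalculator", () -> new StringBuilder("groovy"));

        System.out.println(javaFirst == javaSecond); // same instance within module scope
        System.out.println(javaFirst == groovy);     // different components, different instances
    }
}
```

A distributed cache scope, as mentioned above, would be another implementation of the same interface rather than something each implementation type has to write for itself.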

I'll also note two things. Getting scope containers to work properly with instance tracking is not trivial. I'd hate to push that on extension developers. Second, this basic design has been there since before M1. Why wasn't this brought up before since it is such a significant issue?


- Java and WSDL interfaces should be defined separate from the core model, we need to support multiple interface definition languages supported by plugins, not in the core.

The model supports generic IDL through the use of ServiceContract. Java and WSDL are two forms of IDL that are mandated by the specification. This is really just a question of where those implementations are packaged and again I don't think this warrants a rewrite.
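As a rough illustration of "generic IDL through ServiceContract", here is a sketch with hypothetical, simplified classes; the real core2 ServiceContract may look quite different. Code that only needs contract-level facts (such as the remotable flag discussed earlier in the thread) depends on the base class, while Java-specific and WSDL-specific details live in subclasses that could be packaged inside or outside the core.

```java
// Hypothetical base class: what the core needs to know about any interface definition.
abstract class ServiceContract {
    private final boolean remotable;
    ServiceContract(boolean remotable) { this.remotable = remotable; }
    boolean isRemotable() { return remotable; }
    abstract String describe();
}

// Java interface flavour of the contract.
class JavaServiceContract extends ServiceContract {
    private final Class<?> interfaceClass;
    JavaServiceContract(Class<?> interfaceClass, boolean remotable) {
        super(remotable);
        this.interfaceClass = interfaceClass;
    }
    String describe() { return "java:" + interfaceClass.getName(); }
}

// WSDL port type flavour of the contract (the port type is just a name in this sketch).
class WsdlServiceContract extends ServiceContract {
    private final String portTypeName;
    WsdlServiceContract(String portTypeName) {
        super(true); // assumption for this sketch: WSDL-described contracts are remotable
        this.portTypeName = portTypeName;
    }
    String describe() { return "wsdl:" + portTypeName; }
}

class ServiceContractSketch {
    // Core-style code: works for any interface definition language.
    static String summarize(ServiceContract contract) {
        return contract.describe() + (contract.isRemotable() ? " (remotable)" : " (local)");
    }

    public static void main(String[] args) {
        System.out.println(summarize(new JavaServiceContract(Runnable.class, false)));
        System.out.println(summarize(new WsdlServiceContract("CalculatorPortType")));
    }
}
```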


Packaging issues are important and often hide bigger dependency/coupling problems. I think we should package the support for Java and WSDL interfaces separate from the core to prevent any coupling between the two, and also give people who will have to support new interface definition languages a better template to follow.

Individual issues do not warrant a rewrite. What about the sum of many issues?

None of the issues warrant a rewrite, not even the sum. Most of your criticisms seem centered around the model which is fairly decoupled from the bulk of the core2 runtime. Even if we adopted your changes wholesale, I'd doubt they would change the core2 runtime significantly. Even the scope containers could be moved out without breaking anything and very little code changes, although that would be a mistake IMO. I'm sorry but I fail to see the need for a rewrite.

I am not proposing a rewrite of the whole runtime, see my original email, it's not a whole rewrite. I'm proposing a staged / baby step approach integrating the good work from M1 and the sandbox and new discussions where I think what we have is not right or where new ideas from the group come up.

Why not just do this starting from core2? If it's not a rewrite, then it sounds like incremental improvements. We could start from scenarios and go through doing this. I think this approach would accommodate those that wish to focus on end-user scenarios and those that are focused on more technology scenarios such as conversations. I believe we need to be inclusive of both, and cannot force one approach to doing scenarios on the entire community. Also, I think this is the correct *long-term* way to getting more people involved. We will always have newcomers and we need to ensure the runtime is modular enough so they can work their way in as deep as they are interested. Doing a "baby-step" rebuild just for the people that are part of the community now doesn't really teach us how to continuously grow the community.

I'm starting with the model SPI because I think that having the assembly model right is critical for an assembly runtime. Most of the ideas here have an impact on the architecture of the runtime, so I thought this was a good starting point, and also a good base of discussion to help all in our community discuss and understand better the new recursive composition model.


And here's where I think the crux of the disagreement lies...and it's the same debate that started over a year ago and I thought we had gotten past. The *configuration* model should be decoupled from the rest of the runtime, not determine its architecture. Also, the configuration model is only one small part of the SPI/API (I think we need to begin to make this distinction as suggested by Jeremy). Similarly, the SCA specifications are not blueprints for a runtime design; they describe a wiring and programming model for service-based applications. If multiple runtimes with divergent architectures cannot implement SCA, then it will have failed as a set of specifications.

Of course, that is not to say we should not have SCA concepts reflected in the runtime architecture. Along these lines, one of the key changes we made in core2 was to do this better with the actual runtime structures. Namely, we reserved the "Component" naming scheme for runtime artifacts as opposed to the configuration model (they used to be called "Context"), as those will be dealt with by developers working on core and extenders much more than the configuration model will be.

Also, the model is just an in-memory representation of configuration data, nothing more and nothing less. One of the key culprits in the M1 architecture was the fact that we did not have this clean distinction. We did agree to have it, we just did not evolve the code enough in that direction, and that was one of the key driving factors for creating core2.


- Implementation should extend ComponentType IMO instead of pointing to it, and we may even be able to simplify and just remove Implementation. Also I am not sure why we need to distinguish between AtomicImplementation and CompositeImplementation.

One of the problems the assembly spec has is that it is difficult to do top-down design because you cannot link a component to a componentType without having an implementation. I agree this is an area that we (and the spec) need to sort out.

IMO a component is associated with one componentType but may have multiple implementations so I don't think they are quite the same thing or that either can be removed.

AtomicImplementation is a marker for implementations that cannot have children.

In my view a component has a type. The ComponentType is either abstract (just defining the types of services offered, references used, and properties that can be configured), or concrete. A POJO component implementation is a concrete ComponentType.

Perhaps we could walk through your model?

Yes, the interfaces I put under m2-design are there to illustrate ideas and support a design discussion. I'm working on some UML diagrams that I think will help too.
Great, perhaps another area we could discuss in the context of improvements to core2?




- Support for Composite Includes is missing, this is a significant part of the recursive composition model (half of it, one of the two ways to nest composites).

It's not really half - it's really just a very small part of the model, comparable to the <import> element we used to support in M1. Again, I don't see why we need to rewrite the model to add this in. Quite the opposite: you've said you've been looking for a way to engage and this could be it.

I disagree. Includes are a very significant part of the assembly model (the other part is the ability to use a composite as a component implementation). Two examples:
- An included composite is the equivalent of a module fragment in the 0.9 spec. This concept is key to allowing a team to work on various pieces of an application, split in multiple composites, included in a composite representing the application.
- When composites (formerly subsystems) get deployed to an SCA system, they are actually included in that system, rather than being used as component implementations.
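One way to picture the include form of composition is the sketch below. It assumes a hypothetical `CompositeModel` class and a simplified reading of include in which the components of an included composite are inlined into the including composite and names must stay unique; it is not the spec's normative definition and not code from either sandbox.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical composite model: component name -> implementation label, plus includes.
class CompositeModel {
    final String name;
    private final Map<String, String> components = new LinkedHashMap<>();
    private final List<CompositeModel> includes = new ArrayList<>();

    CompositeModel(String name) { this.name = name; }

    CompositeModel component(String componentName, String implementation) {
        components.put(componentName, implementation);
        return this;
    }

    CompositeModel include(CompositeModel other) {
        includes.add(other);
        return this;
    }

    // Inlines the components of every included composite (recursively) into one flat view,
    // rejecting name clashes between the including and included composites.
    Map<String, String> flatten() {
        Map<String, String> result = new LinkedHashMap<>(components);
        for (CompositeModel included : includes) {
            for (Map.Entry<String, String> entry : included.flatten().entrySet()) {
                if (result.putIfAbsent(entry.getKey(), entry.getValue()) != null) {
                    throw new IllegalStateException("Duplicate component name: " + entry.getKey());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two teams each own a composite; the application composite includes both.
        CompositeModel catalog = new CompositeModel("catalog").component("Catalog", "implementation.java");
        CompositeModel payments = new CompositeModel("payments").component("Payments", "implementation.java");
        CompositeModel app = new CompositeModel("app")
                .component("Store", "implementation.java")
                .include(catalog)
                .include(payments);
        System.out.println(app.flatten().keySet());
    }
}
```

By contrast, a composite used as a component implementation would stay nested behind a single component rather than being merged into its parent.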

It's not "half" of the recursive model. In fact, most of the time we spent in the spec group was grappling with other issues related to recursion.

I don't see an immediate relation between the time spent by the spec group on a specific item and its importance for application developers. I am looking at this from an application developer point of view and saying that I'll use Includes as much as composition through (composite) components. Includes will allow a team to distribute work on an SCA application and also represent a key concept for system composition. I am starting to look at scenarios and can actually see the usage of Includes in almost all of them, but am having trouble finding good use cases for the other form of composition (nested component implementations). So I stand by my statement that understanding how includes work is key here.

Yep, and that's why I originally pushed the include mechanism in the spec group over a year ago (I didn't push the fragment classpath approach though, so don't blame me for that one). However, to say that is "half" of implementing the recursive model leaves the detail out. In this context Jeremy's metaphor of the paddling duck is apropos. The runtime should be like a duck in that on the surface it just merrily moves along the water but under the surface it is paddling away. The same for the runtime: from the app developer's perspective things just work and they are simple, but under the covers the runtime is managing all of the complexity. In my opinion, there is a lot more complexity to recursion than what the app developer sees. Hopefully in the end, the runtime architecture is graceful and we don't wind up with an ugly duckling as we did in M1 ;-)




This list is not exhaustive... Another idea would be to externalize support for Composites in a separate plugin not part of the core service model (since there may be other ways to compose services in addition to an SCA composite, with Spring or other similar programming models), I'd like to know what people think about that.


Having the composite implementation type in the core does not preclude that - again, it's just packaging for ease-of-use.

I think it's more significant than packaging. Are you saying that we could move the code supporting composites out of core2 without breaking the code in core2?

Why would we do this? We can already support multiple composite implementation types - have a look at the Spring extension. That just sounds like unnecessary complication.
Why? To avoid unnecessary and dangerous coupling that will hurt us when we try to evolve this runtime. How to illustrate that? How about trying to move code supporting composites out of core2? I'm realizing I'm asking the same question again... but I think it's an important question, still unanswered.
I don't think moving packages around guards against dangerous coupling; pluralism, vigilance, and good design do. It may, though, unnecessarily compound complexity. We have a Spring extension. How about adding more composite types to core2? This way, we can expand the number of containers, achieve a level of pluralism, watch that we don't over-couple, and derive a good extension design?




You seem to have the impression that the core is sealed and that we only support things that are included within it. That is not the case. The only place we need things bundled with the core is in the bootstrap phase - specifically, we need them bundled with the primordial deployer. The actual runtime is created from the SCDL passed to that primordial deployer, can contain any mix of components and need not contain any of the infrastructure used to boot it.

I just checked in sandbox/sebastien/m2-design/model.spi a set of new interfaces. This is just an initial strawman to trigger a constructive discussion and ideas on how to best represent the recursive model. I also need help to define a scenario (not unit test cases, but an end to end sample application) to help put the recursive composition model in perspective and make sure we all understand it the same way.


I am troubled that you have chosen to start on your own codebase at a time when most of us have been trying to have constructive discussion on this list. Based on the approach you proposed in your original email I would have hoped that we could have started with your end-user scenarios and had a chance to explore how they could be supported by M1, the sandbox, or some other code before starting another codebase. I'm disappointed that, having started this very thread nearly a week ago with the premise of community, your first response on it was to commit a large chunk of independent code rather than follow up with any of the other people who have already contributed to the discussion.

I think discussion led to compromise and consensus on the scenario-driven approach that you proposed. As shown above and in other recent threads, there's plenty of room for improvements and/or new features in our current code and a willingness to discuss them, albeit in terms of technical merit rather than personal opinion. I hope you can find a way to join in rather than forge your own path.

--Jeremy




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



--Jean-Sebastien






--
Jean-Sebastien

