Re: Framework for StAX-based model loading

Jim Marino Wed, 05 Apr 2006 09:10:45 -0700

I think this is a really good approach and will give us a great
binding/extension story for Tuscany. Two comments on the statement
that the model may look a little different than what we have here.
The first one is that in general, I'm o.k. with that as long as it
follows common Java idioms. I don't think this will be a problem.
The second is I'm happy to help out porting the runtime core to use
the new model so just let me know when we have a cut of the new model.

Jim


On Apr 4, 2006, at 3:16 PM, Frank Budinsky wrote:

Resending my previous post ... I see that the formatting I wrote it
with went away and it became impossible to read. Congratulations to
anyone that got anything out of it at all :-)

Sorry about that,
Frank.


Here is a proposal for moving forward on this issue that I think (am
hoping :-) everyone can live with.

1. Remove the existing logical and physical models (and corresponding
   transformer code) and replace them with a single logical model
   created as follows:
  a. One-time generate a pure JavaBean model from the XMLSchema
     (sca-core.xsd) using a hacked up prototype of the SDO generator.
     This prototype suppresses SDO things (e.g., reflective methods),
     so the generated classes are not SDOs - they're POJOs.

  b. Hand modify the generated classes to add additional methods
     needed for the logical model. The end result will look very
     similar, but not identical, to the current logical model, so a
     small amount of work will be needed to port client code from the
     current logical model to this new combined "logical/physical"
     model.

2. Modify Jeremy's StAX handlers to work with this new model, and also
   use them as the prototypical example of the output for a new
   -generateLoader option for the SDO generator. The plan for the May
   release will be to use the modified hand written StAX handlers, but
   they will be marked as "to be generated", so that it's clear that in
   the future these handlers will be replaced with generated ones.

3. Start immediately on the SDO -generateLoader and -simpleBeans
   options. We will use -generateLoader for the core model, as soon as
   it's available (target 2-3 months?), but we will not plan to regen
   the core model using -simpleBeans. The model will remain hand coded
   indefinitely, but we can revisit the possibility of generating parts
   of it in the future (but this won't be a priority). The -simpleBeans
   option will be available for use by people adding extensions to the
   model, if they want to start with XSDs.

4. Also start immediately on a -generateSerializer option so that we
   will be able to use simple beans that need to be saved as well as
   loaded. Given that we support generation of POJOs, we need both
   generated loaders and serializers to use them. We also need to start
   work on defining the necessary Java annotations to support
   generating loaders/serializers from hand-written POJOs (or, more
   generally, to also generate SDOs from hand written Java interfaces).
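To make the POJO-plus-loader idea in the steps above concrete, here is a minimal sketch of what a simple bean and a hand-written StAX loader for it could look like. The names ComponentBean and ComponentLoader are hypothetical and are not taken from the Tuscany code base or from any generator output; only the javax.xml.stream calls are standard API. A generated loader would presumably follow the same pull-parsing shape.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical POJO, standing in for a class produced by -simpleBeans.
class ComponentBean {
    private String name;
    private String implementationClass;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getImplementationClass() { return implementationClass; }
    public void setImplementationClass(String c) { this.implementationClass = c; }
}

// Hypothetical loader, standing in for a hand-written (later generated)
// StAX handler that populates the bean from XML.
class ComponentLoader {
    static ComponentBean load(String xml) {
        ComponentBean bean = new ComponentBean();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                // Pull events one at a time and react only to start tags.
                while (reader.hasNext()) {
                    if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                        continue;
                    }
                    String element = reader.getLocalName();
                    if ("component".equals(element)) {
                        bean.setName(reader.getAttributeValue(null, "name"));
                    } else if ("implementation.java".equals(element)) {
                        bean.setImplementationClass(
                                reader.getAttributeValue(null, "class"));
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse component XML", e);
        }
        return bean;
    }

    public static void main(String[] args) {
        ComponentBean bean = load(
                "<component name=\"myComp\">"
                + "<implementation.java class=\"MyComponent\"/></component>");
        System.out.println(bean.getName() + " -> " + bean.getImplementationClass());
    }
}
```

The bean carries no SDO base class or reflective accessors, which is the point of the proposal: all XML knowledge lives in the loader, so the loader can be replaced by a generated one without touching the model.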

I think this approach is a win-win for both the SCA and SDO projects.

Thoughts?

Thanks,
Frank.

Frank Budinsky/Toronto/[EMAIL PROTECTED] wrote on 03/27/2006 01:26:46 PM:

Jim Marino <[EMAIL PROTECTED]> wrote on 03/24/2006 05:53:46 PM:

Thanks Frank for answering these questions.  I have a few more that
maybe you or others could offer opinions on.

On Mar 24, 2006, at 12:10 PM, Frank Budinsky wrote:

I don't know much about how the sca properties are configured, but
I'll try to answer your questions anyway.

- As a user what steps do I need to take to provide custom data
values for config properties? In a previous post, I listed an example
of a concrete "Foo" class


Option 1)

Provide an XML schema complexType definition for the Type and let the
generator gen the impl including the deserialization support. In the
future, we plan to also let you provide a Java interface (with
annotations, if necessary) to define the type, and then have the
implementation class generated for you.

The SDO generator will essentially generate the same Foo class that
you showed in the other thread, just with the addition of a base class
(DataObjectBase), and some get/set method overrides that implement
efficient switch-based reflective accessors - used by the generic XML
serializer/deserializer. If we also provide an option to generate a
loader in the future, we could also provide an option to suppress the
generation of the reflective accessors. The resulting class would no
longer be an SDO object in this case - but it would be easy to do as a
value-add feature in our generator (i.e., a -generateSimpleBean
option).

Option 2)

Write the Foo implementation class yourself (or maybe generate it with
some other technology - like JAXB) and then simply register it as a
DataType with SDO. Remember that not all objects in an SDO model need
to be DataObjects. If you want non-DataObjects, they're modeled as
DataTypes, and you need to provide create from and convert to String
methods for them.
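As an illustration of the "create from and convert to String" pair that option 2 asks for, here is a minimal sketch. The StringConverter interface, the Point class and PointConverter are all hypothetical names invented for this example; how such a converter would actually be registered with SDO is not shown in this thread and is not assumed here.

```java
// Hypothetical contract for a non-DataObject type: one method each way.
interface StringConverter<T> {
    T createFromString(String literal);
    String convertToString(T value);
}

// Hand-written value class, standing in for a user type such as Foo.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

// Converter for the hypothetical "x,y" lexical form of Point.
class PointConverter implements StringConverter<Point> {
    @Override
    public Point createFromString(String literal) {
        String[] parts = literal.trim().split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException(
                    "Expected \"x,y\" but got: " + literal);
        }
        return new Point(Integer.parseInt(parts[0].trim()),
                         Integer.parseInt(parts[1].trim()));
    }

    @Override
    public String convertToString(Point value) {
        return value.x + "," + value.y;
    }

    public static void main(String[] args) {
        PointConverter converter = new PointConverter();
        Point p = converter.createFromString("3, 4");
        System.out.println(converter.convertToString(p));
    }
}
```

This works well for types with a compact lexical form. A nested structure like the Foo example later in the thread does not reduce naturally to a single string, which is why the later messages turn to a factory-style converter working on the element contents.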

I think option two is the more appealing one for application
developers. I read option 1 to require a schema, which we may be able
to do for extensions, but is a bit much to ask application developers
to produce.  So, I'm curious as to what the conversion methods you
mentioned look like.  Assume I have the following Java implementation
and configuration class:

I wouldn't write off option 1 so quickly. For your example, a schema
(or equivalent SDO metadata) something like this is all that one needs:

<element name="myFoo" type="Foo"/>

<complexType name="Foo">
    <sequence>
        <element name="name" type="xsd:string"/>
        <element name="foo" type="Foo"/>
        <element name="myJaxBThing" type="jaxb:jaxBThing"/>
    </sequence>
</complexType>

This schema could be deduced (under the covers) from the Java classes
you show below, so you wouldn't need to actually write it (once we get
the Java import support working, of course).


public class MyComponent {

     // Property injected by the runtime from the component configuration
     @Property
     private Foo myFoo;

}


public class Foo {

     public Foo() {}

     private String name;

     public void setName(String val) {
         name = val;
     }

     // Nested Foo, configured by the <v:foo> child element
     private Foo foo;

     public void setFoo(Foo val) {
         foo = val;
     }

     private MyJaxBThing jaxBThing;

     public void setMyJaxBThing(MyJaxBThing thing) {
         jaxBThing = thing;
     }
}



And I want to use the following configuration:

     <component name="myComp">
         <implementation.java class="MyComponent"/>
         <properties>
             <v:myFoo>
                     <v:name>my name</v:name>
                     <v:foo>
                             <v:name>my sub name</v:name>
                     </v:foo>
                     <jaxb:jaxBThing>
                             <!-- other configuration according to JAX-B -->
                     </jaxb:jaxBThing>
             </v:myFoo>
         </properties>
     </component>

I'm assuming I would have to register Foo and MyJaxBThing with SDO?
Could someone walk through the steps I would need to do to tell the
runtime how to take the particular configuration and deserialize it?

Assuming, however, that we don't have metadata, but just want to
deserialize by hand. I don't think the SDO approach is any easier or
more difficult than the StAX approach. By default the SDO deserializer
will represent the "untyped" properties section of the model as a
Sequence (i.e., an unstructured representation of the "xsd:any"
contents). We'll need some way to plug-in a converter, maybe something
like a FooFactory, similar to what Jeremy described for the StAX
approach. Btw, SDO has createFromString methods for all the standard
basic types plus a generic createFromString method that works like
Jeremy described (i.e., try valueOf, constructors, etc.).
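The "try valueOf, constructors, etc." strategy mentioned above can be sketched with plain reflection. This is only an illustration of the general technique under the assumption that a static valueOf(String) is tried first and a public single-String constructor second; GenericConverter is a hypothetical name, and this is not SDO's or Tuscany's actual implementation.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical helper illustrating a reflective "create from String".
class GenericConverter {
    static <T> T createFromString(Class<T> type, String literal) {
        try {
            // First choice: a public static valueOf(String) returning the type.
            try {
                Method valueOf = type.getMethod("valueOf", String.class);
                if (Modifier.isStatic(valueOf.getModifiers())
                        && type.isAssignableFrom(valueOf.getReturnType())) {
                    return type.cast(valueOf.invoke(null, literal));
                }
            } catch (NoSuchMethodException e) {
                // No valueOf(String); fall through to the constructor.
            }
            // Second choice: a public constructor taking a single String.
            Constructor<T> ctor = type.getConstructor(String.class);
            return ctor.newInstance(literal);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "Cannot create " + type.getName() + " from \"" + literal + "\"", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(createFromString(Integer.class, "42"));
        System.out.println(createFromString(java.math.BigDecimal.class, "1.50"));
    }
}
```

Integer is resolved through valueOf, while BigDecimal has no valueOf(String) and so falls through to its String constructor. A type with neither, such as the Foo class in this thread, would still need an explicit converter or factory.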

Also, what would the string transformation methods look like in this
case? I'm also having difficulty pinning down how the JAXB class is
instantiated (I'm assuming something needs to access a JAXB factory
at some point).

I don't know enough about JAXB to say. Maybe someone else knows?


Another really common use case (sorry to keep harping on this one,
but I see it all of the time) is support for List and Map. I should
be able to specify some type of XML serialized form and have property
configuration injected on a component as a List or Map.  I'm assuming
based on your comments below this can be done to the SDO
implementation and we could provide this to end-users without them
having to configure something?

Yes ... the Sequence (DOM-like) view of the properties is there by
default.


One final scenario, related to this, is support for factories for
property instantiation. IoC containers such as Spring have a way to
pass a factory in to the injection engine to delegate to for creating
property instances.  Could this be done with SDO?

I think we could provide something like this in Tuscany ... a
Tuscany-specific extension SPI.

- What steps do I need to extend the current model? What dependencies
are there?


I'm not sure about this, it depends on the model. Is there a base
type in the XSD for these properties? If so, then I suspect that you
need to define the schema for your extension. If you go with option 1,
above, that comes for free. If you want to do things by hand, then I
think you could just treat your extension as unstructured XML (in the
open content extension points in the model). Maybe someone else
understands the model here better than I do?

- Can I use a custom binding technology to produce my model object?


I think I answered this in the option 2) section, above.

- Is it easy to support isolation between classloaders in managed
environments? My impression is that this is extremely problematic due
to required support of .INSTANCE.  If that is the case, what is the
likelihood that the spec can be changed in a timely manner to improve
this?


I don't think I understand where this problem will come up. In the
static generated class scenarios that we're talking about, there
really shouldn't be any access to .INSTANCE variables. Maybe someone
can give a concrete example where this might be a problem, and we can
try to figure out the solution from there.

I have two concrete examples here where I have seen problems in other
projects:
1. Assume there are two nested components whose implementation types
are loaded by different classloaders. These two nested components
have a property that takes a "Foo". The configuration schema is the
same but the "Foo" classes are different because they are loaded by
different classloaders. Do you think we will run into any issues
here?

Not unless the first Foo instance is passed to the second component
(that's expecting the second Foo). But this doesn't strike me as an SDO
issue, it would be a problem even if the Foo class was hand coded,
don't you think?

2. Another concern is around application reloadability. If I have a
registered type of "Foo" and the application it was registered by
needs to be reloaded, how is it flushed from SDO? Does the container
have to call a flush method somewhere?

This depends on how we handle the scoping. If the TypeHelper that knows
about Foo is in a private application scope then it should go away with
the application.
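The scoping argument above can be sketched as follows. This is only an illustration of the idea that a registry owned by the application, rather than held in a static singleton, needs no explicit flush; ScopedTypeRegistry and ApplicationScope are hypothetical names and do not represent SDO's TypeHelper API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a per-application type registry.
class ScopedTypeRegistry {
    private final Map<String, Class<?>> types = new ConcurrentHashMap<>();

    void register(String name, Class<?> type) {
        types.put(name, type);
    }

    // Returns null when the name was never registered in this scope.
    Class<?> lookup(String name) {
        return types.get(name);
    }
}

// Hypothetical application scope that owns its registry. Nothing static
// refers to the registry, so dropping the scope drops the registrations.
class ApplicationScope {
    private final ScopedTypeRegistry registry = new ScopedTypeRegistry();

    ScopedTypeRegistry getRegistry() {
        return registry;
    }

    public static void main(String[] args) {
        ApplicationScope first = new ApplicationScope();
        ApplicationScope second = new ApplicationScope();
        first.getRegistry().register("Foo", String.class);
        System.out.println("first knows Foo: "
                + (first.getRegistry().lookup("Foo") != null));
        System.out.println("second knows Foo: "
                + (second.getRegistry().lookup("Foo") != null));
    }
}
```

With this shape, reloading an application means creating a new scope and letting the old one become unreachable. A global singleton registry would instead keep the old Foo class, and with it the old classloader, alive until something removed the entry.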

I think we need to be clear that any shortcomings in the SDO spec
should not be a problem in generated scenarios. Other than saying that
the generated interfaces for SDO types are bean-like, the SDO spec
dictates very little about the nature of the generated code. We can
fix whatever we need to.

I appreciate that and you taking the time to help explain this stuff
to me. I guess I'm going to be a typical example of someone who wants
to extend the container and has a bunch of questions :-)

This is a good exercise for me as well. Regardless of the actual
decision of whether or not to use SDO for this particular purpose in
SCA, it will help to clarify the issues and what parts of the SDO impl
need attention.


Thanks, Frank.

We really are just trying to leverage the Tuscany generator to do
XML binding here ... our config loader does not need to be a fully
compliant SDO application.

Thanks,
Frank.


Jim Marino <[EMAIL PROTECTED]> wrote on 03/24/2006 01:31:20 PM:

I think there may be some issues uncovered with the requirements and
I'm not sure we all understand the advantages/disadvantages of each
approach.  We may be over-analyzing this but the discussion was
getting very heated, there was a lot of disagreement over what the
actual (dis)advantages were, and I wanted to understand (at least for
myself) the broader implications.  I thought stepping back a bit would
help clarify these things. For example, I am personally unclear on
how to do the following with SDO:

- As a user what steps do I need to take to provide custom data
values for config properties? In a previous post, I listed an example
of a concrete "Foo" class

- What steps do I need to extend the current model? What dependencies
are there?

- Can I use a custom binding technology to produce my model object?


- Is it easy to support isolation between classloaders in managed
environments? My impression is that this is extremely problematic due
to required support of .INSTANCE.  If that is the case, what is the
likelihood that the spec can be changed in a timely manner to improve
this?

I thought Jeremy's list was good and would provide a way to "weight"
answers to these and other questions.

Jim

On Mar 24, 2006, at 6:10 AM, Frank Budinsky wrote:

Jim, looking at your requirements (which I don't disagree with), I
think that both approaches, if not already, can be made to meet them.

Personally I think that we're over analyzing this. Both approaches
have some advantages and disadvantages, but both will work. Whichever
approach we take, I suspect that some people will like it and others
won't. For example, people that know how to program with StAX will say
it's easy to use ... people who don't will say the opposite. If we can
get to the point that we effectively generate the logical model (so
the user has to write no code), I think everyone will agree it's easy
to use, since doing nothing is easy by definition :-) Of course we
need to take a leap of faith that the current painful SDO codegen will
evolve to that in the end.

Having a vested interest to make the SDO binding technology as good as
possible, I would support, and obviously love to see the decision go
that way. That said, I think it's got to be about time to just make a
decision and run with it. If this much discussion went into every
design decision, we'd still be sharpening our chisels and working on
carving the wheel :-)

Thanks,
Frank




Jim Marino <[EMAIL PROTECTED]>
03/23/2006 02:53 PM
Please respond to
tuscany-dev


To
[email protected]
cc

Subject
Re: Framework for StAX-based model loading






There has been a lot of discussion on this topic and Jeremy's point
brings up an issue I think needs to be fleshed out. Specifically,
what are the requirements and priorities for loading configuration.
Could we perhaps take the following approach?

1. Agree on the requirements and their priorities without getting
into a technical discussion. I would suggest we rank requirements by
absolute priority, i.e. the most important first, the next important,
etc. rather than "requirements A and B are p1, requirements X and
Y p2"

2. Based on the requirements and priorities, compare the StAX and SDO
approaches for each

3. Agree on one approach moving forward for configuration

If this is acceptable, my opinion on requirements in priority order
is:

1. The configuration mechanism must be easy for end-users to use to
promote widespread adoption of Tuscany

     - For example, basic types defined by the spec should be a
given, but it should also be easy for someone to add a custom type.
For instance, my Foo component may take a Bar type as configuration.
Based on past experience with IoC containers, I have found this to be
a very common situation.

     - I assume this would have to involve describing the type and
registering some kind of custom handler with the runtime

2. The configuration mechanism must be easy for container extenders
to promote widespread adoption of Tuscany in the developer community

     - Similar to point 1, although I think the requirements on
ease-of-use may be slightly different.
     - One additional item here is the configuration mechanism should
follow Java idioms as closely as possible. Manipulating the model
should not be foreign to Java developers
     - As a side note, I think items 1 and 2 are intimately related,
but 1 is slightly more important since Tuscany developers will have a
higher pain threshold than end-users

3. Operation in a variety of deployment environments. For example,
how does each approach handle different classloader hierarchy
scenarios?

4. Ability to handle serializations other than XML. This was one of
the reasons why we went to a separate logical model. It's also not
just related to testing although that is one use case. For example,
configuration may be pulled from sources other than XML such as a
registry.

5. Maintenance

     - There are probably two considerations here. First, what we use
should be easily understood and used by Java developers wanting to
contribute to Tuscany. A second consideration is as the spec XML
changes, is it easy for us to evolve the code. Here, I would say we
concentrate on the first. The second use case has a lower priority,
which I have put in item 8.

6. Versioning

     - We need a mechanism that easily supports versioning. In the
future, we will need to support multiple configuration format
versions

7. Performance

     - We need something that will be performant. On at least two
separate occasions, I have seen IoC container start-up brought to its
knees handling configuration processing.  This may not seem like a
big deal but when there are 1,000s (or even a couple hundred) of
components, it rears its head.

8. Ease on "us", the committers (the second maintenance
consideration)

     - This is where I would say how easy it is to accommodate spec
changes comes in. Either approach can handle changes so the question
becomes which alternative offers a better solution for committers.


Perhaps we could come up with a set of objective criteria to judge by
and then move to a technical discussion of each approach?

Jim

On Mar 23, 2006, at 11:02 AM, Jeremy Boynes wrote:

I think we need to be careful to distinguish the needs we have for
loading our configurations from the needs users have of SDO in
general. I think the SCA schemas have things in them that are
atypical: lots of extensibility, many namespaces, custom data
types, few attributes/properties and so forth. On the other hand,
our use case doesn't need things like change tracking or streaming
that SDO provides.

We need a good SDO implementation, we need a loading mechanism that
can handle our configurations; the two don't have to be the same.
If they are, that is good; if they aren't, that's not bad.

--
Jeremy

Jean-Sebastien Delfino wrote:

Raymond Feng wrote:

Hi, Frank.

I think I fully agree with you. An efficient databinding is what
we're looking for.

Ideally, if SDO later on supports lazy-loading (create the
DataObject skeleton first and pull in properties as they're
accessed) from XMLStreamReader, I assume we'll take advantage of
the benefits advocated by both camps (Databinding vs. StAX).

Raymond

----- Original Message ----- From: "Frank Budinsky"
<[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Thursday, March 23, 2006 9:37 AM
Subject: Re: Framework for StAX-based model loading

I stand by my statement that the EMF problem is short term pain for
long term gain :-) I think that in the long term using the SDO
generator will be the best and easiest way to do this. Yes I am
biased, but I've seen it before - avoiding reuse/dependencies works
nicely at first, but as things grow/change and get more complicated,
the amount of reworking/reinventing becomes quite a nightmare. The
opposite problem, which I think we're suffering from here, is that the
reusable component that we are trying to leverage isn't as nice and
clean and a perfect fit as we'd like, so it really looks undesirable.
Since we have control of all the pieces, in this case, I think we have
a great opportunity to make it a clean fit. And like I said in my
reply to Jeremy, earlier, I really strongly feel that the problems
that we're identifying here are not unique to SCA, so fixing them is
really in our best interest.

Frank.

"ant elder" <[EMAIL PROTECTED]> wrote on 03/23/2006
10:13:24 AM:

On 3/23/06, Guillaume Nodet <[EMAIL PROTECTED]> wrote:

<snip/>

 As the binding itself uses JAXB2 (though it may change in the
future), I have to include all eclipse dependencies and SDO stuff,
just to load the system configuration files :(



From the discussion I'm starting to be persuaded by some of the
arguments for the SDO approach, but this EMF dependency seems a
drawback. If we're going to support alternate data bindings for the WS
binding it's not great to still be dragging in EMF to run the thing.
And I'd guess it would be much easier to sell SDO to say the Axis2 guys
to use instead of XmlBeans if there was a pure Java SDO impl. Any Axis2
guys listening who'd comment on this?


As another comparison look at Axis2, they have their own very simple
Axis Data Binding (ADB) which supports simple XSDs, and they use
XmlBeans for all the complicated stuff. They don't use XmlBeans all
the time because lots of things don't need the complexity a full blown
data binding brings. And as Guillaume points out, the SCA binding
schema are usually pretty simple.

   ...ant

Raymond,
That's a very good point, I agree.
I think that this whole discussion thread is very useful as it
helps us identify requirements and areas of improvement for our
SDO databinding and codegen story. For example, Guillaume
mentioned that it would be great to have a Maven 1 SDO codegen
plugin, as ServiceMix is still built with Maven 1 at the moment
(and I guess a number of other projects out there still use Maven
1 as well). I can spend some time in the next few days and work
with anybody who would like to volunteer and try to wrap the code
generator in a Maven 1 plugin, if it helps. Guillaume, are you
using Ant at all? or just Maven 1?
