Re: Data flow on a wire

2006-03-22 Thread Jeremy Boynes

Jim Marino wrote:
A couple of us have started to discuss this as well in relation to 
Celtix... My main concerns, on which there appears to be agreement, are:


1. We are not instituting a canonical form model similar to JBI in the 
runtime. I think Jeremy stated this is not the case


Having trouble parsing :-)
I do not think we should have a canonical form model similar to JBI.
I do think we should identify a set of common forms to reduce the number 
of format conversions we need to implement.


2. Local invokes - i.e. where semantics are pass-by-reference - honor 
that and do not have *parameters* mediated. This appears non-controversial.


Let's be controversial then :-)
The contract is pass-by-reference not pass-pointer-by-value - if the 
runtime chooses to mediate the pointer format then it is free to do so.


This may enable us to support local calls across language - e.g. 
converting a Java object reference to a pointer in C++. It is the 
runtime's responsibility to make sure the memory model is not violated.


3. Mediation is done using handlers and interceptors. I think we are 
also in agreement here.




+1

I still have questions around how the container and bindings declare 
support for certain data binding formats. We need to come up with a 
design here. For example, would the following be a valid approach in 
your opinion (I haven't thought too much about it so it is kind of vague)?


1. Implementation types declare whether they support pass-by-ref. Since 
this is a Java runtime, pass-by-ref means object references in the VM.


I'd modify that slightly to say the service contract determines if 
parameters are passed by reference. Any implementation offering that 
service contract must be able to support it.


The type of runtime must not impact the service contract and the runtime 
must fail an attempt to deploy a contract that cannot be fulfilled.


The type of message used by the runtime is currently unspecified - in 
our Java runtime we are defining the content format to be reference to 
any Java Object.


2. Bindings register themselves and can be queried for what formats they 
support. They may delegate to some data binding service.


I think we need to distinguish between transport binding and data 
binding. Transports should delegate all serialization to data bindings.
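
A minimal sketch of that split, assuming hypothetical names (`DataBinding`, `Utf8StringBinding`, `LoopbackTransport`) that are not part of any Tuscany API. The transport only moves bytes; everything format-specific lives in the data binding it is handed.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical interface for illustration; not part of any Tuscany API.
interface DataBinding {
    byte[] serialize(Object value);
    Object deserialize(byte[] bytes);
}

// A trivial data binding that only understands strings, encoded as UTF-8.
final class Utf8StringBinding implements DataBinding {
    @Override
    public byte[] serialize(Object value) {
        return ((String) value).getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public Object deserialize(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}

// The transport knows nothing about the data format:
// all serialization is delegated to the data binding it was given.
final class LoopbackTransport {
    private final DataBinding dataBinding;

    LoopbackTransport(DataBinding dataBinding) {
        this.dataBinding = dataBinding;
    }

    // Simulates sending a value and receiving it on the other side.
    Object sendAndReceive(Object value) {
        byte[] onTheWire = dataBinding.serialize(value);
        return dataBinding.deserialize(onTheWire);
    }
}
```

Swapping in a different `DataBinding` changes the wire format without touching the transport, which is the point of keeping the two concepts separate.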


3. Tuscany may want to declare some common formats such as Java--SDO, 
SDO--, StAX stream--Java, Java--StAX, SDO--StAX, JAXB--Java etc. 
and have interceptors or handlers perform those transformations. These 
handlers could be registered with a wire builder and inserted into an 
invocation chain.


Declare to me implies something special. I think these are transforms 
that we include in our baseline profile - they are no different to 
ones provided by users.


Yes, they register their availability and the wire builder selects them 
as appropriate when constructing the wire.


4. Implementation types and bindings declare what pass-by-value formats 
they support in order of preference.


This may be what you meant but I think this depends on the wiring 
requirements. The implementation type shouldn't specify a general 
preference; instead it should say which ones are supported for a 
particular parameter and provide a relative cost for each. The wire 
builder calculates cost for source and target and selects the most 
efficient wire.
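
A minimal sketch of the per-parameter cost idea, assuming hypothetical names (`FormatSelector`, `select`) and made-up cost numbers; none of this is existing Tuscany API. Each end lists the formats it supports for one parameter with a relative cost, and the wire builder picks the shared format with the lowest combined cost.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: not Tuscany API. Format names and costs are made up.
final class FormatSelector {

    // Returns the format supported by both ends with the lowest combined
    // cost, or empty if the two ends share no format (in which case a
    // transform would have to be inserted into the wire).
    static Optional<String> select(Map<String, Integer> sourceCosts,
                                   Map<String, Integer> targetCosts) {
        String best = null;
        int bestCost = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> e : sourceCosts.entrySet()) {
            Integer targetCost = targetCosts.get(e.getKey());
            if (targetCost == null) {
                continue; // the target cannot accept this format
            }
            int total = e.getValue() + targetCost;
            if (total < bestCost) {
                bestCost = total;
                best = e.getKey();
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        Map<String, Integer> source = new LinkedHashMap<>();
        source.put("StAX", 1);
        source.put("SDO", 5);
        Map<String, Integer> target = new LinkedHashMap<>();
        target.put("SDO", 1);
        target.put("StAX", 3);
        // StAX costs 1 + 3 = 4, SDO costs 5 + 1 = 6
        System.out.println(select(source, target).orElse("no common format")); // prints StAX
    }
}
```

Neither end's own first choice decides the outcome here: each side only reports costs, and the selection is made where both lists are visible.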




Dan, I know you have a bunch of thoughts on this, so it would be 
interesting to discuss them in this thread.




Please, this is key so the more perspectives the better.
--
Jeremy



Re: Data flow on a wire

2006-03-22 Thread Jim Marino


On Mar 22, 2006, at 9:32 AM, Jeremy Boynes wrote:


Jim Marino wrote:

A couple of us have started to discuss this as well in relation to  
Celtix... My main concerns, on which there appears to be agreement, are:
1. We are not instituting a canonical form model similar to JBI  
in the runtime. I think Jeremy stated this is not the case




Having trouble parsing :-)
I do not think we should have a canonical form model similar to JBI.
I do think we should identify a set of common forms to reduce the  
number of format conversions we need to implement.


Yes. At first I read this as I *do* think.. which scared me :-) I  
need to check my vision again.  We're in agreement here.


2. Local invokes - i.e. where semantics are pass-by-reference -  
honor that and do not have *parameters* mediated. This appears non- 
controversial.




Let's be controversial then :-)
The contract is pass-by-reference not pass-pointer-by-value -  
if the runtime chooses to mediate the pointer format then it is  
free to do so.




Maybe I should have been clearer - by not mediate the parameters I  
mean that the runtime cannot violate pass by reference. For Java--Java  
in particular, which I would guess will be at least 90% of local calls  
(which we should optimize for), the strategy for doing this should be  
passing references directly. For other situations I don't have an  
opinion other than it should be done in a handler or interceptor or  
extension and not in the core runtime.


This may enable us to support local calls across language - e.g.  
converting a Java object reference to a pointer in C++. It is the  
runtime's responsibility to make sure the memory model is not  
violated.


This to me is a nice to have sometime in the future but not  
something we should optimize for right now.


3. Mediation is done using handlers and interceptors. I think we  
are also in agreement here.




+1


I still have questions around how the container and bindings  
declare support for certain data binding formats. We need to come  
up with a design here. For example, would the following be a valid  
approach in your opinion (I haven't thought too much about it so  
it is kind of vague)?
1. Implementation types declare whether they support pass-by-ref.  
Since this is a Java runtime, pass-by-ref means object references  
in the VM.




I'd modify that slightly to say the service contract determines if  
parameters are passed by reference. Any implementation offering  
that service contract must be able to support it.



I think we need to follow the spec here.

The type of runtime must not impact the service contract and the  
runtime must fail an attempt to deploy a contract that cannot be  
fulfilled.


The type of message used by the runtime is currently unspecified -  
in our Java runtime we are defining the content format to be  
reference to any Java Object.


Yes. What would you propose here? Also, could you provide a  
description of what happens when an invoke is done across two local  
Java services?


2. Bindings register themselves and can be queried for what formats  
they support. They may delegate to some data binding service.




I think we need to distinguish between transport binding and data  
binding. Transports should delegate all serialization to data  
bindings.

Yes, I forgot to preface the first bindings with transport binding.



3. Tuscany may want to declare some common formats such as Java--SDO,  
SDO--, StAX stream--Java, Java--StAX, SDO--StAX, JAXB--Java etc.  
and have interceptors or handlers perform those transformations.  
These handlers could be registered with a wire builder and inserted  
into an invocation chain.




Declare to me implies something special. I think these are  
transforms that we include in our baseline profile - they are no  
different to ones provided by users.


Yes, they register their availability and the wire builder selects  
them as appropriate when constructing the wire.
Declare = register, nothing more. They are just extensions included  
in the baseline. We do need some way of naming them though.



4. Implementation types and bindings declare what pass-by-value  
formats they support in order of preference.




This may be what you meant but I think this depends on the wiring  
requirements. The implementation type shouldn't specify a general  
preference; instead it should say which ones are supported for a  
particular parameter and provide a relative cost for each. The wire  
builder calculates cost for source and target and selects the most  
efficient wire.


Relative cost is a preference isn't it? The implementation type  
bases this preference on a selfish calculation since it does not know  
what the source type is. What's the difference?


Dan, I know you have a bunch of thoughts on this, so it would be  
interesting to discuss them in this thread.




Please, this is key so the more perspectives the better.
--
Jeremy






Re: Data flow on a wire

2006-03-22 Thread Raymond Feng



Re-posted since the previous one was missing the diagram.

Hi,

I think I have an interesting picture for this topic.

1) The data transformation capabilities for various databindings can be
nicely modeled as a weighted, directed graph with the following rules
(illustrated in the attached diagram).

a. Each databinding is mapped to a vertex.
b. If databinding A can be transformed to databinding B, then an edge is
added from vertex A to vertex B.
c. The weight of the edge is the cost of the transformation from the
source to the sink.

2) In the data mediator/interceptor on the wire, if we find that the data
needs to be transformed from databinding A to databinding E, then we can
apply Dijkstra's shortest path algorithm to the graph and find the most
performant path. It can be A--E, or A--C--E, depending on the weights.
If no path can be found, then the data cannot be mediated.

Any thoughts?

Thanks,
Raymond
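
The graph model described in this message could be sketched as follows. This is only an illustration: the class name `TransformGraph`, its methods, and the example weights are assumptions, and it assumes non-negative costs, which Dijkstra's algorithm requires.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch of the databinding graph; names are illustrative only.
final class TransformGraph {
    // Adjacency map: source databinding -> (target databinding -> cost).
    private final Map<String, Map<String, Integer>> edges = new HashMap<>();

    // Adds a directed edge; costs are assumed to be non-negative.
    void addTransform(String from, String to, int cost) {
        edges.computeIfAbsent(from, k -> new HashMap<>()).put(to, cost);
        edges.computeIfAbsent(to, k -> new HashMap<>());
    }

    // Dijkstra's algorithm. Returns the cheapest chain of databindings from
    // source to sink, or an empty list if the data cannot be mediated.
    List<String> cheapestPath(String source, String sink) {
        Map<String, Integer> dist = new HashMap<>();
        Map<String, String> prev = new HashMap<>();
        Comparator<Map.Entry<String, Integer>> byCost = Map.Entry.comparingByValue();
        PriorityQueue<Map.Entry<String, Integer>> queue = new PriorityQueue<>(byCost);
        dist.put(source, 0);
        queue.add(new AbstractMap.SimpleEntry<>(source, 0));
        while (!queue.isEmpty()) {
            Map.Entry<String, Integer> head = queue.poll();
            String node = head.getKey();
            if (head.getValue() > dist.get(node)) {
                continue; // stale entry: a cheaper route was already found
            }
            if (node.equals(sink)) {
                break;
            }
            Map<String, Integer> out = edges.getOrDefault(node, Collections.emptyMap());
            for (Map.Entry<String, Integer> edge : out.entrySet()) {
                int candidate = head.getValue() + edge.getValue();
                if (candidate < dist.getOrDefault(edge.getKey(), Integer.MAX_VALUE)) {
                    dist.put(edge.getKey(), candidate);
                    prev.put(edge.getKey(), node);
                    queue.add(new AbstractMap.SimpleEntry<>(edge.getKey(), candidate));
                }
            }
        }
        if (!dist.containsKey(sink)) {
            return Collections.emptyList();
        }
        List<String> path = new ArrayList<>();
        for (String at = sink; at != null; at = prev.get(at)) {
            path.add(at);
        }
        Collections.reverse(path);
        return path;
    }

    public static void main(String[] args) {
        TransformGraph graph = new TransformGraph();
        graph.addTransform("A", "E", 10); // direct but expensive
        graph.addTransform("A", "C", 2);
        graph.addTransform("C", "E", 3);
        System.out.println(graph.cheapestPath("A", "E")); // prints [A, C, E]
    }
}
```

With these weights the two-step route wins (2 + 3 = 5 against 10); lowering the direct A to E weight below 5 would flip the result.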


Re: Data flow on a wire

2006-03-22 Thread Raymond Feng
Sorry, the attachment did not go through. I added it to the wiki page at 
http://wiki.apache.org/ws/Tuscany/DataMediation.


Thanks,
Raymond

- Original Message - 
From: Jeremy Boynes [EMAIL PROTECTED]

To: tuscany-dev@ws.apache.org
Sent: Wednesday, March 15, 2006 3:37 PM
Subject: Data flow on a wire



A couple of us had an offline chat about what format of data would
be exchanged on the wire during an interaction between a client and a
provider. The spur for this was the JSON binding Ant was working on
which has no obvious affinity to XML.

Another issue related to this has been about supporting streaming types
for interactions where data flows through a system rather than
terminating there. This is related to Axiom and its use for precisely
this purpose in Axis2.

I wanted to capture thoughts whilst still current and open the discussion.

As I see it there is no single answer to this, well apart from "it
depends." :-) I think it is necessary for us to support the flow of any
data type that is supported by both the client and the provider. With
the ability to attach data transformation mediations to wires, this
actually becomes a requirement to support any data type that can be
mapped from client to provider and back again.

In any interchange there are just two things that are defined: the
format of data that will be supplied by the client and the format of
data that will be consumed by (delivered to) the provider. Neither client
nor provider needs to be aware of the format of data on the other end or
of what gyrations the fabric went through in order to make the
connection. As part of making the connection, it is the fabric's job to
make the connection as efficient as possible, factoring in the semantic
meaning of the data, the policies that need to be applied, and what the
different containers support.

All this flexibility just about requires we use the most generic type
possible to hold the data being exchanged: a java.lang.Object or a
(void*) depending on the runtime. The actual instance used would depend
on the actual wire, some examples from Java land being:
* POJO (for local pass by reference)
* SDO (when supplied by the application)
* Axiom OMElement (for the Axis2 binding)
* StAX XMLStreamReader (for streamed access to an XML infoset)
* ObjectInputStream (for cross-classloader serialization)
and so forth.

Each container and transport binding just needs to declare which data
formats it can support for each endpoint it manages. The wiring
framework needs to know about these formats and about what
transformations can be engaged in the wire pipeline.

For example, the Axis2 transport may declare that it can support Axiom
and StAX for a certain port and the Java container may declare that it
can only handle SDOs for an implementation that expects to be passed a
DataObject. The wiring framework can resolve this by adding a StAX-SDO
transform into the pipeline.

The limitation here is whether a transformation can be constructed to
match the formats on either end. If one exists then great, but as the
number increases then developing n-squared transforms becomes
impractical. A better approach would be to pick the most common formats
and require bindings and containers to support those at a minimum, with
other point-to-point transforms being added as warranted.

Given the flow issue described above and the XML nature of many of our
interactions I would suggest that a StAX XMLStreamReader may be the most
appropriate common format (at least for now). It's native to Axis2 and
Raymond has posted patches already to support it in SDO.

Alternatively, we don't need all of StAX for this to work so it may be
simpler to provide a basic API that is essentially the same as an
XMLStreamReader but without all the other stuff.

Thanks for reading this far. The idea was to capture thinking and all
input is welcome.
--
Jeremy 




Re: Data flow on a wire

2006-03-22 Thread Jim Marino


On Mar 22, 2006, at 10:10 AM, Jim Marino wrote:




On Mar 22, 2006, at 9:32 AM, Jeremy Boynes wrote:




Jim Marino wrote:



A couple of us have started to discuss this as well in relation  
to Celtix... My main concerns, on which there appears to be  
agreement, are:
1. We are not instituting a canonical form model similar to JBI  
in the runtime. I think Jeremy stated this is not the case






Having trouble parsing :-)
I do not think we should have a canonical form model similar to  
JBI.
I do think we should identify a set of common forms to reduce the  
number of format conversions we need to implement.




Yes. At first I read this as I *do* think.. which scared me :-) I  
need to check my vision again.  We're in agreement here.







2. Local invokes - i.e. where semantics are pass-by-reference -  
honor that and do not have *parameters* mediated. This appears  
non-controversial.






Let's be controversial then :-)
The contract is pass-by-reference not pass-pointer-by-value -  
if the runtime chooses to mediate the pointer format then it is  
free to do so.






Maybe I should have been clearer - by not mediate the parameters  
I mean that the runtime cannot violate pass by reference. For  
Java--Java in particular, which I would guess will be at least  
90% of local calls (which we should optimize for), the strategy for  
doing this should be passing references directly. For other  
situations I don't have an opinion other than it should be done in  
a handler or interceptor or extension and not in the core runtime.




This may enable us to support local calls across language - e.g.  
converting a Java object reference to a pointer in C++. It is the  
runtime's responsibility to make sure the memory model is not  
violated.




This to me is a nice to have sometime in the future but not  
something we should optimize for right now.







3. Mediation is done using handlers and interceptors. I think we  
are also in agreement here.






+1




I still have questions around how the container and bindings  
declare support for certain data binding formats. We need to come  
up with a design here. For example, would the following be a  
valid approach in your opinion (I haven't thought too much about  
it so it is kind of vague)?
1. Implementation types declare whether they support pass-by-ref.  
Since this is a Java runtime, pass-by-ref means object references  
in the VM.






I'd modify that slightly to say the service contract determines if  
parameters are passed by reference. Any implementation offering  
that service contract must be able to support it.





I think we need to follow the spec here.



The type of runtime must not impact the service contract and the  
runtime must fail an attempt to deploy a contract that cannot be  
fulfilled.


The type of message used by the runtime is currently unspecified -  
in our Java runtime we are defining the content format to be  
reference to any Java Object.




Yes. What would you propose here? Also, could you provide a  
description of what happens when an invoke is done across two local  
Java services?







2. Bindings register themselves and can be queried for what formats  
they support. They may delegate to some data binding service.






I think we need to distinguish between transport binding and data  
binding. Transports should delegate all serialization to data  
bindings.




Yes, I forgot to preface the first bindings with transport binding.







3. Tuscany may want to declare some common formats such as  
Java--SDO, SDO--, StAX stream--Java, Java--StAX, SDO--StAX,  
JAXB--Java etc. and have interceptors or handlers perform those  
transformations. These handlers could be registered with a wire  
builder and inserted into an invocation chain.






Declare to me implies something special. I think these are  
transforms that we include in our baseline profile - they are no  
different to ones provided by users.


Yes, they register their availability and the wire builder selects  
them as appropriate when constructing the wire.



Declare = register, nothing more. They are just extensions included  
in the baseline. We do need some way of naming them though.








4. Implementation types and bindings declare what pass-by-value  
formats they support in order of preference.






This may be what you meant but I think this depends on the wiring  
requirements. The implementation type shouldn't specify a general  
preference; instead it should say which ones are supported for a  
particular parameter and provide a relative cost for each. The  
wire builder calculates cost for source and target and selects the  
most efficient wire.




Relative cost is a preference isn't it? The implementation type  
bases this preference on a selfish calculation since it does not  
know what the source type is. What's the difference?






Jeez, I'm having trouble reading today - I also didn't see per  
parameter vs. general.  Sorry and 

Re: Data flow on a wire

2006-03-15 Thread Jim Marino


On Mar 15, 2006, at 3:37 PM, Jeremy Boynes wrote:


A couple of us had an offline chat about what format of data would
be exchanged on the wire during an interaction between a client and a
provider. The spur for this was the JSON binding Ant was working on
which has no obvious affinity to XML.

Another issue related to this has been about supporting streaming types
for interactions where data flows through a system rather than
terminating there. This is related to Axiom and its use for precisely
this purpose in Axis2.

I wanted to capture thoughts whilst still current and open the discussion.

As I see it there is no single answer to this, well apart from "it
depends." :-) I think it is necessary for us to support the flow of any
data type that is supported by both the client and the provider. With
the ability to attach data transformation mediations to wires, this
actually becomes a requirement to support any data type that can be
mapped from client to provider and back again.

In any interchange there are just two things that are defined: the
format of data that will be supplied by the client and the format of
data that will be consumed by (delivered to) the provider. Neither client
nor provider needs to be aware of the format of data on the other end or
of what gyrations the fabric went through in order to make the
connection. As part of making the connection, it is the fabric's job to
make the connection as efficient as possible, factoring in the semantic
meaning of the data, the policies that need to be applied, and what the
different containers support.

All this flexibility just about requires we use the most generic type
possible to hold the data being exchanged: a java.lang.Object or a
(void*) depending on the runtime. The actual instance used would depend
on the actual wire, some examples from Java land being:
* POJO (for local pass by reference)
* SDO (when supplied by the application)
* Axiom OMElement (for the Axis2 binding)
* StAX XMLStreamReader (for streamed access to an XML infoset)
* ObjectInputStream (for cross-classloader serialization)
and so forth.

Each container and transport binding just needs to declare which data
formats it can support for each endpoint it manages. The wiring
framework needs to know about these formats and about what
transformations can be engaged in the wire pipeline.

For example, the Axis2 transport may declare that it can support Axiom
and StAX for a certain port and the Java container may declare that it
can only handle SDOs for an implementation that expects to be passed a
DataObject. The wiring framework can resolve this by adding a StAX-SDO
transform into the pipeline.
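
A minimal sketch of that resolution step, assuming a hypothetical `WirePipelineSketch` class; the names and the string-keyed registry are illustrative and not Tuscany API. It first looks for a format both ends share, and only then for a single registered transform that bridges them.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch; names are illustrative and not Tuscany API.
final class WirePipelineSketch {
    // Registered transforms, keyed by "sourceFormat->targetFormat".
    private final Map<String, Function<Object, Object>> transforms = new HashMap<>();

    void register(String from, String to, Function<Object, Object> transform) {
        transforms.put(from + "->" + to, transform);
    }

    // Describes how to connect the two ends: the shared format name if one
    // exists, otherwise the key of a single registered transform, or empty
    // if neither is available.
    Optional<String> resolve(List<String> sourceFormats, List<String> targetFormats) {
        for (String format : sourceFormats) {
            if (targetFormats.contains(format)) {
                return Optional.of(format); // no transform needed
            }
        }
        for (String from : sourceFormats) {
            for (String to : targetFormats) {
                String key = from + "->" + to;
                if (transforms.containsKey(key)) {
                    return Optional.of(key); // insert this transform into the pipeline
                }
            }
        }
        return Optional.empty();
    }
}
```

With a transform registered for "StAX" to "SDO", resolving a source that offers Axiom and StAX against a target that accepts only SDO yields the key "StAX->SDO". Chains of more than one transform are deliberately left out of this sketch.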

The limitation here is whether a transformation can be constructed to
match the formats on either end. If one exists then great, but as the
number increases then developing n-squared transforms becomes
impractical. A better approach would be to pick the most common formats
and require bindings and containers to support those at a minimum, with
other point-to-point transforms being added as warranted.
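
The n-squared concern can be made concrete with a quick count. This is a sketch that assumes every ordered pair of formats would otherwise need its own one-way transform, and that a single common format acts as a hub; n is the number of formats, and the figure of five is taken from the example formats listed above (POJO, SDO, OMElement, XMLStreamReader, ObjectInputStream).

```latex
\[
T_{\text{point-to-point}} = n(n-1), \qquad T_{\text{common format}} = 2(n-1)
\]
\[
n = 5:\quad 5 \cdot 4 = 20 \ \text{transforms versus}\ 2 \cdot 4 = 8
\]
```

In practice not every pair is meaningful, so the first number is an upper bound, but the gap widens quickly as formats are added.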

This seems kind of like JBI. A question here is whether a normalized  
form is really practical and whether it is the easiest thing to do.  
Also, is mediation even the concern of the runtime? Should the  
runtime just make it possible to do mediation and delegate to a  
mediator interceptor/handler or create an implementation type that  
is a mediation component? Also, what about local invoke? I assume a  
container would have to declare support of primitives and Object? I  
think it may just be easier to settle on Object as the common form.



Given the flow issue described above and the XML nature of many of our
interactions I would suggest that a StAX XMLStreamReader may be the most
appropriate common format (at least for now). It's native to Axis2 and
Raymond has posted patches already to support it in SDO.

Again, what about local invocations or things that just require  
simple serialization over a socket?

Alternatively, we don't need all of StAX for this to work so it may be
simpler to provide a basic API that is essentially the same as an
XMLStreamReader but without all the other stuff.
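
For readers unfamiliar with the pull style being proposed, here is a small sketch using the standard `javax.xml.stream` API. The class name `StaxPullSketch` and the helper `elementNames` are made up for illustration; the point is only that the consumer advances a forward-only cursor and never needs the whole document in memory.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; only the javax.xml.stream calls are real API.
final class StaxPullSketch {

    // Pulls events from the reader one at a time and records the local name
    // of each start element in document order.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not read XML", e);
        }
        return names;
    }
}
```

A cut-down API along the lines suggested here would presumably keep little more than `hasNext()`, `next()` and the accessors for the current event, though which methods to keep is not settled in this thread.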

Thanks for reading this far. The idea was to capture thinking and all
input is welcome.
--
Jeremy