Jerome Louvel wrote:
Hi Marc,
thx for this. I do think there is more that needs to happen; I'm
just not sure yet about the whole approach.
* I need to have a second look and provide a test case, but IMHO the
clear remaining short-term 'issue' to solve in the current transformer
is the getTransformer() method, which currently keeps a lazily
initialized Transformer object for reuse.
Since Transformer is not thread-safe, this is not wise. I think this
should be changed to keep a Templates object, from which we create
distinct and separate Transformer instances (each for one-time use).
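
A minimal sketch of that Templates idea, using plain JAXP only. The class name CompiledStylesheet is a hypothetical helper for illustration, not part of Restlet:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper (not Restlet API): compiles a stylesheet once and
// hands out a fresh Transformer for every transformation.
final class CompiledStylesheet {
    // JAXP requires Templates instances to be safe for concurrent use.
    private final Templates templates;

    CompiledStylesheet(String xslt) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        this.templates = factory.newTemplates(new StreamSource(new StringReader(xslt)));
    }

    // Each call creates its own Transformer, so no Transformer is ever
    // shared between threads or reused across inputs.
    String transform(String xml) throws TransformerException {
        Transformer transformer = templates.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }
}
```

The compiled Templates object is the part worth caching; the Transformer created from it is cheap and is thrown away after one use.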
The TransformRepresentation isn't intended to be reused. It is a wrapper
representation that gets its content from the application of a
stylesheet representation to a source representation (+ other
representations retrieved from the context).
Instances of this class are typically set as entity of a Request or a
Response. Therefore, there is no real need for thread-safety as a
Request/Response is guaranteed to be executed by a single thread.
Instead of making TransformRepresentation instances reusable through the
changes that you suggest (not reusing the JAXP Transformer), you should
just create a new TransformRepresentation each time you want to process
your stylesheet.
Ah, OK. Well, that wasn't clear from the code/javadoc in the
TransformRepresentation class:
getTransformer() is public, unsynchronized, and has javadoc saying the
nested transformer is to be reused...
For this purpose, there is another class, org.restlet.Transformer that
acts as a filter and can automatically do this work based on a
stylesheet representation and some additional metadata.
However, I feel we need more than just fixes; even the term
'refactoring' seems to do injustice to how I think this should be
approached.
I haven't thought this through completely, but my feeling is that
1/ xml transformations are bigger than only xslt; a more elaborate way
of transforming "Representations of mime-type text/xml" is needed IMHO.
Agreed, I had this hope in mind when initially designing those classes,
hence the generic class names. The ultimate goal is, by detecting the
media type of the stylesheet for example, to automatically use the
corresponding transformation engine (XSLT, FreeMarker, etc.)
Hm, I think templating should even be able to produce XML that is then
fed into pipelines.
Inside these pipelines, custom xslt transformers/filters should be active
(xslt is only one of the things you'd like to do).
It should cater for pipelines, multi-output, tracing and debugging,
logging, xml-api mismatch handling to cover sax/dom/stax/, etc
- for pipelines, don't you think that the Filter/Router mechanism of
Restlet is sufficient?
Well, there is a bit of a mismatch (as well as some overlap): filters
work on request-response objects, while pipelines work on (xml)
representations.
As such maybe the passed filters would help
decide/build/configure/parameterize the different active parts in the
pipeline, but the pipelining/content handling itself seems distinct, no?
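
To illustrate what pipelining on the content level could look like, here is a rough sketch that chains stylesheets as SAX stages through the JAXP SAXTransformerFactory. The class SaxPipeline and its run method are hypothetical names, not Restlet API, and the sketch assumes the JAXP implementation in use supports the SAX feature (which the code checks):

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical sketch: each stylesheet becomes one SAX stage in a chain.
final class SaxPipeline {
    static String run(String xml, String... stylesheets) throws Exception {
        TransformerFactory base = TransformerFactory.newInstance();
        if (!base.getFeature(SAXTransformerFactory.FEATURE)) {
            throw new IllegalStateException("JAXP implementation lacks SAX support");
        }
        SAXTransformerFactory factory = (SAXTransformerFactory) base;

        // Last stage: an identity handler that serializes SAX events to text.
        StringWriter out = new StringWriter();
        TransformerHandler serializer = factory.newTransformerHandler();
        serializer.setResult(new StreamResult(out));

        // Build the chain back to front, so every stage feeds the next one.
        ContentHandler next = serializer;
        for (int i = stylesheets.length - 1; i >= 0; i--) {
            Templates templates =
                    factory.newTemplates(new StreamSource(new StringReader(stylesheets[i])));
            TransformerHandler stage = factory.newTransformerHandler(templates);
            stage.setResult(new SAXResult(next));
            next = stage;
        }

        // The parser drives the first stage with SAX events.
        SAXParserFactory parsers = SAXParserFactory.newInstance();
        parsers.setNamespaceAware(true);
        XMLReader reader = parsers.newSAXParser().getXMLReader();
        reader.setContentHandler(next);
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }
}
```

Note that nothing in the chain knows about requests or responses; a filter would only decide which stylesheets and parameters go into it.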
- for the multi-output need, I'm not sure if/how it could fit with the
current design indeed. Unless there is a way to *pull* each output
representation from the XSLT engine, it is going to be tricky.
Haven't thought this through, and I was mainly thinking about something
the pipeline architecture by itself would do as side effects (e.g. save
a copy or summary of whatever it produced to some other server)
But now that you make me think about the restlet context, I do think a
multi-part kind of response would make sense, no?
Maybe there should be room for a BagRepresentation which is essentially
a Map<String, Representation> ?
There are surely use cases for this:
- on-the-fly production of zip files, where the different embedded
files themselves are dynamic (doesn't the ODF format fit this bill?)
- production of 'self-contained, saveable' html with embedded images
through data: URI serialization
- ...
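
A rough sketch of the bag idea applied to the zip use case. Everything here is hypothetical: Part is a stand-in for a Restlet Representation (anything that can write itself to a stream), and BagRepresentation does not exist in Restlet today:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical stand-in for a Representation: writes its content on demand.
interface Part {
    void write(OutputStream out) throws IOException;
}

// Hypothetical BagRepresentation: named parts, serialized here as a zip archive.
final class BagRepresentation {
    // LinkedHashMap keeps the parts in insertion order.
    private final Map<String, Part> parts = new LinkedHashMap<>();

    BagRepresentation put(String name, Part part) {
        parts.put(name, part);
        return this;
    }

    // Each part is pulled only while the archive is being written, so the
    // embedded files can be generated dynamically. Closes the given stream.
    void writeZip(OutputStream out) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Map.Entry<String, Part> entry : parts.entrySet()) {
                zip.putNextEntry(new ZipEntry(entry.getKey()));
                entry.getValue().write(zip);
                zip.closeEntry();
            }
        }
    }
}
```

Because the parts are written lazily, this stays compatible with the late-transformation style where the connector decides when the content is actually produced.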
2/ I'm quite uncomfortable with the concept of the current
TransformRepresentation.
* It having a pointer to the xml input source feels very unnatural: an
xslt sheet is to be seen as a generic xml transformation program that
can be executed on various xml inputs.
The TransformRepresentation isn't the representation of the stylesheet,
it's a representation of the output of a transformation combining a
stylesheet representation (passed as parameter) and a source
representation.
I did get that, but it's rather the fact that the level where the xslt
is compiled into the Templates or Transformer object _IS_ the same level
that is tied to the input source.
I'm quite convinced any pipelining system will need to offer active
stateful (used-once) components like the transformer indeed, but there
should surely be a stateless configurable level of transformer-factories.
* In essence we talk about things that transmogrify (Calvin and Hobbes!)
representations. Sure, they themselves could be referenced and fetched
('get') from a resource (as a representation), but from there I think an
active component is built that by itself is neither a representation nor
a resource, no?
Indeed a JAXP Transformer is used internally, but keep in mind that the
Request/Response entities must be returned to the connector who will
control the best way to use its content. So, we need to use some late
transformation mechanism, some dynamic representations if you prefer...
That's why the multi-output scenario is not obvious here.
Maybe we should just have a mechanism to allow hooking-up stuff on the
level of the request/response to be cleaned up after processing is
completed?
3/ from my experience with Cocoon (see a related hindsight design
evaluation here:
http://marc.info/?l=xml-cocoon-dev&m=113385491027675&w=2) I'm also
convinced there must be a clear separation between the
xml-transformation component and the restlet API.
My suggestion would be to
- either ambitiously define and build a (or pragmatically find an
existing) fairly generic and decent XML-manipulation system that can
work in streaming mode (sax and/or stax) and incorporate xslt (xslt2
maybe)
- and most importantly: one that lives in its own right *and* has clear
hooks for "execution contexts", allowing pluggable 'parameters' and
uri-resolving
- only then fit that beast into restlet:
  - by providing a tied-in transform-context implementation
  - by defining uri-syntaxes that translate into building up
    transformation machinery
I agree and think that the JAXP API should control this system. Maybe a
similar/better API is needed for some use cases.
I'm afraid JAXP is not a conclusive term in this ;-(
Inside JAXP there are SAX, TrAX sources and DOM, and while there are
(somewhat) workable transitions between all of them, they are distinct
and only interoperate at the cost of handling the impedance mismatch.
Plus, next to that, there is at least StAX to consider as well, IMHO.
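
To make the mismatch concrete, here is a small sketch of one such transition: replaying a DOM tree as SAX events by pushing it through an identity Transformer. The class DomToSax and its ElementCounter handler are hypothetical names used only for illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: bridging from the DOM world to the SAX world.
final class DomToSax {
    // A trivial SAX consumer that counts the elements it is told about.
    static final class ElementCounter extends DefaultHandler {
        int count;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            count++;
        }
    }

    static int countElements(String xml) throws Exception {
        // Step 1: build a DOM tree (the whole document is held in memory).
        DocumentBuilderFactory builders = DocumentBuilderFactory.newInstance();
        builders.setNamespaceAware(true);
        Document dom = builders.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Step 2: an identity Transformer walks the tree and emits SAX events.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        ElementCounter counter = new ElementCounter();
        identity.transform(new DOMSource(dom), new SAXResult(counter));
        return counter.count;
    }
}
```

It works, but the bridge costs a full tree walk and an extra Transformer, which is the kind of overhead a pipeline would pay at every boundary between the models.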
regards,
-marc=