Improve handling of input in XSLTMediator
-----------------------------------------

                 Key: SYNAPSE-213
                 URL: https://issues.apache.org/jira/browse/SYNAPSE-213
             Project: Synapse
          Issue Type: Improvement
          Components: Core
    Affects Versions: 1.1, NIGHTLY
            Reporter: Andreas Veithen
            Priority: Minor


Currently XSLTMediator uses two different strategies to feed the XML input into 
the XSLT processor:

* When useDOMSourceAndResults is set to false, the Axiom tree will be 
serialized to a byte stream (in memory or to a temporary file for large 
documents) and then fed into the XSLT processor using a StreamSource object.
* When useDOMSourceAndResults is set to true, the code will call 
ElementHelper.importOMElement to get a DOM compliant version of the Axiom tree. 
The resulting DOM tree is then passed to the XSLT processor using a DOMSource.
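
For illustration, the two input paths can be sketched with plain JAXP. This is not the XSLTMediator code: a DOM Document stands in for the Axiom tree, and the class name InputStrategies, its methods, and the sample stylesheet are hypothetical. It only shows where the intermediate copy (byte stream or DOM tree) is created in each case.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class InputStrategies {
    // Hypothetical sample stylesheet: extracts the text of /order/item.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'><xsl:value-of select='/order/item'/></xsl:template>"
        + "</xsl:stylesheet>";

    static DocumentBuilderFactory factory() {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true); // XSLT needs namespace-aware DOM trees
        return f;
    }

    // Builds the stand-in for the message tree (an Axiom tree in Synapse).
    static Document parse(String xml) throws Exception {
        return factory().newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    static String transform(Source input) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        t.transform(input, new StreamResult(out));
        return out.toString();
    }

    // useDOMSourceAndResults=false: serialize the tree to bytes, then StreamSource.
    static String viaStream(Document message) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Identity transform used here only to serialize the stand-in tree.
        TransformerFactory.newInstance().newTransformer()
            .transform(new DOMSource(message), new StreamResult(bytes));
        return transform(new StreamSource(new ByteArrayInputStream(bytes.toByteArray())));
    }

    // useDOMSourceAndResults=true: copy the tree into a new DOM, then DOMSource.
    static String viaDom(Document message) throws Exception {
        Document copy = factory().newDocumentBuilder().newDocument();
        copy.appendChild(copy.importNode(message.getDocumentElement(), true));
        return transform(new DOMSource(copy));
    }

    public static void main(String[] args) throws Exception {
        Document message = parse("<order><item>widget</item></order>");
        System.out.println(viaStream(message));
        System.out.println(viaDom(message));
    }
}
```

Both paths produce the same result; they differ only in which extra representation of the input is built before the XSLT processor sees it.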

First it should be noted that using a temporary file for the XML input (in 
contrast to the output of the transformation) doesn't eliminate the need to 
keep the entire input document in memory. Indeed:

* When the input is read, Axiom will build the entire tree and keep it in memory.
* Due to the way XSLT works, the XSLT processor also requires a complete 
in-memory representation of the input document. The only exception is for XSLT 
processors that support streaming, which is not the case for Xalan. Xalan uses 
its own object model called DTM (Document Table Model) to store the input 
document in memory.

Since the input document must be kept in memory anyway, the only question is 
how to efficiently feed the original Axiom tree into the XSLT processor without 
creating too much overhead and consuming too much memory. Assuming that Xalan 
is used, the current situation is as follows:

* When useDOMSourceAndResults is set to false, three copies of the XML input 
will be built: the Axiom tree, the serialized byte stream and Xalan's DTM 
representation. When temporary files are used for large documents, only two 
copies will coexist in memory. However, using temporary files introduces a 
large overhead.
* When useDOMSourceAndResults is set to true, at least two copies of the input 
will be built: the Axiom tree and the DOM tree. Indeed, from the code in 
ElementHelper.importOMElement it can be seen that an entirely new copy of the 
input tree will be created. In addition, Xalan will create a DTM representation 
of the DOM tree. The document at http://xml.apache.org/xalan-j/dtm.html 
suggests that this representation is not a complete copy of the DOM tree, but a 
wrapper/adapter that is backed by the original DOM tree.

Both strategies used by XSLTMediator are far from optimal. There are at least 
two strategies that should give better results (with at least one of them being 
actually simpler):

* Trick Axis2 into producing a DOM compatible tree from the outset, by using a 
StAXSOAPModelBuilder with a DOMSOAPFactory (this produces objects that 
implement both the Axiom and DOM interfaces). This however might require some 
tweaking. The advantage is that there is no need to create a copy anymore. 
Xalan will only create a DTM wrapper around the existing tree.
* Make sure that a DTM representation is created directly from the Axiom tree 
without intermediate copy (byte stream or DOM tree). With Java 6/JAXP 1.4 this 
would be very easy because it has support for StAXSource, which integrates 
nicely with Axiom. In the meantime, the solution is to pull StAX events from 
Axiom, convert them to SAX events and push them to the XSLT processor. The 
Spring WS project has a utility class StaxSource (extending SAXSource) that 
does this in a completely transparent way (new 
StaxSource(omElement.getXMLStreamReader())). By using 
getXMLStreamReaderWithoutCaching instead of getXMLStreamReader, this could 
probably be further optimized to instruct Axiom not to create the tree for the 
part of the input message that is being transformed (unless it has already been 
constructed at that moment).
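
As a rough sketch of the StAXSource variant of the second strategy, assuming Java 6/JAXP 1.4 and an XSLT processor that accepts StAXSource: the class name StaxInput and the sample stylesheet are hypothetical, and the XMLStreamReader is created from a string with the JDK's XMLInputFactory. With Axiom, the reader would instead come from omElement.getXMLStreamReader(), as mentioned above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class StaxInput {
    // Hypothetical sample stylesheet: extracts the text of /order/item.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'><xsl:value-of select='/order/item'/></xsl:template>"
        + "</xsl:stylesheet>";

    // Feeds StAX events straight into the XSLT processor, without an
    // intermediate byte stream or DOM copy created by the caller.
    static String transform(XMLStreamReader reader) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        // StAXSource requires the reader to be positioned at START_DOCUMENT
        // or START_ELEMENT; a freshly created reader satisfies this.
        t.transform(new StAXSource(reader), new StreamResult(out));
        return out.toString();
    }

    // Stand-in for omElement.getXMLStreamReader(): a reader over a string.
    static XMLStreamReader readerFor(String xml) throws Exception {
        return XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(readerFor("<order><item>widget</item></order>")));
    }
}
```

Whether the processor builds its internal model directly from the events depends on the implementation; the sketch only shows that the calling code no longer creates its own copy of the input.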
