[
https://issues.apache.org/jira/browse/CAMEL-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291819#comment-14291819
]
Stephan Siano commented on CAMEL-8273:
--------------------------------------
Crap, you are right, I only ran the unit tests in org.apache.camel.language,
not the ones in org.apache.camel.builder.xml.
I am not 100% sure what the failed XPathTest.testXPathSplitConcurrent() means
(it evaluates an XPath and then concurrently tries to create a Document from
the Nodes with a TypeConverter in 100 threads and I actually don't understand
why that behavior should change depending on the DocumentFactoryImpl being
instantiated by Camel or by the JDK within the XPath.eval method), but the
failed XPathFeatureTest.testXPathResult() looks like a showstopper for the
whole approach to me. If the XPath implementation from the JDK gets an
InputSource as the source for the evaluation, it will instantiate a DOM parser
with default settings (that allow XXE), and I see no way around that.
I will do some further analysis on that, but it might really be necessary to do
the DOM conversion before the XPath (as in the current coding).
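For illustration, a minimal sketch of "DOM conversion before the XPath" using plain JAXP. This is not Camel's code: the class and method names (SafeXPathSketch, parseWithoutDoctype, evaluate) are made up for this example, and the disallow-doctype-decl feature is assumed to be supported by the parser in use (the JDK's built-in Xerces-based parser accepts it; other parsers may reject it). The point is that the XPath engine receives an already-parsed Node, so it never has to create a parser with default settings itself.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Camel.
final class SafeXPathSketch {

    // Parse the XML ourselves with a hardened factory.
    static Document parseWithoutDoctype(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Assumption: a Xerces-style parser. Rejecting any DOCTYPE means
        // no external entities can be declared at all.
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = dbf.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Evaluate the expression against the pre-parsed Document (a Node),
    // not against an InputSource.
    static String evaluate(String expression, String xml) throws Exception {
        Document doc = parseWithoutDoctype(xml);
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(evaluate("/a/b", "<a><b>hi</b></a>"));
    }
}
```

With this setup a document carrying a DOCTYPE fails at parse time, before any XPath evaluation happens. The cost is the one the issue description complains about: the whole input is materialized as a DOM tree first.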
> More flexible selection of default documentType in XPath expressions
> --------------------------------------------------------------------
>
> Key: CAMEL-8273
> URL: https://issues.apache.org/jira/browse/CAMEL-8273
> Project: Camel
> Issue Type: Improvement
> Components: camel-core
> Reporter: Stephan Siano
> Assignee: Claus Ibsen
> Fix For: 2.15.0
>
> Attachments:
> 0001-CAMEL-8273-More-flexible-selection-of-default-docume.patch
>
>
> In the current implementation of XPath if no documentType is defined (likely
> in most cases) the document used for XPath evaluation is parsed into a (DOM)
> Document using the JDK XML parser before applying the XPath expression on it.
> For large documents this might be resource intensive, especially if the XPath
> is evaluated using a more efficient parser like Saxon.
> With the current implementation it is possible to work around this by setting
> a documentType attribute on the XPath expression, but doing this efficiently
> requires some internal knowledge about the previous component in the camel
> route (which type it creates) and the qualities of the used XML parser (e.g.
> the JDK parser accepts only InputSource and Node as input types for XPath
> evaluation whereas Saxon does also support other types like SAXSource).
> The attached patch will make the data type used by default for XPath
> evaluation more flexible (depending on the type of the input).
> There are two cases to differentiate:
> documentType is set on the XPath expression:
> current implementation:
> 1. try to convert to the documentType
> 2. if that fails do some extra conversions for some additional data types
> (WrappedFile, BeanInvocation, String)
> 3. if that fails throw an exception
> new implementation:
> 1. try to convert to the documentType
> 2. if that fails, use the message if it is of type Node, InputSource or
> DOMSource or do some type conversions for specific data types (WrappedFile,
> BeanInvocation, String, InputStream, Reader, byte[]...)
> 3. if that fails throw an exception
> documentType is not set on the XPath expression:
> old implementation:
> this is actually the same as if documentType was set to Document
> new implementation:
> 1. Use the message if it is of type Node, InputSource or DOMSource or do some
> type conversions for specific data types (WrappedFile, BeanInvocation,
> String, InputStream, Reader, byte[]...) (to InputSource)
> 2. If the old message is not of one of the types above, convert to DOM
> Document
> 3. If this fails throw an Exception
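The "documentType is not set" steps quoted above could be sketched roughly as follows. This is an illustration only, not the attached patch: XPathInputSketch and selectInput are hypothetical names, the Camel-specific types (WrappedFile, BeanInvocation) are left out so that the example compiles against the JDK alone, and a null return stands in for "fall back to the DOM Document conversion" (step 2), which in Camel would go through the type converter.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.transform.dom.DOMSource;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not taken from the patch.
final class XPathInputSketch {

    // Returns the object to hand to the XPath engine, or null if the caller
    // should fall back to converting the body to a DOM Document.
    static Object selectInput(Object body) {
        // Step 1a: types that can be used as they are.
        if (body instanceof Node || body instanceof InputSource || body instanceof DOMSource) {
            return body;
        }
        // Step 1b: cheap wrappers into an InputSource, no parsing yet.
        if (body instanceof String) {
            return new InputSource(new StringReader((String) body));
        }
        if (body instanceof InputStream) {
            return new InputSource((InputStream) body);
        }
        if (body instanceof Reader) {
            return new InputSource((Reader) body);
        }
        if (body instanceof byte[]) {
            return new InputSource(new ByteArrayInputStream((byte[]) body));
        }
        // Step 2: anything else needs the DOM conversion (not shown here).
        return null;
    }
}
```

Note that every InputSource produced this way is later parsed by whatever parser the XPath implementation chooses, which is exactly the XXE concern raised in the comment above.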