[ 
https://issues.apache.org/jira/browse/CAMEL-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291819#comment-14291819
 ] 

Stephan Siano commented on CAMEL-8273:
--------------------------------------

Crap, you are right. I only ran the unit tests in org.apache.camel.language, 
not the ones in org.apache.camel.builder.xml.

I am not 100% sure what the failed XPathTest.testXPathSplitConcurrent() means. 
It evaluates an XPath and then, in 100 concurrent threads, tries to create a 
Document from the Nodes with a TypeConverter. I don't understand why that 
behavior should change depending on whether the DocumentFactoryImpl is 
instantiated by Camel or by the JDK within the XPath.eval method. The failed 
XPathFeatureTest.testXPathResult(), however, looks like a showstopper for the 
whole approach to me. If the XPath implementation from the JDK gets an 
InputSource as the source for the evaluation, it will instantiate a DOM parser 
with default settings (that allow XXE), and I see no way around that.

I will do some further analysis on that, but it might really be necessary to do 
the DOM conversion before the XPath evaluation (as in the current code).
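
To illustrate the "DOM conversion before the XPath" option: a minimal sketch, 
using only JDK APIs, of parsing the input with a hardened DOM parser and 
handing the resulting Document (not an InputSource) to the XPath evaluation. 
The class and method names are made up for this example and are not Camel 
code; disallowing DOCTYPE declarations is one possible hardening, assumed here 
for illustration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Camel.
class SafeXPathSketch {

    // Creates a DOM parser that refuses DOCTYPE declarations, so external
    // entities (XXE) cannot be declared in the input at all.
    static DocumentBuilder hardenedBuilder() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        return factory.newDocumentBuilder();
    }

    // Parses the XML ourselves and evaluates the XPath on the Document, so
    // the XPath implementation never instantiates its own default parser.
    static String evaluate(String expression, String xml) {
        try {
            Document doc = hardenedBuilder().parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }
}
```

With this sketch, an input that carries a DOCTYPE is rejected at parse time, 
before any XPath evaluation happens.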

> More flexible selection of default documentType in XPath expressions
> --------------------------------------------------------------------
>
>                 Key: CAMEL-8273
>                 URL: https://issues.apache.org/jira/browse/CAMEL-8273
>             Project: Camel
>          Issue Type: Improvement
>          Components: camel-core
>            Reporter: Stephan Siano
>            Assignee: Claus Ibsen
>             Fix For: 2.15.0
>
>         Attachments: 
> 0001-CAMEL-8273-More-flexible-selection-of-default-docume.patch
>
>
> In the current implementation of XPath if no documentType is defined (likely 
> in most cases) the document used for XPath evaluation is parsed into a (DOM) 
> Document using the JDK XML parser before applying the XPath expression on it.
> For large documents this might be resource intensive, especially if the XPath 
> is evaluated using a more efficient parser like Saxon.
> With the current implementation it is possible to work around this by setting 
> a documentType attribute to the XPath expression, but doing this efficiently 
> requires some internal knowledge about the previous component in the Camel 
> route (which type it creates) and the qualities of the used XML parser (e.g. 
> the JDK parser accepts only InputSource and Node as input types for XPath 
> evaluation whereas Saxon also supports other types like SAXSource).
> The attached patch will make the data type used by default for XPath 
> evaluation more flexible (depending on the type of the input).
> There are two cases to differentiate:
>
> documentType is set on the XPath expression:
>   current implementation:
>   1. try to convert to the documentType
>   2. if that fails, do some extra conversions for some additional data types 
>      (WrappedFile, BeanInvocation, String)
>   3. if that fails, throw an exception
>   new implementation:
>   1. try to convert to the documentType
>   2. if that fails, use the message if it is of type Node, InputSource or 
>      DOMSource, or do some type conversions for specific data types 
>      (WrappedFile, BeanInvocation, String, InputStream, Reader, byte[]...)
>   3. if that fails, throw an exception
>
> documentType is not set on the XPath expression:
>   current implementation:
>   this is actually the same as if documentType was set to Document
>   new implementation:
>   1. use the message if it is of type Node, InputSource or DOMSource, or do 
>      some type conversions (to InputSource) for specific data types 
>      (WrappedFile, BeanInvocation, String, InputStream, Reader, byte[]...)
>   2. if the message is not of one of the types above, convert to DOM 
>      Document
>   3. if this fails, throw an exception
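
The proposed selection for the case where documentType is not set could be 
sketched roughly as below. This is not the attached patch: the class name, the 
method name and the toDocument fallback function are hypothetical, and the 
Camel-specific types (WrappedFile, BeanInvocation) are left out so that the 
example only needs the JDK. Note that the InputSource results are exactly the 
inputs for which the comment above raises the XXE concern.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Function;
import javax.xml.transform.dom.DOMSource;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration of the selection order, not Camel code.
class XPathInputSketch {

    // toDocument stands in for a type converter to DOM Document; it is
    // assumed to return null when the conversion is not possible.
    static Object selectInput(Object body, Function<Object, Document> toDocument) {
        // Step 1: use the message as-is if the XPath engine can take it directly.
        if (body instanceof Node || body instanceof InputSource || body instanceof DOMSource) {
            return body;
        }
        // Step 1 (continued): wrap well-known raw types in an InputSource.
        if (body instanceof String) {
            return new InputSource(new StringReader((String) body));
        }
        if (body instanceof InputStream) {
            return new InputSource((InputStream) body);
        }
        if (body instanceof Reader) {
            return new InputSource((Reader) body);
        }
        if (body instanceof byte[]) {
            return new InputSource(new ByteArrayInputStream((byte[]) body));
        }
        // Step 2: anything else falls back to a DOM Document conversion.
        Document doc = toDocument.apply(body);
        if (doc == null) {
            // Step 3: no usable representation, so give up.
            throw new IllegalArgumentException("Cannot convert body to an XPath input: " + body);
        }
        return doc;
    }
}
```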



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
