Re: Slow access to XML node attribute because of XPath in NodeModel.get()?

Daniel Dekany Fri, 13 Oct 2017 06:47:24 -0700

I would like to add that this DOM wrapper model is one of the less
known parts of the code to me. Barring some small changes, it's the
same as it was when long long ago it was written by the original
authors. That raises the probability of finding improvement
opportunities there quite much. Like, I see that at least when using
Xalan, it calls `new org.apache.xpath.XPath.XPath(String,
SourceLocator, PrefixResolver, int, ErrorListener)` each time you
execute a query. That can be a big performance killer. But I don't
know if it can be avoided. These should be researched. Contributions
on that filed are highly welcome.


BTW, you are using Xalan for XPath, the one embedded into the JDK
(hence it's in Sun package). There's an option for using Jaxen
instead. I suspect Jaxen will be slower, not faster, but who knows...
might worths a try. For that, add
https://mvnrepository.com/artifact/jaxen/jaxen/1.1.6 to your
dependencies, and call NodeModel.useJaxenXPathSupport() sometimes
during application startup.


Friday, October 13, 2017, 8:01:17 AM, Christoph Rüger wrote:

> 2017-10-13 0:33 GMT+02:00 Daniel Dekany <[email protected]>:
>
>> Actually, in `someNode.something` (which is the same as
>> `someNode["something"]`) the "something" meant to be XPath (appart
>> from a few FreeMarker-specific things).
>
> But as you have seen in some
>> very simple cases we solve the XPath query "natively" instead of
>> delegating it to XPath.
>
>
> Thanks Daniel, good to know. So there probably was a reason for this too in
> the past.
>
> Now if profiling reliably(!) shows that it's
>> indeed XPath in this case that's so slow, then it seems it would be
>
> better if we take over even more cases from XPath, such as
>> "@something"... I will have to look into that though. (This can be
>> tricky though, as you replace the implementation of that kind of query
>> under thousands of already working applications, so it really has to
>> behave exactly like the real thing earlier.
>
>
> I agree, this is very risky and my poor-mans "benchmark" is not reliable at
> all. I can try if I can find some time to come up with a programmatic
> test-case maybe which makes it better visible.
> Your insights tell me so far that we better go with the custom
> TemplateMethodModel for now (shortterm) to solve our specific business
> problem. Less risk.
>
> I will try fiddling around with the sources of
> freemarker.ext.dom.NodeModel.get(String)
> etc. which already do some "key.startsWith("@@")" checks. Maybe I find the
> place where to add an improvement and try it out.
>
>
>> That's unlike with
>> product.PRODUCT_DETAILS which was never delegated to the real XPath
>> implementation.)
>>
>
>> Thursday, October 12, 2017, 7:35:42 PM, Christoph Rüger wrote:
>>
>> > Hi,
>> >
>> > we have the following XML snippet:
>> >
>> > <PRODUCT_DETAILS>
>> >   <INTERNATIONAL_PID *type="GTIN"*>4011395534809</INTERNATIONAL_PID>
>> > </PRODUCT_DETAILS>
>> >
>> > We iterate over XML-Nodes like this:
>> >
>> > <#list product["PRODUCT_DETAILS"]?children as detail>
>>
>> Note that you can write `product["PRODUCT_DETAILS"]` as
>> `product.PRODUCT_DETAILS`.
>>
>
> Ok. Usually we do. Not sure why we didn't here. Was a code snippet from a
> colleague :)
>
>
>>
>> >   nodeValue: ${detail!}
>> > </#list>
>> >
>> > This code snippet is pretty fast (result appears after e.g. 3 seconds).
>> But
>> > when we turn it into this and try to add the node attribute "type"...
>> >
>> > <#list product["PRODUCT_DETAILS"]?children as detail>
>> >  <#if detail.*@type*[0]?? && detail.*@type*[0] == "GTIN">
>> >   node*Attribute* type: ${detail.@type}
>> >   nodeValue: ${detail!}
>> >  </#if>
>> > </#list>
>> >
>> > ...it gets very very slow (result appears after > 1 minute). We have a
>> large XML with >> 1000 products here.
>> >
>> > I have digged a little bit in the code and found that when the check for
>> > *detail.@type[0]??* is in the mix, then
>> > *freemarker.ext.dom.NodeModel.get(String) *is jumping into the
>> else-branch
>> > which uses XPath:
>> >
>> >  public TemplateModel get(String key) throws TemplateModelException {
>> >
>> >    if (key.startsWith("@@")) {
>> >     // ....
>> >
>> > }
>> >
>> > else{
>> >
>> > XPathSupport xps = getXPathSupport();
>> >           if (xps != null) {
>> >
>> > *return xps.executeQuery(node, key);*            }
>> >
>> > }
>> >
>> > This *xps.executeQuery(node, key); *seems to be the reason why it gets
>> > slow. Internally it
>> > calls
>> > freemarker.ext.dom.SunInternalXalanXPathSupport.executeQuery(Object,
>> > String)
>> >
>> > To work around this problem I have tried creating an own
>> > TemplateMethodModel ${nodeAttrib(node, attributename)}
>> >
>> > where I do something like this (I know it's ugly but experimental...) :
>> >
>> > new TemplateMethodModelEx(){
>> >
>> > public Object exec(List arg0) throws TemplateModelException {
>> >
>> > Object node = arg0.get(0);
>> >
>> > Object attribute = arg0.get(1);
>> >
>> > if(node instanceof NodeModel) {
>> >
>> > NamedNodeMap attributes = ((NodeModel) node).getNode().getAttributes();
>> >
>> > if(attributes != null && attributes.getLength() > 0) {
>> >
>> > Node namedItem = *attributes.getNamedItem(attribute.toString())*;
>> >
>> > if(namedItem != null) {
>> >
>> > System.out.println(namedItem.getNodeValue());
>> >
>> > return namedItem.getNodeValue();
>> >
>> > }
>> >
>> > else {
>> >
>> > return "";
>> >
>> > }
>> >
>> > }
>> >
>> > }
>> >
>> > return "";
>> >
>> > }
>> >
>> >
>> > When we then use this it is much faster:
>> > <#list product["PRODUCT_DETAILS"]?children as detail>
>> >   <#if *nodeAttribute(details, "type")!* == "GTIN" >
>> >    node*Attribute* type: ${*nodeAttribute(details, "type")!*}
>> >    nodeValue: ${detail!}
>> >   </#if>
>> > </#list>
>> >
>> > It seems that NamedNodeMap.*getNamedItem() *much faster and does not use
>> > XPath. Although it is also iterating internally over the whole array to
>> > find the index of the attribute - the overhead seems to be less then
>> XPath.
>> >
>> > My questions are:
>> > 1. What is the reason that *detail.@type[0]?? *causes a heavy XPath
>> > evaluation under the hood?
>> > 2. Are we doing something wrong?
>> > 3. Should we go the custom TemplateMethodModel way? Is this approach ok?
>> >
>> > I hope this was understandable.
>> >
>> > Thanks
>> > Christoph
>> >
>>
>> --
>> Thanks,
>>  Daniel Dekany
>>
>>
>
>
> -- 
> Christoph Rüger, Geschäftsführer
> Synesty <https://synesty.com/> - Anbinden und Automatisieren ohne
> Programmieren - Automatisierung, Schnittstellen, Datenfeeds
>
> Xing: https://www.xing.com/profile/Christoph_Rueger2
> LinkedIn: http://www.linkedin.com/pub/christoph-rueger/a/685/198
>

-- 
Thanks,
 Daniel Dekany

Re: Slow access to XML node attribute because of XPath in NodeModel.get()?

Reply via email to