Slow access to XML node attribute because of XPath in NodeModel.get()?

Christoph Rüger Thu, 12 Oct 2017 10:36:28 -0700

Hi,

we have the following XML snippet:


<PRODUCT_DETAILS>
  <INTERNATIONAL_PID type="GTIN">4011395534809</INTERNATIONAL_PID>
</PRODUCT_DETAILS>

We iterate over the XML nodes like this:

<#list product["PRODUCT_DETAILS"]?children as detail>
  nodeValue: ${detail!}
</#list>

This snippet is fairly fast (the result appears after about 3 seconds). But
when we change it to also check and print the node attribute "type"...

<#list product["PRODUCT_DETAILS"]?children as detail>
 <#if detail.@type[0]?? && detail.@type[0] == "GTIN">
  nodeAttribute type: ${detail.@type}
  nodeValue: ${detail!}
 </#if>
</#list>

...it gets very slow (the result appears after more than 1 minute). We have a
large XML document with more than 1000 products here.

I dug into the code a little and found that when the check
detail.@type[0]?? is in the mix, freemarker.ext.dom.NodeModel.get(String)
goes into the else branch, which uses XPath:

 public TemplateModel get(String key) throws TemplateModelException {
     if (key.startsWith("@@")) {
         // ....
     } else {
         XPathSupport xps = getXPathSupport();
         if (xps != null) {
             return xps.executeQuery(node, key);
         }
         // ....
     }
 }

This xps.executeQuery(node, key) call seems to be the reason why it gets
slow. Internally it calls
freemarker.ext.dom.SunInternalXalanXPathSupport.executeQuery(Object, String).
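
To make the two lookup paths concrete outside of FreeMarker, here is a minimal sketch that reads the same attribute once through an XPath expression and once directly from the DOM. It is only an illustration: it uses the JDK's public javax.xml.xpath API rather than FreeMarker's internal XPathSupport, and the class and method names (AttributeLookupSketch, viaXPath, viaDom) are made up for this example. It shows that both paths yield the same value; it does not measure anything.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of FreeMarker.
final class AttributeLookupSketch {

    // The XML snippet from the mail, without the mail client's bold markers.
    static final String XML =
        "<PRODUCT_DETAILS>"
        + "<INTERNATIONAL_PID type=\"GTIN\">4011395534809</INTERNATIONAL_PID>"
        + "</PRODUCT_DETAILS>";

    // Parses the XML and returns the first element child of the root.
    static Element firstChildElement(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Node n = doc.getDocumentElement().getFirstChild();
            while (n != null && n.getNodeType() != Node.ELEMENT_NODE) {
                n = n.getNextSibling();
            }
            return (Element) n;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Attribute lookup through a generic XPath engine: the expression is
    // compiled and evaluated on every call. Yields "" if the attribute is absent.
    static String viaXPath(Node node, String attributeName) {
        try {
            return (String) XPathFactory.newInstance().newXPath()
                .evaluate("@" + attributeName, node, XPathConstants.STRING);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Direct DOM lookup, as in the workaround. Yields "" if the attribute is absent.
    static String viaDom(Node node, String attributeName) {
        NamedNodeMap attributes = node.getAttributes();
        if (attributes == null) {
            return "";
        }
        Node item = attributes.getNamedItem(attributeName);
        return item == null ? "" : item.getNodeValue();
    }

    public static void main(String[] args) {
        Element pid = firstChildElement(XML);
        System.out.println(viaXPath(pid, "type"));
        System.out.println(viaDom(pid, "type"));
    }
}
```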

To work around this problem I have tried writing my own
TemplateMethodModelEx, called as ${nodeAttribute(node, attributeName)},

where I do something like this (I know it's ugly, but it is experimental):

new TemplateMethodModelEx() {
    public Object exec(List arg0) throws TemplateModelException {
        Object node = arg0.get(0);
        Object attribute = arg0.get(1);
        if (node instanceof NodeModel) {
            // Read the attribute directly from the DOM node, bypassing XPath.
            NamedNodeMap attributes = ((NodeModel) node).getNode().getAttributes();
            if (attributes != null && attributes.getLength() > 0) {
                Node namedItem = attributes.getNamedItem(attribute.toString());
                if (namedItem != null) {
                    System.out.println(namedItem.getNodeValue()); // debug output
                    return namedItem.getNodeValue();
                } else {
                    return "";
                }
            }
        }
        // Not a NodeModel, or the node has no attributes.
        return "";
    }
}



When we then use this it is much faster:
<#list product["PRODUCT_DETAILS"]?children as detail>
  <#if nodeAttribute(detail, "type")! == "GTIN" >
   nodeAttribute type: ${nodeAttribute(detail, "type")!}
   nodeValue: ${detail!}
  </#if>
</#list>

It seems that NamedNodeMap.getNamedItem() is much faster and does not use
XPath. Although it also iterates internally over the whole array to find the
index of the attribute, the overhead seems to be less than with XPath.
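
For reference, the filtering that the template does can also be sketched in plain Java with the DOM API alone. This is a hedged illustration, not FreeMarker code: the class name GtinFilterSketch and the method valuesWithType are hypothetical, and the second child element used in main (type="OTHER") is invented sample data.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of FreeMarker.
final class GtinFilterSketch {

    // Returns the text content of every child of the root element whose
    // "type" attribute equals wantedType, using only direct DOM lookups.
    static List<String> valuesWithType(String xml, String wantedType) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            List<String> values = new ArrayList<>();
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // getAttributes() is null for non-element nodes such as whitespace text.
                NamedNodeMap attributes = child.getAttributes();
                if (attributes == null) {
                    continue;
                }
                Node type = attributes.getNamedItem("type");
                if (type != null && wantedType.equals(type.getNodeValue())) {
                    values.add(child.getTextContent());
                }
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<PRODUCT_DETAILS>\n"
            + "  <INTERNATIONAL_PID type=\"GTIN\">4011395534809</INTERNATIONAL_PID>\n"
            + "  <INTERNATIONAL_PID type=\"OTHER\">123</INTERNATIONAL_PID>\n" // invented sample
            + "</PRODUCT_DETAILS>";
        System.out.println(valuesWithType(xml, "GTIN"));
    }
}
```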

My questions are:
1. What is the reason that detail.@type[0]?? causes a heavy XPath
evaluation under the hood?
2. Are we doing something wrong?
3. Should we go the custom TemplateMethodModel way? Is this approach ok?

I hope this was understandable.

Thanks
Christoph

-- 
Synesty GmbH
Moritz-von-Rohr-Str. 1a
07745 Jena
Tel.: +49 3641 559649
Fax.: +49 3641 5596499
Internet: http://synesty.com

Managing Director: Christoph Rüger
Registered office: Jena
Commercial register B at the district court: Jena
Commercial register number: HRB 508766
VAT ID: DE287564982
