[ 
https://issues.apache.org/jira/browse/XALANJ-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439427#comment-16439427
 ] 

Matthew Broadhead commented on XALANJ-2540:
-------------------------------------------

if i go to the xalan frontpage [https://xalan.apache.org/] it says the code can 
be found at [http://svn.apache.org/repos/asf/xalan/xalan-j/trunk/] which just 
says "Not found".  I am trying to find the 
org.apache.xml.dtm.ObjectFactory.lookUpFactoryClassName() function mentioned in 
the original bug report.  Can anyone help?

> Very inefficient default behaviour for looking up DTMManager
> ------------------------------------------------------------
>
>                 Key: XALANJ-2540
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2540
>             Project: XalanJ2
>          Issue Type: Improvement
>      Security Level: No security risk; visible to anyone(Ordinary problems in 
> Xalan projects.  Anybody can view the issue.) 
>          Components: DTM, XPath
>    Affects Versions: 2.7.1, 2.7
>            Reporter: Lukas Eder
>            Priority: Major
>
> I have analysed an issue that has been bothering me for some time. When 
> executing XPath evaluations, it looks like a very significant amount of time 
> is spent in the initialisation of the XPathContext. I have asked this 
> question on Stack Overflow and answered it myself:
> http://stackoverflow.com/questions/6340802/java-xpath-apache-jaxp-implementation-performance
> I think the default behaviour of 
> org.apache.xml.dtm.ObjectFactory.lookUpFactoryClassName() is quite 
> sub-optimal and should be improved, statically. I imagine, it is unlikely 
> that this configuration is going to change once classes have been loaded. 
> Hence, the fallback lookup of META-INF/service/org.apache.xml.dtm.DTMManager 
> should only be done once.
> For reference, here's the question and answer again in JIRA:
> ----
> I have come to an astonishing conclusion that this:
>     Element e = (Element) 
> document.getElementsByTagName("SomeElementName").item(0);
>     String result = ((Element) e).getTextContent();
> Seems to be an incredible 100x faster than this:
>     // Accounts for 30%, can be cached
>     XPathFactory factory = XPathFactory.newInstance();
>     // Negligible
>     XPath xpath = factory.newXPath();
>     // Accounts for 70% (caching a compiled expression doesn't change much...)
>     String result = (String) xpath.evaluate(
>       "//SomeElementName", document, XPathConstants.STRING);
> I'm using the JVM's default implementation of JAXP:
> org.apache.xpath.jaxp.XPathFactoryImpl
> org.apache.xpath.jaxp.XPathImpl
> I'm really confused, because it's easy to see how JAXP could optimise the 
> above XPath query to actually execute a simple getElementsByTagName() 
> instead. But it doesn't seem to do that. This problem is limited to around 
> 5-6 frequently used XPath calls, that are abstracted and hidden by an API. 
> Those queries involve simple paths (e.g. /a/b/c, no variables, conditions) 
> against an always available DOM Document only. So, if an optimisation can be 
> done, it will be quite easy to achieve.
> ----
> I have debugged and profiled my test-case and Xalan/JAXP in general. I 
> managed to identify the big major problem in
> org.apache.xml.dtm.ObjectFactory.lookUpFactoryClassName()
> It can be seen that every one of the 10k test XPath evaluations led to the 
> classloader trying to lookup the DTMManager instance in some sort of default 
> configuration. This configuration is not loaded into memory but accessed 
> every time. Furthermore, this access seems to be protected by a lock on the 
> ObjectFactory.class itself. When the access fails (by default), then the 
> configuration is loaded from the xalan.jar file's
> META-INF/service/org.apache.xml.dtm.DTMManager
> configuration file. Every time!:
> Fortunately, this behaviour can be overridden by specifying a JVM parameter 
> like this:
> -Dorg.apache.xml.dtm.DTMManager=
>   org.apache.xml.dtm.ref.DTMManagerDefault
> or
> -Dcom.sun.org.apache.xml.internal.dtm.DTMManager=
>   com.sun.org.apache.xml.internal.dtm.ref.DTMManagerDefault
> So here's a performance improvement overview for 10k consecutive XPath 
> evaluations of //SomeNodeName against a 90k XML file (measured with 
> System.nanoTime():
> measured library        : Xalan 2.7.0 | Xalan 2.7.1 | Saxon-HE 9.3 | jaxen 
> 1.1.3   
> --------------------------------------------------------------------------------
> without optimisation    :     10400ms |      4717ms |              |     
> 25500ms
> reusing XPathFactory    :      5995ms |      2829ms |              |
> reusing XPath           :      5900ms |      2890ms |              |
> reusing XPathExpression :      5800ms |      2915ms |      16000ms |     
> 25000ms
> adding the JVM param    :      1163ms |       761ms |        n/a   |



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org
For additional commands, e-mail: dev-h...@xalan.apache.org

Reply via email to