[jira] [Commented] (XALANJ-2540) Very inefficient default behaviour for looking up DTMManager

2018-05-23 Thread Matthew Broadhead (JIRA)

[ 
https://issues.apache.org/jira/browse/XALANJ-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486962#comment-16486962
 ] 

Matthew Broadhead commented on XALANJ-2540:
---

[~garydgregory] On your recommendation I have cloned 2.7.1, as that is the 
highest maintenance release I can see.  I have grepped for 2.7.3 but cannot see 
it mentioned in any of the files.

There is no pom.xml or anything similar, so it looks like it has to be built 
manually?

[~msahyoun] Thanks, I have looked in ObjectFactory and can see where it calls 
lookUpFactoryClassName().  Do you think it is possible to cache the result in 
a singleton for future requests?  Or might this cause clashes?

I could submit a patch for the singleton suggestion, but I am not sure how to 
build and deploy the project.
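
To make the caching idea concrete, here is a minimal sketch, assuming Java 8 or 
later. It is not Xalan code: CachedFactoryLookup and its slowLookup function 
are hypothetical names standing in for whatever ObjectFactory does today. The 
cache is keyed by class loader as well as factory id, on the assumption that 
clashes would mostly come from several webapps sharing one JVM.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical helper, not part of Xalan: remembers the result of a slow class-name lookup. */
final class CachedFactoryLookup {

    // Weak keys, so a cached entry does not keep an undeployed webapp's class loader alive.
    private final Map<ClassLoader, Map<String, String>> cache =
            Collections.synchronizedMap(new WeakHashMap<>());

    // Stand-in for the expensive lookup (system property, then the jar's services file).
    private final Function<String, String> slowLookup;

    CachedFactoryLookup(Function<String, String> slowLookup) {
        this.slowLookup = slowLookup;
    }

    /** Runs the slow lookup at most once per (class loader, factory id) pair. */
    String lookUp(ClassLoader loader, String factoryId) {
        return cache
                .computeIfAbsent(loader, l -> new ConcurrentHashMap<>())
                .computeIfAbsent(factoryId, slowLookup);
    }
}
```

Only the class name (a String) is cached, not the Class object, so the values 
never reference the class loader used as the key. Whether a per-class-loader 
key is enough to avoid clashes in Tomcat or TomEE is an assumption that would 
need testing.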

> Very inefficient default behaviour for looking up DTMManager
> 
>
>                 Key: XALANJ-2540
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2540
>             Project: XalanJ2
>          Issue Type: Improvement
>      Security Level: No security risk; visible to anyone (Ordinary problems in 
> Xalan projects.  Anybody can view the issue.) 
>          Components: DTM, XPath
>    Affects Versions: 2.7.1, 2.7
>            Reporter: Lukas Eder
>            Priority: Major
>
> I have analysed an issue that has been bothering me for some time. When 
> executing XPath evaluations, it looks like a very significant amount of time 
> is spent in the initialisation of the XPathContext. I have asked this 
> question on Stack Overflow and answered it myself:
> http://stackoverflow.com/questions/6340802/java-xpath-apache-jaxp-implementation-performance
> I think the default behaviour of 
> org.apache.xml.dtm.ObjectFactory.lookUpFactoryClassName() is quite 
> sub-optimal and should be improved, statically. I imagine, it is unlikely 
> that this configuration is going to change once classes have been loaded. 
> Hence, the fallback lookup of META-INF/services/org.apache.xml.dtm.DTMManager 
> should only be done once.
> For reference, here's the question and answer again in JIRA:
> 
> I have come to an astonishing conclusion that this:
>   Element e = (Element) document.getElementsByTagName("SomeElementName").item(0);
>   String result = e.getTextContent();
> seems to be an incredible 100x faster than this:
>   // Accounts for 30%, can be cached
>   XPathFactory factory = XPathFactory.newInstance();
>   // Negligible
>   XPath xpath = factory.newXPath();
>   // Accounts for 70% (caching a compiled expression doesn't change much...)
>   String result = (String) xpath.evaluate(
>     "//SomeElementName", document, XPathConstants.STRING);
> I'm using the JVM's default implementation of JAXP:
> org.apache.xpath.jaxp.XPathFactoryImpl
> org.apache.xpath.jaxp.XPathImpl
> I'm really confused, because it's easy to see how JAXP could optimise the 
> above XPath query to actually execute a simple getElementsByTagName() 
> instead. But it doesn't seem to do that. This problem is limited to around 
> 5-6 frequently used XPath calls, that are abstracted and hidden by an API. 
> Those queries involve simple paths (e.g. /a/b/c, no variables, conditions) 
> against an always available DOM Document only. So, if an optimisation can be 
> done, it will be quite easy to achieve.
> 
> I have debugged and profiled my test-case and Xalan/JAXP in general. I 
> managed to identify the major problem in
> org.apache.xml.dtm.ObjectFactory.lookUpFactoryClassName()
> It can be seen that every one of the 10k test XPath evaluations led to the 
> classloader trying to look up the DTMManager instance in some sort of default 
> configuration. This configuration is not loaded into memory but accessed 
> every time. Furthermore, this access seems to be protected by a lock on the 
> ObjectFactory.class itself. When the access fails (by default), then the 
> configuration is loaded from the xalan.jar file's
> META-INF/services/org.apache.xml.dtm.DTMManager
> configuration file. Every time!
> Fortunately, this behaviour can be overridden by specifying a JVM parameter 
> like this:
>   -Dorg.apache.xml.dtm.DTMManager=org.apache.xml.dtm.ref.DTMManagerDefault
> or
>   -Dcom.sun.org.apache.xml.internal.dtm.DTMManager=com.sun.org.apache.xml.internal.dtm.ref.DTMManagerDefault
> So here's a performance improvement overview for 10k consecutive XPath 
> evaluations of //SomeNodeName against a 90k XML file (measured with 
> System.nanoTime()):
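> measured library        : Xalan 2.7.0 | Xalan 2.7.1 | Saxon-HE 9.3 | jaxen 1.1.3
> -------------------------------------------------------------------------------
> without optimisation    :     10400ms |      4717ms |              |     25500ms
> reusing XPathFactory    :      5995ms |      2829ms |              |
> reusing XPath           :      5900ms |      2890ms |              |
> reusing XPathExpression :      5800ms |      2915ms |      16000ms |     25000ms
> adding the JVM param    :      1163ms |       761ms |          n/a |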
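
The reuse rows and the JVM parameter from the description can be combined in 
one place. The following is a minimal sketch, assuming the JDK's bundled JAXP 
implementation; ReusedXPathDemo and evaluateRepeatedly are hypothetical names. 
System.setProperty is used as the programmatic counterpart of the -D flag and 
must run before the first XPath evaluation. Whether the property is still 
consulted depends on the JDK or Xalan version in use.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical demo class: compiles an XPath once and evaluates it many times. */
final class ReusedXPathDemo {

    static String evaluateRepeatedly(String xml, String path, int runs) {
        // Same effect as passing the -D flag on the command line, provided it
        // happens before the first lookup. This is the JDK-internal property name;
        // standalone Xalan uses org.apache.xml.dtm.DTMManager instead.
        System.setProperty(
                "com.sun.org.apache.xml.internal.dtm.DTMManager",
                "com.sun.org.apache.xml.internal.dtm.ref.DTMManagerDefault");
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Created once and reused for every evaluation below.
            XPathFactory factory = XPathFactory.newInstance();
            XPath xpath = factory.newXPath();
            XPathExpression expression = xpath.compile(path);

            String result = null;
            for (int i = 0; i < runs; i++) {
                result = expression.evaluate(document); // string value of the first match
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }
}
```

XPath and XPathExpression objects are not thread-safe, so a cached instance 
should stay confined to one thread or be guarded by the caller.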

[jira] [Commented] (XALANJ-2540) Very inefficient default behaviour for looking up DTMManager

2018-04-16 Thread Matthew Broadhead (JIRA)

[ 
https://issues.apache.org/jira/browse/XALANJ-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439427#comment-16439427
 ] 

Matthew Broadhead commented on XALANJ-2540:
---

If I go to the Xalan front page [https://xalan.apache.org/] it says the code can 
be found at [http://svn.apache.org/repos/asf/xalan/xalan-j/trunk/], which just 
says "Not found".  I am trying to find the 
org.apache.xml.dtm.ObjectFactory.lookUpFactoryClassName() function mentioned in 
the original bug report.  Can anyone help?

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (XALANJ-2540) Very inefficient default behaviour for looking up DTMManager

2018-04-16 Thread Matthew Broadhead (JIRA)

[ 
https://issues.apache.org/jira/browse/XALANJ-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439216#comment-16439216
 ] 

Matthew Broadhead commented on XALANJ-2540:
---

?

-
To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org
For additional commands, e-mail: dev-h...@xalan.apache.org



[jira] [Commented] (XALANJ-2540) Very inefficient default behaviour for looking up DTMManager

2018-04-09 Thread Matthew Broadhead (JIRA)

[ 
https://issues.apache.org/jira/browse/XALANJ-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431029#comment-16431029
 ] 

Matthew Broadhead commented on XALANJ-2540:
---

If it is easy to fix, can you explain how to do it?  Do you know which code is 
involved?  This issue has 31 upvotes.  We have problems with JSP taglibs in 
Tomcat and TomEE when redeploying webapps 
(https://bz.apache.org/bugzilla/show_bug.cgi?id=61875).  It is also blocking the 
use of Apache FOP or any XSLT processing in Tomcat and TomEE (it works, but not 
after a webapp redeploy).


[jira] [Commented] (XALANJ-2540) Very inefficient default behaviour for looking up DTMManager

2018-04-04 Thread Matthew Broadhead (JIRA)

[ 
https://issues.apache.org/jira/browse/XALANJ-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425953#comment-16425953
 ] 

Matthew Broadhead commented on XALANJ-2540:
---

Does anyone know the issues with this?  Is it actually fixable?  What code is 
this happening in?
