Hello,
I work on a Rails application that uses Nokogiri to parse HTML. We run
on JRuby 1.7.19, so Nokogiri uses the Xerces XML parser to do its
parsing. I noticed that using Nokogiri (and thus Xerces) triggers a
great deal of class loading. After some research I found that setting a
few Java system properties explicitly reduced the amount of class
loading that goes on. Here is what I set:
java.lang.System.setProperty('javax.xml.xpath.XPathFactory:http://java.sun.com/jaxp/xpath/dom','com.sun.org.apache.xpath.internal.jaxp.XPathFactoryImpl')
java.lang.System.setProperty('org.apache.xerces.xni.parser.XMLParserConfiguration','org.apache.xerces.parsers.XIncludeAwareParserConfiguration')
java.lang.System.setProperty('com.sun.org.apache.xml.internal.dtm.DTMManager','com.sun.org.apache.xml.internal.dtm.ref.DTMManagerDefault')
These are actually set from Ruby code, but that code interacts with the
JVM just like native Java code. I figured all this out from
documentation on the web. Setting them dramatically reduced the number
of JAR files that are opened at runtime.
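For readers who do not use JRuby, the same settings can be sketched in
plain Java. The three property names and values are copied from the
lines above. The class name JaxpDefaults and the choice to skip a
property that is already set (for example via -D on the command line)
are illustrative assumptions, not something the Ruby code does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: applies the three system properties quoted above.
final class JaxpDefaults {

    // Property names and values copied verbatim from the Ruby calls.
    static Map<String, String> defaults() {
        Map<String, String> m = new LinkedHashMap<String, String>();
        m.put("javax.xml.xpath.XPathFactory:http://java.sun.com/jaxp/xpath/dom",
              "com.sun.org.apache.xpath.internal.jaxp.XPathFactoryImpl");
        m.put("org.apache.xerces.xni.parser.XMLParserConfiguration",
              "org.apache.xerces.parsers.XIncludeAwareParserConfiguration");
        m.put("com.sun.org.apache.xml.internal.dtm.DTMManager",
              "com.sun.org.apache.xml.internal.dtm.ref.DTMManagerDefault");
        return m;
    }

    // Sets each property unless it already has a value (assumption: an
    // explicit -D setting should win). Returns how many were newly set.
    static int apply() {
        int applied = 0;
        for (Map.Entry<String, String> e : defaults().entrySet()) {
            if (System.getProperty(e.getKey()) == null) {
                System.setProperty(e.getKey(), e.getValue());
                applied++;
            }
        }
        return applied;
    }

    public static void main(String[] args) {
        System.out.println("newly set: " + apply());
        for (String key : defaults().keySet()) {
            System.out.println(key + " = " + System.getProperty(key));
        }
    }
}
```

Calling JaxpDefaults.apply() once at startup, before any parsing
happens, would mirror what the Ruby code does.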
Using intrace to trace JarFile activity, however, I still see the
following happening constantly:
[07:16:31.004]:[36]:java.util.jar.JarFile:getJarEntry: {
[07:16:31.004]:[36]:java.util.jar.JarFile:getJarEntry: Arg:
com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.004]:[36]:java.util.jar.JarFile:getEntry: {
[07:16:31.004]:[36]:java.util.jar.JarFile:getEntry: Arg:
com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.004]:[36]:java.util.jar.JarFile:getEntry: Return: null
[07:16:31.004]:[36]:java.util.jar.JarFile:getEntry: }
[07:16:31.004]:[36]:java.util.jar.JarFile:getJarEntry: Return: null
[07:16:31.004]:[36]:java.util.jar.JarFile:getJarEntry: }
[07:16:31.004]:[36]:java.util.jar.JarFile:getJarEntry: {
[07:16:31.004]:[36]:java.util.jar.JarFile:getJarEntry: Arg:
com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.004]:[36]:java.util.jar.JarFile:getEntry: {
[07:16:31.004]:[36]:java.util.jar.JarFile:getEntry: Arg:
com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.004]:[36]:java.util.jar.JarFile$JarFileEntry:<init>: {
[07:16:31.004]:[36]:java.util.jar.JarFile$JarFileEntry:<init>: Arg:
java.util.jar.JarFile@61175d12
[07:16:31.004]:[36]:java.util.jar.JarFile$JarFileEntry:<init>: Arg:
com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.004]:[36]:java.util.jar.JarFile$JarFileEntry:<init>: }
[07:16:31.004]:[36]:java.util.jar.JarFile:getEntry: Return:
com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.004]:[36]:java.util.jar.JarFile:getEntry: }
[07:16:31.004]:[36]:java.util.jar.JarFile:getJarEntry: Return:
com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.004]:[36]:java.util.jar.JarFile:getJarEntry: }
[07:16:31.005]:[36]:java.util.jar.JarFile:getJarEntry: {
[07:16:31.005]:[36]:java.util.jar.JarFile:getJarEntry: Arg:
com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.005]:[36]:java.util.jar.JarFile:getEntry: {
[07:16:31.005]:[36]:java.util.jar.JarFile:getEntry: Arg:
com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.005]:[36]:java.util.jar.JarFile:getEntry: Return: null
[07:16:31.005]:[36]:java.util.jar.JarFile:getEntry: }
[07:16:31.005]:[36]:java.util.jar.JarFile:getJarEntry: Return: null
[07:16:31.005]:[36]:java.util.jar.JarFile:getJarEntry: }
[07:16:31.005]:[36]:java.util.jar.JarFile:getJarEntry: {
[07:16:31.005]:[36]:java.util.jar.JarFile:getJarEntry: Arg:
com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.005]:[36]:java.util.jar.JarFile:getEntry: {
[07:16:31.005]:[36]:java.util.jar.JarFile:getEntry: Arg:
com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.005]:[36]:java.util.jar.JarFile$JarFileEntry:<init>: {
[07:16:31.005]:[36]:java.util.jar.JarFile$JarFileEntry:<init>: Arg:
java.util.jar.JarFile@61175d12
[07:16:31.005]:[36]:java.util.jar.JarFile$JarFileEntry:<init>: Arg:
com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.005]:[36]:java.util.jar.JarFile$JarFileEntry:<init>: }
[07:16:31.005]:[36]:java.util.jar.JarFile:getEntry: Return:
com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.005]:[36]:java.util.jar.JarFile:getEntry: }
[07:16:31.005]:[36]:java.util.jar.JarFile:getJarEntry: Return:
com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.005]:[36]:java.util.jar.JarFile:getJarEntry: }
As you can see, the class is eventually found. The problem is that this
lookup repeats every time we invoke Nokogiri, and it carries a pretty
severe performance cost.
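A minimal sketch of the pattern I suspect is below. It assumes the
repeated JarFile lookups come from XPathFactory.newInstance() being
called once per query; I have not confirmed that this is what Nokogiri
actually does. The class XPathFactoryReuse and its method names are
hypothetical. It contrasts resolving a factory on every call with
keeping one factory per thread (XPathFactory is not documented as
thread-safe, hence the ThreadLocal).

```java
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;

// Hypothetical illustration: per-call factory lookup vs. per-thread reuse.
final class XPathFactoryReuse {

    // One factory per thread, resolved lazily on first use.
    private static final ThreadLocal<XPathFactory> CACHED =
        new ThreadLocal<XPathFactory>() {
            @Override
            protected XPathFactory initialValue() {
                return XPathFactory.newInstance();
            }
        };

    // Runs the JAXP factory lookup on every call.
    static XPath freshLookup() {
        return XPathFactory.newInstance().newXPath();
    }

    // Reuses this thread's factory, so the lookup runs once per thread.
    static XPath cachedLookup() {
        return CACHED.get().newXPath();
    }

    static XPathFactory cachedFactory() {
        return CACHED.get();
    }

    public static void main(String[] args) {
        int iterations = 10000;

        long t0 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            freshLookup();
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            cachedLookup();
        }
        long t2 = System.nanoTime();

        // Timings vary by JVM and classpath; no particular ratio is implied.
        System.out.println("fresh  (ms): " + (t1 - t0) / 1000000);
        System.out.println("cached (ms): " + (t2 - t1) / 1000000);
    }
}
```

Running this under intrace with the same JarFile instrumentation would
show whether the getJarEntry calls follow the freshLookup() loop only.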
Is there an easy way to fix this? Any assistance you can provide would
be greatly appreciated. If you would like me to try to produce a
contrived code example that replicates this, I can do that.
Eric Urban
Developer at Spiceworks