Hello,

I work on a Rails application that uses Nokogiri to parse HTML. We run on JRuby 1.7.19, so Nokogiri uses the Xerces XML parser to do its parsing. I noticed that using Nokogiri (and thus Xerces) triggers a large amount of class loading. After some research, I found that explicitly setting a few Java system properties to sane defaults reduced the amount of class loading that goes on. Here is what I set:

java.lang.System.setProperty('javax.xml.xpath.XPathFactory:http://java.sun.com/jaxp/xpath/dom','com.sun.org.apache.xpath.internal.jaxp.XPathFactoryImpl')

java.lang.System.setProperty('org.apache.xerces.xni.parser.XMLParserConfiguration','org.apache.xerces.parsers.XIncludeAwareParserConfiguration')

java.lang.System.setProperty('com.sun.org.apache.xml.internal.dtm.DTMManager','com.sun.org.apache.xml.internal.dtm.ref.DTMManagerDefault')


This is actually set from Ruby code, but it interacts with the JVM just like native Java code. I figured all this out from documentation on the web. This dramatically reduced the number of JAR files that are opened at runtime.
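
For anyone more comfortable reading plain Java, here is a minimal sketch of the same three settings. The class name XmlFactoryProperties and the apply() method are made up for illustration; only the property keys and values come from my setup above.

```java
// Hypothetical helper: sets the same three system properties as the Ruby code above.
public class XmlFactoryProperties {

    // Keys and values are copied from the original settings; nothing else is assumed.
    static void apply() {
        System.setProperty(
            "javax.xml.xpath.XPathFactory:http://java.sun.com/jaxp/xpath/dom",
            "com.sun.org.apache.xpath.internal.jaxp.XPathFactoryImpl");
        System.setProperty(
            "org.apache.xerces.xni.parser.XMLParserConfiguration",
            "org.apache.xerces.parsers.XIncludeAwareParserConfiguration");
        System.setProperty(
            "com.sun.org.apache.xml.internal.dtm.DTMManager",
            "com.sun.org.apache.xml.internal.dtm.ref.DTMManagerDefault");
    }

    public static void main(String[] args) {
        apply();
        // Print one value back to confirm the property is visible JVM-wide.
        System.out.println(System.getProperty(
            "org.apache.xerces.xni.parser.XMLParserConfiguration"));
    }
}
```

The properties would need to be set before the first parse, since they are only read when a factory or parser configuration is looked up.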

However, when I use intrace to trace JarFile activity, I still see this happening constantly:

[07:16:31.004]:[36]:java.util.jar.JarFile:getJarEntry: {
[07:16:31.004]:[36]:java.util.jar.JarFile:getJarEntry: Arg: com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.004]:[36]:java.util.jar.JarFile:getEntry: {
[07:16:31.004]:[36]:java.util.jar.JarFile:getEntry: Arg: com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.004]:[36]:java.util.jar.JarFile:getEntry: Return: null
[07:16:31.004]:[36]:java.util.jar.JarFile:getEntry: }
[07:16:31.004]:[36]:java.util.jar.JarFile:getJarEntry: Return: null
[07:16:31.004]:[36]:java.util.jar.JarFile:getJarEntry: }
[07:16:31.004]:[36]:java.util.jar.JarFile:getJarEntry: {
[07:16:31.004]:[36]:java.util.jar.JarFile:getJarEntry: Arg: com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.004]:[36]:java.util.jar.JarFile:getEntry: {
[07:16:31.004]:[36]:java.util.jar.JarFile:getEntry: Arg: com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.004]:[36]:java.util.jar.JarFile$JarFileEntry:<init>: {
[07:16:31.004]:[36]:java.util.jar.JarFile$JarFileEntry:<init>: Arg: java.util.jar.JarFile@61175d12
[07:16:31.004]:[36]:java.util.jar.JarFile$JarFileEntry:<init>: Arg: com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.004]:[36]:java.util.jar.JarFile$JarFileEntry:<init>: }
[07:16:31.004]:[36]:java.util.jar.JarFile:getEntry: Return: com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.004]:[36]:java.util.jar.JarFile:getEntry: }
[07:16:31.004]:[36]:java.util.jar.JarFile:getJarEntry: Return: com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.004]:[36]:java.util.jar.JarFile:getJarEntry: }
[07:16:31.005]:[36]:java.util.jar.JarFile:getJarEntry: {
[07:16:31.005]:[36]:java.util.jar.JarFile:getJarEntry: Arg: com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.005]:[36]:java.util.jar.JarFile:getEntry: {
[07:16:31.005]:[36]:java.util.jar.JarFile:getEntry: Arg: com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.005]:[36]:java.util.jar.JarFile:getEntry: Return: null
[07:16:31.005]:[36]:java.util.jar.JarFile:getEntry: }
[07:16:31.005]:[36]:java.util.jar.JarFile:getJarEntry: Return: null
[07:16:31.005]:[36]:java.util.jar.JarFile:getJarEntry: }
[07:16:31.005]:[36]:java.util.jar.JarFile:getJarEntry: {
[07:16:31.005]:[36]:java.util.jar.JarFile:getJarEntry: Arg: com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.005]:[36]:java.util.jar.JarFile:getEntry: {
[07:16:31.005]:[36]:java.util.jar.JarFile:getEntry: Arg: com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.005]:[36]:java.util.jar.JarFile$JarFileEntry:<init>: {
[07:16:31.005]:[36]:java.util.jar.JarFile$JarFileEntry:<init>: Arg: java.util.jar.JarFile@61175d12
[07:16:31.005]:[36]:java.util.jar.JarFile$JarFileEntry:<init>: Arg: com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.005]:[36]:java.util.jar.JarFile$JarFileEntry:<init>: }
[07:16:31.005]:[36]:java.util.jar.JarFile:getEntry: Return: com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.005]:[36]:java.util.jar.JarFile:getEntry: }
[07:16:31.005]:[36]:java.util.jar.JarFile:getJarEntry: Return: com/sun/org/apache/xpath/internal/jaxp/XPathFactoryImpl.class
[07:16:31.005]:[36]:java.util.jar.JarFile:getJarEntry: }

As you can see, the class is eventually found. The problem is that this lookup repeats every time we invoke Nokogiri, and there is a pretty severe performance cost from doing this.

Is there any easy way to fix this? Any assistance you can provide would be greatly appreciated. If you would like me to try to produce a contrived code example that replicates this, I can do that.
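
A rough Java sketch of what such a reproduction might look like follows. It assumes the repeated JAR lookups come from a fresh XPathFactory.newInstance() call on each parse, which I have not confirmed is what Nokogiri does. The class name XPathFactoryLookupRepro and the timeLookups method are made up for illustration.

```java
import javax.xml.xpath.XPathFactory;

// Hypothetical reproduction: repeatedly asks JAXP for an XPathFactory so the
// lookup cost can be timed, or watched with a tracer attached to JarFile.
public class XPathFactoryLookupRepro {

    // Returns the elapsed nanoseconds for the given number of factory lookups.
    static long timeLookups(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            // Each call goes through the JAXP factory lookup again.
            XPathFactory factory = XPathFactory.newInstance();
            if (factory == null) {
                throw new IllegalStateException("no XPathFactory available");
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int iterations = 1000;
        long elapsed = timeLookups(iterations);
        System.out.println(iterations + " lookups took " + (elapsed / 1_000_000) + " ms");
    }
}
```

Running this with the tracer attached should show whether the getJarEntry calls above appear once per loop iteration.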

Eric Urban
Developer at Spiceworks



