Reduce performance penalty for using an EOFException to signal the end of the 
document.
---------------------------------------------------------------------------------------

                 Key: XERCESJ-1264
                 URL: https://issues.apache.org/jira/browse/XERCESJ-1264
             Project: Xerces2-J
          Issue Type: Improvement
          Components: JAXP (javax.xml.parsers)
    Affects Versions: 2.9.0
            Reporter: Michael Glavassevich


As part of its normal control flow the XMLEntityScanner will throw an 
EOFException when it reaches the end of the document.  For small documents, 
this can take up as much as 20-25% of the total execution time in the parser.  
Without changing the current programming model, most of this time can be 
recovered by caching the exception, so that a single instance is created once 
and reused (which eliminates the very expensive fillInStackTrace() on each 
creation).
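
A minimal sketch of the caching idea, not the actual Xerces change: the class 
and method names below (CachedEofSketch, signalEndOfDocument, parseOnce) are 
hypothetical and only stand in for the end-of-input branch of the scanner.  
The one shared EOFException is built once, so the stack trace is filled in a 
single time rather than once per document.

```java
import java.io.EOFException;

public class CachedEofSketch {
    // Created once, so Throwable.fillInStackTrace() runs a single time at
    // class initialization instead of once per parsed document.
    // Trade-off: the recorded stack trace describes this initializer, not
    // the place where the exception is later thrown.
    private static final EOFException END_OF_DOCUMENT = new EOFException();

    // Hypothetical stand-in for the end-of-input branch of a scanner's load().
    public static void signalEndOfDocument() throws EOFException {
        throw END_OF_DOCUMENT;
    }

    // Runs one simulated "parse" and returns the exception that ended it.
    public static EOFException parseOnce() {
        try {
            signalEndOfDocument();
            return null; // not reached; signalEndOfDocument() always throws
        } catch (EOFException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        EOFException first = parseOnce();
        EOFException second = parseOnce();
        // Both parses are ended by the very same exception object.
        System.out.println(first == second); // prints "true"
    }
}
```

Callers that catch EOFException keep working unchanged, which is why this 
leaves the programming model alone.  The cost is that the cached exception's 
stack trace is no longer meaningful for debugging.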

Wolfgang Hoschek's post [1] to the j-dev list on this subject in 2004:
=====================================================

I have a server app that parses millions of smallish documents.

Performance has been improved a lot by reusing XMLReaders. It's pretty good 
but could perhaps get better when studying the (perhaps dubious?) hints given 
by the java -server -Xprof snippet below (JDK 1.5 RC, xerces CVS head, not 
using the JDK internal xerces which appears to be twice as slow in this case, 
for whatever reason).

Accordingly, the theory is that throwing an (artificial) EOFException in 
XMLEntityScanner.load() at the end of each document consumes some 25% of the 
total execution time, probably due to the heavy nature of exceptions and in 
particular Throwable.fillInStackTrace(). Would it perhaps be possible (and 
correct) to avoid raising artificial exceptions for what appears to be normal 
program control flow (the documents and streams are fine)?

Here is the trace snippet:

          Stub + native   Method
  28.6%     0  +   487    java.lang.Throwable.fillInStackTrace
  28.6%     0  +   487    Total stub

   Thread-local ticks:
   0.1%     1             Blocked (of total)
   0.1%     2             Class loader
   0.1%     2             Compilation
   0.2%     3             Unknown: thread_state

Flat profile of 0.01 secs (1 total ticks): DestroyJavaVM

   Thread-local ticks:
100.0%     1             Blocked (of total)


Global summary of 35.44 seconds:
100.0%  1718             Received ticks
   0.7%    12             Received GC ticks
   9.7%   167             Compilation
   0.1%     2             Class loader
   0.2%     3             Unknown code

real    0m35.715s
user    0m34.170s
sys     0m0.190s

TRACE 300347:
         java.lang.Throwable.fillInStackTrace(Throwable.java:Unknown line)
         java.lang.Throwable.<init>(Throwable.java:181)
         java.lang.Exception.<init>(Exception.java:29)
         java.io.IOException.<init>(IOException.java:28)
         java.io.EOFException.<init>(EOFException.java:32)
         org.apache.xerces.impl.XMLEntityScanner.load(<Unknown Source>:Unknown line)
         org.apache.xerces.impl.XMLEntityScanner.skipSpaces(<Unknown Source>:Unknown line)
         org.apache.xerces.impl.XMLDocumentScannerImpl$TrailingMiscDispatcher.dispatch(<Unknown Source>:Unknown line)
         org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(<Unknown Source>:Unknown line)
         org.apache.xerces.parsers.DTDConfiguration.parse(<Unknown Source>:Unknown line)
         org.apache.xerces.parsers.DTDConfiguration.parse(<Unknown Source>:Unknown line)
         org.apache.xerces.parsers.XMLParser.parse(<Unknown Source>:Unknown line)
         org.apache.xerces.parsers.AbstractSAXParser.parse(<Unknown Source>:Unknown line)
         nu.xom.Builder.build(Builder.java:786)
         nu.xom.Builder.build(Builder.java:569)
         gov.lbl.dsd.firefish.trash.XMLXomBench.main(XMLXomBench.java:62)

I guess the relevant block is

XMLEntityScanner.load(...):
             ...
             if (changeEntity) {
                 fEntityManager.endEntity();
                 if (fCurrentEntity == null) {
                     throw new EOFException();
                 }
                 // handle the trailing edges
                 if (fCurrentEntity.position == fCurrentEntity.count) {
                     load(0, true);
                 }
             }

[1] http://mail-archives.apache.org/mod_mbox/xerces-j-dev/200409.mbox/[EMAIL 
PROTECTED]

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

