[ 
https://issues.apache.org/jira/browse/XERCESJ-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Glavassevich updated XERCESJ-1264:
------------------------------------------

    Fix Version/s: 2.9.1

> Reduce performance penalty for using an EOFException to signal the end of the 
> document.
> ---------------------------------------------------------------------------------------
>
>                 Key: XERCESJ-1264
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1264
>             Project: Xerces2-J
>          Issue Type: Improvement
>          Components: JAXP (javax.xml.parsers)
>    Affects Versions: 2.9.0
>            Reporter: Michael Glavassevich
>            Assignee: Michael Glavassevich
>             Fix For: 2.9.1
>
>
> As part of its normal control flow, the XMLEntityScanner throws an 
> EOFException when it reaches the end of the document.  For small documents, 
> this can take up as much as 20-25% of the total execution time in the parser. 
> Without changing the current programming model, most of this time can be 
> recovered by caching the exception (which eliminates the very expensive 
> fillInStackTrace() on creation).
>
> Wolfgang Hoschek's post [1] to the j-dev list on this subject in 2004:
> =====================================================
> I have a server app that parses millions of smallish documents.
> Performance has been improved a lot by reusing XMLReaders. It's pretty good 
> but could perhaps get better when studying the (perhaps dubious?) hints given 
> by the java -server -Xprof snippet below (JDK 1.5 RC, xerces CVS head, not 
> using the JDK internal xerces which appears to be twice as slow in this case, 
> for whatever reason).
> Accordingly, the theory is that throwing an (artificial) EOFException in 
> XMLEntityScanner.load() at the end of each document consumes some 25% of the 
> total execution time, probably due to the heavy nature of exceptions and in 
> particular Throwable.fillInStackTrace(). Would it perhaps be possible (and 
> correct) to avoid raising artificial exceptions for what appears to be normal 
> program control flow (the documents and streams are fine)?
> Here is the trace snippet:
>           Stub + native   Method
>   28.6%     0  +   487    java.lang.Throwable.fillInStackTrace
>   28.6%     0  +   487    Total stub
>    Thread-local ticks:
>    0.1%     1             Blocked (of total)
>    0.1%     2             Class loader
>    0.1%     2             Compilation
>    0.2%     3             Unknown: thread_state
> Flat profile of 0.01 secs (1 total ticks): DestroyJavaVM
>    Thread-local ticks:
> 100.0%     1             Blocked (of total)
> Global summary of 35.44 seconds:
> 100.0%  1718             Received ticks
>    0.7%    12             Received GC ticks
>    9.7%   167             Compilation
>    0.1%     2             Class loader
>    0.2%     3             Unknown code
> real    0m35.715s
> user    0m34.170s
> sys     0m0.190s
> TRACE 300347:
>          java.lang.Throwable.fillInStackTrace(Throwable.java:Unknown line)
>          java.lang.Throwable.<init>(Throwable.java:181)
>          java.lang.Exception.<init>(Exception.java:29)
>          java.io.IOException.<init>(IOException.java:28)
>          java.io.EOFException.<init>(EOFException.java:32)
>          org.apache.xerces.impl.XMLEntityScanner.load(<Unknown Source>:Unknown line)
>          org.apache.xerces.impl.XMLEntityScanner.skipSpaces(<Unknown Source>:Unknown line)
>          org.apache.xerces.impl.XMLDocumentScannerImpl$TrailingMiscDispatcher.dispatch(<Unknown Source>:Unknown line)
>          org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(<Unknown Source>:Unknown line)
>          org.apache.xerces.parsers.DTDConfiguration.parse(<Unknown Source>:Unknown line)
>          org.apache.xerces.parsers.DTDConfiguration.parse(<Unknown Source>:Unknown line)
>          org.apache.xerces.parsers.XMLParser.parse(<Unknown Source>:Unknown line)
>          org.apache.xerces.parsers.AbstractSAXParser.parse(<Unknown Source>:Unknown line)
>          nu.xom.Builder.build(Builder.java:786)
>          nu.xom.Builder.build(Builder.java:569)
>          gov.lbl.dsd.firefish.trash.XMLXomBench.main(XMLXomBench.java:62)
> I guess the relevant block is
> XMLEntityScanner.load(...):
>              ...
>              if (changeEntity) {
>                  fEntityManager.endEntity();
>                  if (fCurrentEntity == null) {
>                      throw new EOFException();
>                  }
>                  // handle the trailing edges
>                  if (fCurrentEntity.position == fCurrentEntity.count) {
>                      load(0, true);
>                  }
>              }
> [1] http://mail-archives.apache.org/mod_mbox/xerces-j-dev/200409.mbox/[EMAIL PROTECTED]
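
For readers of the archive, the caching idea from the issue description can be sketched roughly as below. This is only an illustration and not the actual Xerces change: the names EndOfDocument, CachedEOFException, END_OF_DOCUMENT, read() and countChars() are all hypothetical, and the fillInStackTrace() override is an extra assumption, added so the shared instance does not carry a stale stack trace. The scanner's real load() logic is replaced by a trivial stand-in.

```java
import java.io.EOFException;

/** Illustrative only; hypothetical names, not the Xerces implementation. */
final class EndOfDocument {

    /** EOFException subclass that never captures a stack trace. */
    static final class CachedEOFException extends EOFException {
        private static final long serialVersionUID = 1L;

        @Override
        public synchronized Throwable fillInStackTrace() {
            // Skip the expensive stack walk: this exception is only a
            // control-flow signal, so the trace would never be read.
            return this;
        }
    }

    /** Created once and rethrown at the end of every document. */
    static final EOFException END_OF_DOCUMENT = new CachedEOFException();

    /** Stand-in for a scanner's load(): throws when the input is exhausted. */
    static int read(char[] buffer, int position) throws EOFException {
        if (position >= buffer.length) {
            throw END_OF_DOCUMENT; // no allocation, no fillInStackTrace()
        }
        return buffer[position];
    }

    /** Reads until the end-of-document signal and returns the character count. */
    static int countChars(String document) {
        char[] buffer = document.toCharArray();
        int position = 0;
        try {
            while (true) {
                read(buffer, position);
                position++;
            }
        } catch (EOFException endOfDocument) {
            return position;
        }
    }

    public static void main(String[] args) {
        System.out.println(countChars("<a/>")); // prints 4
    }
}
```

Callers that catch EOFException keep working unchanged, which is the point of leaving the programming model alone. The trade-off is that the shared instance carries no useful stack trace, so it should only be used where the exception is an expected signal and not a real error.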

-- 
This message is automatically generated by JIRA.

