[
https://issues.apache.org/jira/browse/XERCESJ-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Glavassevich reassigned XERCESJ-1264:
---------------------------------------------
Assignee: Michael Glavassevich
> Reduce performance penalty for using an EOFException to signal the end of the document.
> ---------------------------------------------------------------------------------------
>
> Key: XERCESJ-1264
> URL: https://issues.apache.org/jira/browse/XERCESJ-1264
> Project: Xerces2-J
> Issue Type: Improvement
> Components: JAXP (javax.xml.parsers)
> Affects Versions: 2.9.0
> Reporter: Michael Glavassevich
> Assignee: Michael Glavassevich
>
> As part of its normal control flow the XMLEntityScanner will throw an
> EOFException when it reaches the end of the document. For small documents,
> this can take up as much as 20-25% of the total execution time in the parser.
> Without messing with the current programming model, most of this time can be
> recovered by caching the exception (which eliminates the very expensive
> fillInStackTrace() on creation).
> Wolfgang Hoschek's post [1] to the j-dev list on this subject in 2004:
> =====================================================
> I have a server app that parses millions of smallish documents.
> Performance has been improved a lot by reusing XMLReaders. It's pretty good
> but could perhaps get better when studying the (perhaps dubious?) hints given
> by the java -server -Xprof snippet below (JDK 1.5 RC, xerces CVS head, not
> using the JDK internal xerces which appears to be twice as slow in this case,
> for whatever reason).
> Accordingly, the theory is that throwing an (artificial) EOFException in
> XMLEntityScanner.load() at the end of each document consumes some 25% of the
> total execution time, probably due to the heavy nature of exceptions and in
> particular Throwable.fillInStackTrace(). Would it perhaps be possible (and
> correct) to avoid raising artificial exceptions for what appears to be normal
> program control flow (the documents and streams are fine)?
> Here is the trace snippet:
> Stub + native Method
> 28.6% 0 + 487 java.lang.Throwable.fillInStackTrace
> 28.6% 0 + 487 Total stub
> Thread-local ticks:
> 0.1% 1 Blocked (of total)
> 0.1% 2 Class loader
> 0.1% 2 Compilation
> 0.2% 3 Unknown: thread_state
> Flat profile of 0.01 secs (1 total ticks): DestroyJavaVM
> Thread-local ticks:
> 100.0% 1 Blocked (of total)
> Global summary of 35.44 seconds:
> 100.0% 1718 Received ticks
> 0.7% 12 Received GC ticks
> 9.7% 167 Compilation
> 0.1% 2 Class loader
> 0.2% 3 Unknown code
> real 0m35.715s
> user 0m34.170s
> sys 0m0.190s
> TRACE 300347:
>     java.lang.Throwable.fillInStackTrace(Throwable.java:Unknown line)
>     java.lang.Throwable.<init>(Throwable.java:181)
>     java.lang.Exception.<init>(Exception.java:29)
>     java.io.IOException.<init>(IOException.java:28)
>     java.io.EOFException.<init>(EOFException.java:32)
>     org.apache.xerces.impl.XMLEntityScanner.load(<Unknown Source>:Unknown line)
>     org.apache.xerces.impl.XMLEntityScanner.skipSpaces(<Unknown Source>:Unknown line)
>     org.apache.xerces.impl.XMLDocumentScannerImpl$TrailingMiscDispatcher.dispatch(<Unknown Source>:Unknown line)
>     org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(<Unknown Source>:Unknown line)
>     org.apache.xerces.parsers.DTDConfiguration.parse(<Unknown Source>:Unknown line)
>     org.apache.xerces.parsers.DTDConfiguration.parse(<Unknown Source>:Unknown line)
>     org.apache.xerces.parsers.XMLParser.parse(<Unknown Source>:Unknown line)
>     org.apache.xerces.parsers.AbstractSAXParser.parse(<Unknown Source>:Unknown line)
>     nu.xom.Builder.build(Builder.java:786)
>     nu.xom.Builder.build(Builder.java:569)
>     gov.lbl.dsd.firefish.trash.XMLXomBench.main(XMLXomBench.java:62)
> I guess the relevant block is
> XMLEntityScanner.load(...):
>     ...
>     if (changeEntity) {
>         fEntityManager.endEntity();
>         if (fCurrentEntity == null) {
>             throw new EOFException();
>         }
>         // handle the trailing edges
>         if (fCurrentEntity.position == fCurrentEntity.count) {
>             load(0, true);
>         }
>     }
> [1] http://mail-archives.apache.org/mod_mbox/xerces-j-dev/200409.mbox/[EMAIL PROTECTED]
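
The caching idea described in the issue can be illustrated with a minimal
sketch. This is not the Xerces code and not the committed patch: the class
CachedEofSketch, the field CACHED_EOF, and the methods load() and drain() are
all hypothetical stand-ins, and the int counter only simulates an input
buffer. The sketch assumes that one preallocated EOFException is thrown at
every end of input, so the constructor (and with it fillInStackTrace()) runs
once instead of once per document.

```java
import java.io.EOFException;

class CachedEofSketch {
    // Hypothetical cache: a single EOFException created once at class
    // initialization and rethrown at every end of input.
    private static final EOFException CACHED_EOF = new EOFException();

    // Stand-in for a scanner's load step: consumes one unit of input and
    // throws the cached instance when nothing remains.
    static int load(int remaining) throws EOFException {
        if (remaining <= 0) {
            throw CACHED_EOF; // no allocation and no fillInStackTrace() here
        }
        return remaining - 1;
    }

    // Reads a simulated document of the given length to its end and returns
    // the exception that signalled the end of input.
    static EOFException drain(int length) {
        int remaining = length;
        try {
            while (true) {
                remaining = load(remaining);
            }
        } catch (EOFException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        EOFException first = drain(3);
        EOFException second = drain(5);
        // Both documents ended with the very same exception object.
        System.out.println(first == second);
    }
}
```

The trade-off is that the cached exception carries the stack trace captured
when it was first created, so it is only suitable as a control-flow signal
that callers catch and discard, never as a diagnostic.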