Hi Michael,
if you used
parser->useScanner(XMLUni::fgWFXMLScanner) on the
SAX1 parser, you should call
reader->setProperty(XMLUni::fgXercesScannerName,
XMLUni::fgWFXMLScanner) on the SAX2 reader, so that
both use the same scanner and yield comparable results.
BTW, the WF in the WFXMLScanner stands for
well-formed, in the sense that the scanner is
written so that it performs only the checks for
well-formedness (no DTD or XMLSchema validation).
The other scanners available are DGXMLScanner
(for DTD validation only), SGXMLScanner (for
XMLSchema validation only) and IGXMLScanner (for
both DTD and XMLSchema validation).
Could you add the call and rerun the benchmark?
Thanks,
Alberto
At 16.47 09/03/2007 +0100, Michael Schmidt wrote:
Hi Alberto,
> BTW, in order to set the well-formed scanner, you have to call
> setProperty(XMLUni::fgXercesScannerName, XMLUni::fgWFXMLScanner)
What exactly do you mean by well-formed scanner?
What I am looking for is a scanner without
validation and without well-formedness checks. In the
end I am interested in the time needed for
tokenization of the input. Does a scanner like this exist?
In the meantime, I reran my experiments (with the
current SVN version). Here are the results:
Xerces SAX1 Interface:
----------------------
Data     Size    real (s)  user (s)  sys (s)  cpu (%)  throughput (MB/s)
------------------------------------------------------------------------
XMark    10MB        0.83       0.3        0    38.33              33.33
XMark    100MB       5.02      2.72     0.08    54.66              35.71
XMark    1000MB     46.44     22.41     0.76    49.33              43.15
XMark    5000MB    241.04    116.96     4.03    49.66              41.32
MEDLINE  656MB      32.15     22.55     0.58    71.33              28.36
ProtSeq  685MB      32.79     25.94     0.54       80              25.86
Xerces SAX2 Interface:
----------------------
Data     Size    real (s)  user (s)  sys (s)  cpu (%)  throughput (MB/s)
------------------------------------------------------------------------
XMark    10MB        0.77      0.43     0.01       59              22.72
XMark    100MB       5.82       4.2     0.08    73.33              23.36
XMark    1000MB     56.51     42.81     0.88       77              22.88
XMark    5000MB    292.28     214.1     4.38       74              22.88
MEDLINE  656MB      54.19     44.54      0.6       83              14.53
ProtSeq  685MB      56.37      45.9     0.59       82              14.73
I did not use
"setProperty(XMLUni::fgXercesScannerName,
XMLUni::fgWFXMLScanner)" in my experiments.
Summarizing the results, SAX1 seems to be faster
than SAX2 by a factor of about 2, so the experiments
confirm what you expected, provided that the scanners are comparable?
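
To put a number on that factor, here is a small sketch that recomputes the throughput and the SAX1/SAX2 ratio from the tables. It assumes the throughput column is size divided by (user + sys) time; that formula is inferred from the figures (for example 10 / (0.3 + 0) = 33.33), not stated anywhere. The names Run, kSax1, kSax2, throughput, speedup and printComparison are made up for illustration.

```cpp
#include <cassert>
#include <cstdio>

// One benchmark row: data set label, input size in MB, user and sys CPU seconds.
struct Run {
    const char* data;
    double sizeMB;
    double userS;
    double sysS;
};

// Figures copied from the SAX1 and SAX2 tables, in the same row order.
const Run kSax1[] = {
    {"XMark 10MB", 10, 0.3, 0},        {"XMark 100MB", 100, 2.72, 0.08},
    {"XMark 1000MB", 1000, 22.41, 0.76}, {"XMark 5000MB", 5000, 116.96, 4.03},
    {"MEDLINE 656MB", 656, 22.55, 0.58}, {"ProtSeq 685MB", 685, 25.94, 0.54},
};
const Run kSax2[] = {
    {"XMark 10MB", 10, 0.43, 0.01},    {"XMark 100MB", 100, 4.2, 0.08},
    {"XMark 1000MB", 1000, 42.81, 0.88}, {"XMark 5000MB", 5000, 214.1, 4.38},
    {"MEDLINE 656MB", 656, 44.54, 0.6},  {"ProtSeq 685MB", 685, 45.9, 0.59},
};
const int kRuns = sizeof(kSax1) / sizeof(kSax1[0]);

// Assumed definition: MB processed per second of CPU time (user + sys).
double throughput(const Run& r) {
    const double cpu = r.userS + r.sysS;
    assert(cpu > 0.0);
    return r.sizeMB / cpu;
}

// How many times higher the throughput of `fast` is compared to `slow`.
double speedup(const Run& fast, const Run& slow) {
    return throughput(fast) / throughput(slow);
}

// Prints one line per data set with both throughputs and their ratio.
void printComparison() {
    for (int i = 0; i < kRuns; ++i) {
        std::printf("%-14s SAX1 %6.2f MB/s  SAX2 %6.2f MB/s  ratio %.2f\n",
                    kSax1[i].data, throughput(kSax1[i]),
                    throughput(kSax2[i]), speedup(kSax1[i], kSax2[i]));
    }
}
```

Calling printComparison() on these figures gives ratios between roughly 1.5 (the small XMark inputs) and just under 2 (MEDLINE), so "a factor of about 2" is an upper estimate.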
Kind regards
Michael