Hi Alberto,

> BTW, in order to set the well-formed scanner, you have to call
> setProperty(XMLUni::fgXercesScannerName, XMLUni::fgWFXMLScanner)

What exactly do you mean by well-formed scanner? What I am looking for is a scanner without validation and without well-formedness checks. In the end, I am only interested in the time needed to tokenize the input. Does a scanner like this exist?

In the meantime, I have rerun my experiments (with the current SVN version). Here are the results:

Xerces SAX1 Interface:
----------------------
Data     size     real (s)  user (s)  sys (s)  cpu (%)  throughput (MB/s)
--------------------------------------------------------------------------------
XMark    10MB       0.83      0.3      0        38.33    33.33
XMark    100MB      5.02      2.72     0.08     54.66    35.71
XMark    1000MB    46.44     22.41     0.76     49.33    43.15
XMark    5000MB   241.04    116.96     4.03     49.66    41.32
MEDLINE  656MB     32.15     22.55     0.58     71.33    28.36
ProtSeq  685MB     32.79     25.94     0.54     80       25.86

Xerces SAX2 Interface:
----------------------
Data     size     real (s)  user (s)  sys (s)  cpu (%)  throughput (MB/s)
--------------------------------------------------------------------------------
XMark    10MB       0.77      0.43     0.01     59       22.72
XMark    100MB      5.82      4.2      0.08     73.33    23.36
XMark    1000MB    56.51     42.81     0.88     77       22.88
XMark    5000MB   292.28    214.1      4.38     74       22.88
MEDLINE  656MB     54.19     44.54     0.6      83       14.53
ProtSeq  685MB     56.37     45.9      0.59     82       14.73

I did not use "setProperty(XMLUni::fgXercesScannerName, XMLUni::fgWFXMLScanner)" in my experiments.

Summarizing the results, SAX1 seems to be faster than SAX2 by a factor of roughly 1.5 to 2, so the experiments confirm what you expected, assuming the scanners are comparable?

Kind regards
Michael
