Michael Schmidt wrote:
Hi,
I'm currently running some benchmarks with Xerces. I want to measure the throughput of the parser with empty event handlers. As a starting point, I took the SAX2Count sample (in the samples directory) and replaced its event handlers (which simply count tags and character data) with empty implementations.
On my 2GB Duo Core processor, I've measured a throughput of about 10MB/s (I
measured different documents of differing sizes, e.g. XMark, Medline and
Protein Sequence data; they all gave values around 10MB/s). This value does
not seem very high to me. On the web, I found some benchmarks of Java parsers
reporting more than 30MB/s. Does anybody know whether the implementation of this
sample is efficient? Is there any official benchmark implementation? Or
anything else you would recommend?
Benchmarks are a very tricky thing. Are those Java parsers all
fully conformant? How do the machines on which those parsers were tested
differ from yours? I would not trust a benchmark that is not carefully
designed and controlled.
Xerces-C can be very sensitive to several factors, including the compiler
used to build the binaries, and the OS memory allocation functions. Since
you don't mention your OS or compiler, it's hard to say if there's anything
you can do to get better results.
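
If it helps to make the measurement itself more controlled, here is a minimal sketch of a timing harness in plain C++. It is only an illustration, not an official benchmark: `bestOfSeconds`, `throughputMBps` and the `parseOnce` callable are hypothetical names, and it assumes the parser is already initialized and the empty handlers are installed before timing starts.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <limits>

// Convert a byte count and elapsed seconds into MB/s.
// Assumption: 1 MB = 10^6 bytes; state the convention when comparing
// against numbers published elsewhere.
inline double throughputMBps(std::size_t bytes, double seconds) {
    assert(seconds > 0.0);
    return (static_cast<double>(bytes) / 1e6) / seconds;
}

// Run `parseOnce` `runs` times and return the fastest wall-clock time in
// seconds. `parseOnce` is a hypothetical stand-in for the work being
// measured, e.g. a lambda that calls parse() on a reader whose handlers
// are empty. Taking the minimum reduces noise from the OS and file cache.
inline double bestOfSeconds(const std::function<void()>& parseOnce, int runs) {
    assert(runs > 0);
    using Clock = std::chrono::steady_clock;
    double best = std::numeric_limits<double>::infinity();
    for (int i = 0; i < runs; ++i) {
        const auto start = Clock::now();
        parseOnce();
        const std::chrono::duration<double> elapsed = Clock::now() - start;
        best = std::min(best, elapsed.count());
    }
    return best;
}
```

With something like this, the figure to report would be `throughputMBps(fileSizeInBytes, bestOfSeconds(parseOnce, runs))`, together with the compiler, build flags and OS used, so that the number can be compared meaningfully.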
Dave