Re: Regarding Xercesc++ performance

2013-06-14 Thread Boris Kolpackov
Hi Rinil,

Baxi, Rinil Rushabh rinil.b...@hp.com writes:

 I have 2 Xerces-C++ libraries available on my platform (2.4 and 3.1).
 Both are built without threads. I am trying to compare the performance
 of the two. To do so, I parse XML files of different sizes with the
 samples (1 KB, 65 KB, 256 KB, 1 MB, 2 MB, 5 MB and 15 MB). I have put
 each sample in a script and run the same sample 1000 times to compare
 the parsing time.

Hm, I wouldn't do it like that. Running the sample from a script means
every measurement also includes process start-up and library
initialization. I would make the test itself perform 1000 iterations
and also include a few warm-up iterations. If you are interested,
CodeSynthesis XSD[1], which is based on Xerces-C++, includes
'performance' examples that show how to do this. They also show how
to configure Xerces-C++ parsers for optimal performance (things like
schema preloading, etc.). The one for DOM is in examples/cxx/tree/ and
the one for SAX2 is in examples/cxx/parser/.
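
To make the suggestion concrete, here is a minimal sketch of an in-process
timing loop with warm-up iterations. It is not taken from the XSD
'performance' examples; the names run_benchmark and BenchResult are
hypothetical, and the callable stands in for one parse of the test
document.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical result type for illustration; not part of Xerces-C++ or XSD.
struct BenchResult {
    std::size_t iterations;   // number of timed iterations
    double total_seconds;     // wall-clock time spent in the timed loop
    double mean_seconds() const { return total_seconds / iterations; }
};

// Runs `work` `warmup` times untimed, then `iterations` times under a
// monotonic clock. `work` is assumed to perform one parse of the document.
inline BenchResult run_benchmark(const std::function<void()>& work,
                                 std::size_t warmup,
                                 std::size_t iterations) {
    assert(iterations > 0);

    // Warm-up: let caches and the allocator settle before timing starts.
    for (std::size_t i = 0; i < warmup; ++i) work();

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) work();
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double> elapsed = stop - start;
    return BenchResult{iterations, elapsed.count()};
}
```

With Xerces-C++, the callable would presumably wrap a single parse call on
an already constructed parser, for example [&] { parser.parse(file); } for
a XercesDOMParser, so that process start-up and
XMLPlatformUtils::Initialize() stay outside the timed loop.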

 
 We observed that up to a 1 MB XML file size, Xerces-C++ 3.1 performs
 better; beyond that, its performance starts deteriorating.

We definitely tested the performance difference between the 2- and
3-series, and I am pretty sure Xerces-C++ 3 did consistently better.

[1] http://www.codesynthesis.com/products/xsd

Boris

-- 
Boris Kolpackov, Code Synthesis  http://codesynthesis.com/~boris/blog
Compiler-based ORM system for C++  http://codesynthesis.com/products/odb
Open-source XML data binding for C++   http://codesynthesis.com/products/xsd
XML data binding for embedded systems  http://codesynthesis.com/products/xsde

-
To unsubscribe, e-mail: c-dev-unsubscr...@xerces.apache.org
For additional commands, e-mail: c-dev-h...@xerces.apache.org



Re: Regarding Xercesc++ performance

2013-06-14 Thread Rob Cameron
Hi, Rinil.

What is your goal? If you are choosing between Xerces 2.4 and 3.1, here are
some other things to think about.
(a) Xerces 3.1 has better support for later XML standards
(b) Xerces 3.1 has bug fixes over 2.4
(c) Xerces 3.1 has support for 64-bit architectures
(d) Any future developments and improvements will likely only be made to the
Xerces 3.1 line.

If performance is critical, you may want to consider icXML. This is
a highly accelerated version of Xerces 3.1.1 that we are building
based on the systematic incorporation of parallel bit stream technology
in the underlying engine. icXML substantially speeds up both SAX-based
and DOM-based parsing.

We will be presenting our work with icXML at Balisage 2013 in Montreal
this August.

Rob Cameron
CTO, International Characters, Inc.

On Thu, Jun 13, 2013 at 10:22 PM, Baxi, Rinil Rushabh rinil.b...@hp.com wrote:
 Hi Dan,



 I have checked with both the SAX and DOM parsers and got almost the same
 result.



 Best Regards,

 Rinil



 From: Huantes, Dan F (TASC) [mailto:dan.huan...@tasc.com]
 Sent: Thursday, June 13, 2013 6:20 PM
 To: c-dev@xerces.apache.org
 Subject: RE: Regarding Xercesc++ performance



 Nice work.



 I’m curious as to whether your performance testing is DOM based, SAX based,
 or both.



 I ask because my anecdotal experience is that files exceeding 1 MB suffer
 large performance hits due to the inherent nature of the DOM model.  In
 these scenarios, I have used SAX because it’s far faster (seconds vs.
 minutes).  We used 2.8 before but never thought to compare the performance
 of different versions.  You may be on to something.  Thanks.



 Dan





 From: Baxi, Rinil Rushabh [mailto:rinil.b...@hp.com]
 Sent: Thursday, June 13, 2013 4:07 AM
 To: c-dev@xerces.apache.org
 Subject: Regarding Xercesc++ performance



 Hi All,



 I have 2 Xerces-C++ libraries available on my platform (2.4 and 3.1). Both
 are built without threads. I am trying to compare the performance of the
 two. To do so, I parse XML files of different sizes with the samples
 (1 KB, 65 KB, 256 KB, 1 MB, 2 MB, 5 MB and 15 MB). I have put each sample
 in a script and run the same sample 1000 times to compare the parsing
 time.



 We observed that up to a 1 MB XML file size, Xerces-C++ 3.1 performs
 better; beyond that, its performance starts deteriorating. With the 15 MB
 XML file, the 3.1 sample takes almost 30% more time than the same sample
 built with 2.4.



 Please let me know whether this is the right method to measure performance.
 If not, how should we measure it? One more question: why is there such a
 performance degradation?



 Thanks in advance.



 Best Regards,

 Rinil

