Hi Marius,
One change which was made was the use of RTTI: dynamic_cast is used in
DOMCasts.hpp and DOMParentNode.cpp. These casts were needed for safety
and correctness. It would be nice to avoid them if possible, but my
understanding is that right now there isn't an alternative which can
provide the required safety. If you have any suggestions on this point,
I'm sure we would be very happy to hear of a better solution.
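To illustrate the trade-off in general terms, here is a minimal sketch
of a checked versus an unchecked downcast. The types and helper names
(Node, ElementImpl, TextImpl, asElementChecked, asElementUnchecked) are
hypothetical stand-ins and are not the actual code in DOMCasts.hpp.

```cpp
#include <cassert>

// Hypothetical stand-ins for a node hierarchy; not the real Xerces classes.
struct Node { virtual ~Node() = default; };
struct ElementImpl : Node { int tagId = 7; };
struct TextImpl : Node { };

// Checked downcast: consults RTTI at run time and returns nullptr when
// the object is not actually an ElementImpl.
inline const ElementImpl* asElementChecked(const Node* n) {
    return dynamic_cast<const ElementImpl*>(n);
}

// Unchecked downcast: no run-time type check, so it is cheaper, but the
// behaviour is undefined if n does not point to an ElementImpl.
inline const ElementImpl* asElementUnchecked(const Node* n) {
    assert(n != nullptr);  // precondition assumed by this sketch
    return static_cast<const ElementImpl*>(n);
}
```

The checked variant pays for a type lookup on every call, which is the
kind of cost a call profiler should be able to confirm or rule out.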
I would suggest that the most useful investigation you could do is to
profile 3.1.4 and 3.2.2 using a suitable call profiler, so that you can
determine where Xerces is spending its time. I've done this in the past
on Linux with valgrind (callgrind). There are presumably similar tools
on Windows. With some concrete performance data, we can look at
improving the performance of the most critical parts of the codebase.
(Last time I did this, the biggest outlier was UTF-8 to UTF-16
conversion due to XMLCh being 16 bits. Were Xerces to switch to using
UTF-8 internally, that would be a huge performance boost for me.)
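For reference, the work involved in that conversion looks roughly like
the sketch below. This is a simplified, self-contained transcoder
written for illustration (utf8ToUtf16 is a hypothetical name), not the
Xerces transcoder. It assumes well-formed input apart from the two cases
handled in the comments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Minimal UTF-8 to UTF-16 transcoder (illustrative sketch only).
// Invalid lead bytes become U+FFFD; a truncated trailing sequence is dropped.
// Continuation bytes are not validated.
inline std::u16string utf8ToUtf16(const std::string& in) {
    std::u16string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        if (b0 < 0x80)              { cp = b0;        len = 1; }
        else if ((b0 >> 5) == 0x06) { cp = b0 & 0x1F; len = 2; }
        else if ((b0 >> 4) == 0x0E) { cp = b0 & 0x0F; len = 3; }
        else if ((b0 >> 3) == 0x1E) { cp = b0 & 0x07; len = 4; }
        else                        { cp = 0xFFFD;    len = 1; }
        if (i + len > in.size()) break;  // truncated sequence at end of input
        // Fold in the low six bits of each continuation byte.
        for (std::size_t k = 1; k < len; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        i += len;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP need a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    // Invariant: UTF-16 never needs more code units than UTF-8 has bytes.
    assert(out.size() <= in.size());
    return out;
}
```

Every input byte is inspected and every character is rewritten into a
new buffer, so for large documents this per-character work adds up.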
Kind regards,
Roger
On 05/03/2020 15:47, Marius Cojocaru wrote:
Hello,
After switching the Xerces libraries on our company project, we
noticed that the performance tests for our programs took a major
hit.
*Xerces 3.2.2 seems MUCH slower than Xerces 3.1.4* (or 3.1.1)
Depending on the type of performance test that was run, we saw up to
a 6x slowdown. The degradation affects both reading and
writing.
Our projects are running on multiple operating systems, but the
performance tests run on a Linux machine. However, the performance
degradation could be reproduced on Windows systems as well.
For Linux, the Xerces libraries were built using the commands
"./configure" and "make all".
For Windows, the Xerces libraries were built using the command "cmake
-G "Visual Studio 14 2015 Win64" -Dxmlch-type=wchar_t"
On Windows we also tried disabling some features (such as network
resources, with -Dnetwork:BOOL=OFF), and the resulting Visual Studio
project file was further modified to optimize for speed and to favor
speed over size. None of these changes made any notable improvement.
To show that the performance issues are not related to the company
project, I took one of the sample programs shipped with the 3.2.2
source code and used it as a reference while switching libraries.
The sample program is DOMCount.cpp, and the XML file used as input was
also one generated for Windows from the source code: ALL_BUILD.vcxproj
The DOMCount sample prints the time taken to read the XML file
provided as input. To make the results more meaningful, I added only
one line of code to the DOMCount program, so that it reads the
provided XML file 1000 times instead of once. (The timings scale
linearly anyway.)
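The idea of the change can be sketched as a small timing helper. This is
a generic illustration and not the attached program: timeRepeated is a
hypothetical name, and the callable stands in for the parse call in
DOMCount.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Runs `work` `iterations` times and returns the total elapsed time in
// milliseconds. Repeating the work averages out timer granularity and
// one-off start-up costs.
inline long long timeRepeated(const std::function<void()>& work,
                              int iterations) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(
               stop - start).count();
}
```

Building the same harness against each library version and passing in
the parse of the same input file gives two totals that can be compared
directly.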
The test machine has a Xeon CPU clocked at 2.6 GHz, so the results
might differ on other CPU types, but the degradation is clearly
visible. In the DOMCount program, *Xerces 3.2.2 was 2 times slower than
Xerces 3.1.4* for reading an XML file.
I attached the slightly altered DOMCount sample program, the input XML
file and a makefile to build this program (assuming that the Xerces
includes and libs are located where the makefile expects them).
I noticed there is a precedent for this performance slowdown in
an email from March 21, 2019 that was not answered by anyone:
http://apache-xml-project.6118.n7.nabble.com/Is-Xerces-c-3-2-2-slower-than-3-1-4-tc45258.html
The questions asked in that previous email were very pertinent and I
would like to reiterate them here:
* Is 3.2.2 supposed to be slower than 3.1.4, because it has more
features and security enhancements?
* Is this a known issue? Does it have any open bugs to be dealt with
in the future? (I could not find any and the last changelog entries
related to performance optimization were from 2009)
* Are there any configuration option flags (build switches or options)
that would make Xerces 3.2.2 as fast as 3.1.4?
Best regards,
Marius Cojocaru