Hi Marius,

One change which was made was the use of RTTI: dynamic_cast is used in DOMCasts.hpp and DOMParentNode.cpp. These casts were needed for safety and correctness. It would be nice to avoid them if possible, but my understanding is that right now there isn't an alternative which can provide the required safety. If you have any suggestions on this point, I'm sure we would be very happy to consider a better solution.
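As a rough illustration of the kind of safety a checked cast gives, here is a minimal sketch. The classes Node, ParentMixin, ElementImpl and TextImpl and the helper asParent are hypothetical stand-ins invented for this example; they are not the actual Xerces definitions.

```cpp
#include <string>

// Hypothetical public interface (NOT the real Xerces DOMNode).
struct Node {
    virtual ~Node() = default;
    virtual std::string name() const = 0;
};

// Hypothetical internal mixin that only some node types carry.
struct ParentMixin {
    virtual ~ParentMixin() = default;
    int childCount = 0;
};

// Has both bases, so a cross-cast from Node* to ParentMixin* is valid.
struct ElementImpl : Node, ParentMixin {
    std::string name() const override { return "element"; }
};

// Has only the Node base, so the cross-cast must fail.
struct TextImpl : Node {
    std::string name() const override { return "text"; }
};

// Checked cross-cast: yields nullptr when the object has no ParentMixin
// part. An unchecked C-style cast would compile here but would produce
// an invalid pointer for TextImpl.
inline ParentMixin* asParent(Node* n) {
    return dynamic_cast<ParentMixin*>(n);
}
```

The run-time type lookup in dynamic_cast is what costs time, which is why it would be nice to avoid it where the type is already known for certain.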

I would suggest that the most useful investigation you could do is to profile 3.1.4 and 3.2.2 using a suitable call profiler, so that you can determine where Xerces is spending its time.  I've done this in the past on Linux with valgrind (callgrind).  There are presumably similar tools on Windows.  With some concrete performance data, we can look at improving the performance of the most critical parts of the codebase.  (Last time I did this, the biggest outlier was UTF-8 to UTF-16 conversion due to XMLCh being 16 bits.  Were Xerces to switch to using UTF-8 internally, that would be a huge performance boost for me.)
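To show why transcoding can dominate a profile, here is a minimal sketch of a UTF-8 to UTF-16 conversion. It is not the Xerces transcoder: the function name utf8ToUtf16 is hypothetical, and the code assumes well-formed UTF-8 input and does not diagnose malformed bytes.

```cpp
#include <cstddef>
#include <string>

// Transcode UTF-8 to UTF-16. Assumption: the input is well-formed UTF-8;
// malformed sequences are not detected or reported.
inline std::u16string utf8ToUtf16(const std::string& in) {
    std::u16string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        // The lead byte determines the sequence length and payload bits.
        if (b0 < 0x80)      { cp = b0;        len = 1; }
        else if (b0 < 0xE0) { cp = b0 & 0x1F; len = 2; }
        else if (b0 < 0xF0) { cp = b0 & 0x0F; len = 3; }
        else                { cp = b0 & 0x07; len = 4; }
        if (i + len > in.size()) break;  // truncated trailing sequence: stop
        // Each continuation byte contributes six payload bits.
        for (std::size_t k = 1; k < len; ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        }
        i += len;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

Every input byte is inspected, masked and shifted, and the output is written to a second buffer. That is per-character work which a parser keeping UTF-8 internally would not need to do for UTF-8 documents.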


Kind regards,

Roger

On 05/03/2020 15:47, Marius Cojocaru wrote:
Hello,

After switching the Xerces libraries in our company project, we noticed a major degradation in the performance tests for our programs.
*Xerces 3.2.2 seems MUCH slower than Xerces 3.1.4* (or 3.1.1)
Depending on the type of performance test that was run, we saw up to a 6x slowdown. The degradation affects both reading and writing. Our projects run on multiple operating systems, but the performance tests run on a Linux machine. However, the performance degradation could be reproduced on Windows systems as well.

For Linux, the Xerces libraries were built using the commands "./configure" and "make all". For Windows, the Xerces libraries were built using the command "cmake -G "Visual Studio 14 2015 Win64" -Dxmlch-type=wchar_t"

On Windows we also tried disabling some options (such as the use of network resources, with -Dnetwork:BOOL=OFF), and the resulting Visual Studio project file was further modified to optimize for speed and favor speed over size. None of these changes made any notable improvement.

To show that the performance issues are not related to the company project, I took one of the sample programs included in the 3.2.2 source code and used it as a reference while switching libraries. The sample is DOMCount.cpp, and the XML used for reading was also one generated for Windows from the source code: ALL_BUILD.vcxproj. The DOMCount sample prints the time taken to read the XML file provided as input. To make the results more meaningful, I added only one line of code to the DOMCount program, so that it reads the provided XML file 1000 times instead of once (the results are linear anyway).
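The repeat-and-time approach can be sketched generically as below. This is only an illustration, not the modified DOMCount code: averageMillis is a hypothetical helper, and the callable passed to it stands in for whatever parse call is being measured.

```cpp
#include <chrono>
#include <functional>

// Run `work` `iterations` times and return the mean wall-clock time per
// run in milliseconds. Hypothetical helper; `iterations` must be > 0.
inline double averageMillis(const std::function<void()>& work, int iterations) {
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / iterations;
}
```

Building the same harness once against each library version and comparing the two averages gives a like-for-like figure.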

For the test machine, the CPU was a Xeon clocked at 2.6 GHz, so the results might differ depending on the CPU type, but the degradation is visible. In the DOMCount program, *Xerces 3.2.2 was 2 times slower than Xerces 3.1.4* when reading an XML file. I attached the slightly altered DOMCount sample program, the input XML and a makefile to build this program (assuming that the Xerces includes and libs are located where the makefile expects them).

I noticed there is a precedent for this performance slowdown in an email from March 21, 2019 that was not answered by anyone: http://apache-xml-project.6118.n7.nabble.com/Is-Xerces-c-3-2-2-slower-than-3-1-4-tc45258.html The questions asked in that previous email were very pertinent and I would like to reiterate them here:


* Is 3.2.2 supposed to be slower than 3.1.4, because it has more features and security enhancements?
* Is this a known issue? Does it have any open bugs to be dealt with in the future? (I could not find any and the last changelog entries related to performance optimization were from 2009)
* Are there any configuration option flags (build switches or options) that would make Xerces 3.2.2 as fast as 3.1.4?


Best regards,
Marius Cojocaru
