Performance problem with DOMNodeListImpl::item

Roger Leigh Tue, 07 Feb 2017 12:19:14 -0800

Hi folks,

When profiling an application to identify performance problems, I came across a worrying indication that there was a scalability problem internal to xerces-c. Further profiling showed exactly where this was (I've attached screenshots of the visualisation).

The code here (https://github.com/ome/ome-model/blob/master/ome-xml/src/main/cpp/ome/xml/model/detail/OMEModelObject.cpp#L123) is looping over the nodelist of an element, but most of the time is spent in DOMNodeListImpl::item. There's a thin RAII wrapper around the DOM classes and an STL-style iterator over the node list, but that doesn't account for the CPU and time usage.

Looking at the implementation in http://svn.apache.org/viewvc/xerces/c/branches/xerces-3.1/src/xercesc/dom/impl/DOMNodeListImpl.cpp?view=markup#l64 it looks like it's due to the indexed access being O(n) rather than O(1), which becomes O(n^2) when doing a linear iteration of the list. When the list contains over 20000 nodes (as in the above profiling), this becomes problematic. As cachegrind shows, it's blowing away the cache due to all the pointer walking. Operations which should take a few tens of milliseconds take multiple seconds, which results in a runtime of many minutes rather than a second or two.
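To make the arithmetic concrete, here is a minimal sketch that models the access pattern as I understand it: an item() that restarts from the first node on every call. It is not the Xerces code. Node, item and hops_indexed are hypothetical names used only for illustration, and the model counts pointer hops rather than measuring time.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a DOM node; only the sibling link matters here.
struct Node {
    Node* next = nullptr;
};

// Indexed access modelled on a list that restarts from the head on each call.
// Every pointer hop is added to `hops` so the cost is visible.
Node* item(Node* head, std::size_t index, std::size_t& hops) {
    Node* n = head;
    for (std::size_t i = 0; i < index && n != nullptr; ++i) {
        n = n->next;
        ++hops;
    }
    return n;
}

// Total pointer hops for a full index-based loop over a chain of n nodes.
std::size_t hops_indexed(std::size_t n) {
    std::vector<Node> nodes(n);
    for (std::size_t i = 0; i + 1 < n; ++i) {
        nodes[i].next = &nodes[i + 1];
    }
    std::size_t hops = 0;
    for (std::size_t i = 0; i < n; ++i) {
        item(nodes.data(), i, hops);  // access i costs i hops
    }
    return hops;  // 0 + 1 + ... + (n-1) = n(n-1)/2
}
```

In this model a loop over 20000 nodes costs hops_indexed(20000) = 199990000 pointer hops, where a single pass along the sibling links would need 19999.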


Questions:

- Is this a known problem? It looks like a fairly fundamental design problem with the node list.

- Is there any known workaround?

- Is there any alternative way to do a linear walk of the node list using the public API and avoiding the index-based access offered by the DOMNodeList API?
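On the last question, one candidate I can see is to skip the node list and follow the sibling links directly, since DOMNode exposes getFirstChild() and getNextSibling() in the public API. The sketch below is untested against Xerces itself. It is written as a template so it compiles without the Xerces headers, and for_each_child and MockNode are hypothetical names of my own.

```cpp
#include <cstddef>

// Visit each direct child of `parent` once by following sibling links.
// Node is any type offering getFirstChild() and getNextSibling(), as
// xercesc::DOMNode does. Assumes fn does not remove the node it is given.
template <typename Node, typename Fn>
void for_each_child(const Node* parent, Fn fn) {
    if (parent == nullptr) return;
    for (auto* child = parent->getFirstChild(); child != nullptr;
         child = child->getNextSibling()) {
        fn(child);
    }
}

// Hypothetical mock node, only for exercising the helper without Xerces.
struct MockNode {
    MockNode* first_child = nullptr;
    MockNode* next_sibling = nullptr;
    MockNode* getFirstChild() const { return first_child; }
    MockNode* getNextSibling() const { return next_sibling; }
};

// Builds a parent with three children and counts them through the helper.
std::size_t count_mock_children() {
    MockNode c1, c2, c3, parent;
    c1.next_sibling = &c2;
    c2.next_sibling = &c3;
    parent.first_child = &c1;
    std::size_t count = 0;
    for_each_child(&parent, [&count](MockNode*) { ++count; });
    return count;
}
```

Each child is reached with one pointer hop from the previous one, so a full walk is linear in the number of children. Like the child node list, it visits every child node type (text, comments and so on), so any filtering by node type would still be needed.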



Many thanks,
Roger
