Hi folks,
When profiling an application to identify performance problems, I came
across a worrying indication that there was a scalability problem
internal to xerces-c. Further profiling showed exactly where this was
(I've attached screenshots of the visualisation).
The code here
(https://github.com/ome/ome-model/blob/master/ome-xml/src/main/cpp/ome/xml/model/detail/OMEModelObject.cpp#L123)
is looping over the nodelist of an element, but... most of the time is
spent in DOMNodeListImpl::item. There's a thin RAII wrapper around the
DOM classes and an STL-style iterator over the node list, but that
doesn't account for the CPU and time usage.
Looking at the implementation in
http://svn.apache.org/viewvc/xerces/c/branches/xerces-3.1/src/xercesc/dom/impl/DOMNodeListImpl.cpp?view=markup#l64
it looks like it's due to the indexed access being O(n) rather than O(1)
which becomes O(n!) when doing a linear iteration of the list. When the
list contains over 20000 nodes (in the above profiling), this then
becomes problematic. At cachegrind shows, it's blowing away the cache
due to all the pointer walking, and this results in operations which
should take a few tens of milliseconds taking multiple seconds, which
results in a runtime of many minutes rather than a second or two.
Questions:
- Is this a known problem? It looks like a fairly fundamental design
problem with the node list
- Is there any known workaround?
- Is there any alternative way to do a linear walk of the node list
using the public API and avoiding the index-based access offered by the
DOMNodeList API?
Many thanks,
Roger