On Thu, Sep 24, 2015 at 2:27 AM, Dominic Cleal <[email protected]> wrote:
> Yes, I've seen something similar before - it was reported to us in the > context of a Puppet provider working on a huge file with many Nagios > service definitions. When lots of nodes with the same name, but > different index (e.g. service[1], service[2]) exist then Augeas is > extremely slow to traverse paths with a high index value. > > I spent a while profiling it and found a couple of very inefficient > memory operations - here's my branch: > > https://github.com/hercules-team/augeas/compare/master...domcleal:ns-filter-perf3 > > The problem is that it broke a couple of tests and I ran out of time > before I could find the root cause. > I looked at these changes for a bit, and unfortunately, the change that turns ns_add into something that blindly adds without checking for dupes breaks a few tests in test-xpath, since things like preceding-sibling can also produce the same node repeatedly. Clearly, as your test shows, the O(n^2) behavior of adding nodes to a nodeset is a big problem. I can think of a few ways to address that: - change nodeset so that it contains a hash table of the nodes to make checking for dupes fast(er) - since ns_add is only called in 4 places, and how we build nodesets is very local, make using it a little more complicated: add a flag 'added' to struct tree (pahole tells me we have a hole of 4 bytes in that data structure, i.e. plenty of space to store such a flag) and then have ns_add set that flag and check whether it is set when adding new nodes; make it mandatory to call ns_clear_added(ns) after being done with adding nodes to a nodeset, i.e. right after whatever loop that builds up the nodeset is done I also tinkered with a different way to cut down the time that `ns_filter` takes: just doing the `ns_remove` in batches (~ runs of non-matching nodes) drastically cuts down the time `ns_filter` takes and doesn't introduce any memory management headaches. The patches in https://github.com/hercules-team/augeas/pull/303 change ns_add as described above and change the internals of ns_filter. David
_______________________________________________ augeas-devel mailing list [email protected] https://www.redhat.com/mailman/listinfo/augeas-devel
