On Thu, Oct 1, 2015 at 11:44 AM, Laine Stump <[email protected]> wrote:
> On 09/22/2015 03:18 PM, Laine Stump wrote:
>
>> It was bound to happen eventually. Someone created a host with 514 vlan
>> interfaces each connected to a host bridge, then started up virt-manager.
>> [blah blah boring blah removed]
>>
> To update those not included in a separate thread on the topic in
> netcf-devel (I'll try to keep all discussion here from now on):
>
> Dan Berrange pointed out that netcf was calling aug_load() on each entry
> to a public netcf API, and libvirt was calling netcf APIs multiple times
> for each interface. Even though aug_load() checks the mtime of files it has
> already loaded, and avoids re-loading those that haven't been modified (in
> this case none have been modified), it turns out that just doing a stat()
> of 1100 files takes a significant amount of time. So I modified netcf to
> only call aug_load() to do this check if it has been at least 1 second
> since the last time it was called. This made a very large improvement,
> especially when running the upstream versions of all involved packages
> (virt-manager --> libvirt --> netcf --> augeas). But when running the
> versions that are included in RHEL6, it wasn't so rosy. A test setup of 514
> bridge+vlan interfaces which took around 30 minutes (!!) to complete a full
> startup of virt-manager (which calls netcf/augeas to list all interfaces,
> then get the XML config for them) now takes 13 minutes with netcf modified
> to call aug_load() only once per second. (the same operation takes "only" 8
> minutes using all upstream code).
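For readers following along, the once-per-second throttle described above could look roughly like the sketch below. This is an illustration only, not netcf's actual code: the names reload_throttle and reload_due are hypothetical, and the caller is assumed to pass in the current time and to call aug_load() itself when the helper says a check is due.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical state for rate-limiting an expensive "did anything change?"
 * check such as aug_load(). Not part of netcf or augeas. */
struct reload_throttle {
    bool   has_checked;   /* false until the first check has been done */
    time_t last_check;    /* time of the most recent check */
    double min_interval;  /* minimum seconds between checks, e.g. 1.0 */
};

/* Return true if the caller should run the expensive check now.
 * When it returns true it also records 'now' as the time of the check. */
static bool reload_due(struct reload_throttle *t, time_t now)
{
    if (t->has_checked && difftime(now, t->last_check) < t->min_interval)
        return false;     /* checked too recently, reuse the cached view */
    t->has_checked = true;
    t->last_check = now;
    return true;
}
```

A caller would wrap each public entry point with something like "if (reload_due(&t, time(NULL))) aug_load(aug);". The trade-off is the one raised later in this thread: for up to min_interval seconds the cached tree may not reflect the files on disk.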
>
> But 13 (or even 8) minutes is still a very long time, so I played around a
> bit in gdb and found that most of the time now seems to be spent in one
> call to aug_match():
>
>
> r = aug_match(aug, path, "/files/etc/sysconfig/network-scripts/*[ DEVICE = 'br1' or BRIDGE = 'br1' or MASTER = 'br1' or MASTER = ../*[BRIDGE = 'br1']/DEVICE ]/DEVICE");
>
> (this is the result of a call to netcf's aug_fmt_match() in the netcf
> function aug_get_xml_for_nif())
>
> When I step over that call to aug_match(), there is a very noticeable
> pause before the gdb prompt comes back, while continuing from that point
> all the way through virt-manager's "get all interfaces" loop back to the
> next call to aug_get_xml_for_nif() (including several other calls to
> aug_match() that have much simpler search expressions) seems to happen
> instantly.
>
> So apparently doing a match against all ifcfg files based on this complex
> match expression is really slowing us down. Any ideas on how to either make
> this expression simpler, or alternately how to get augeas doing the search
> more quickly?
>

Was that with the performance stuff I did a few days ago? (You'd need
Augeas HEAD for that.) Alternatively, can you send me your
/etc/sysconfig/network-scripts? (Fair warning: I will have no time to look
into this next week.)

>> I have two questions based on this:
>>
>> 1) has anyone thought about/looked into optimizing/changing the data
>> structure used to store nodes in augeas to scale better with larger
>> datasets (execution time seems to increase at > linear)?
>>

From what Dominic turned up, the problem doesn't seem to be so much the
data structure for the tree, as the fact that there was some O(n^2)
behavior in building intermediate data structures.

>> 2) I recall that a long time ago augeas put in code to re-read/parse files
>> only if they had been modified.
>> netcf (and thus libvirt) could take
>> advantage of this info if it was available in the augeas API - the first
>> time it retrieved the info for an interface it would take a hit, but all
>> subsequent times could be much quicker.
>>
>
> About this one - I'm wondering how well it would work out for augeas to
> use inotify to learn about modifications to files (including the directory
> that the ifcfg files live in, in case a new file is created). It works okay
> for netcf to avoid calling aug_load() (as mentioned above), but it does
> make me a bit uncomfortable that we sometimes have a mistaken view of the
> config.
>

It would definitely be a possibility - we would still need to queue
notifications from inotify and only act on them when the user calls
aug_load to avoid things like destroying changes the user made; IOW, it
still needs to stay predictable when the tree changes based on changes in
the FS.

It's been a while since I've looked at inotify, but I think it would also
introduce a Linux dependency; we could work around that by only using it
where available, and falling back to today's behavior.

David
