On 6/26/20 11:34 PM, Chun-Yu Shei wrote:
> Hi,
>
> I was recently interested in whether portage could be sped up, since
> dependency resolution can sometimes take a while on slower machines.
> After generating some flame graphs with cProfile and vmprof, I found 3
> functions which seem to be called extremely frequently with the same
> arguments: catpkgsplit, use_reduce, and match_from_list. In the first
> two cases, it was simple to cache the results in dicts, while
> match_from_list was a bit trickier, since it seems to be a requirement
> that it return actual entries from the input "candidate_list". I also
> ran into some test failures if I did the caching after the
> mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
> function, so the caching is only done up to just before that point.
>
> The catpkgsplit change seems to definitely be safe, and I'm pretty sure
> the use_reduce one is too, since anything that could possibly change the
> result is hashed. I'm a bit less certain about the match_from_list one,
> although all tests are passing.
>
> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world"
> speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge
> -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec
> (2.5% improvement). Since the upgrade case is far more common, this
> would really help in daily use, and it shaves about 30 seconds off
> the time you have to wait to get to the [Yes/No] prompt (from ~90s to
> 60s) on my old Sandy Bridge laptop when performing normal upgrades.
>
> Hopefully, at least some of these patches can be incorporated, and please
> let me know if any changes are necessary.
>
> Thanks,
> Chun-Yu
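The dict-based caching described above is essentially memoization keyed on
the call arguments. A minimal sketch of that pattern, assuming the wrapped
function is pure and its positional arguments are hashable (the decorator
name and the commented-out usage are hypothetical, not taken from the
actual patches):

import functools

def memoize_by_args(func):
    """Cache results in a dict keyed by the (hashable) call arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        try:
            return cache[args]
        except KeyError:
            result = func(*args)
            cache[args] = result
            return result

    return wrapper

# Hypothetical usage: wrap a pure, frequently-called function such as
# catpkgsplit, assuming its arguments fully determine its result.
# catpkgsplit = memoize_by_args(catpkgsplit)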
Using global variables for caches like these causes a form of memory leak
for use cases involving long-running processes that need to work with many
different repositories (and perhaps multiple versions of those
repositories). There are at least a couple of different strategies that we
can use to avoid this form of memory leak:

1) Limit the scope of the caches so that they have some sort of garbage
collection life cycle. For example, it would be natural for the depgraph
class to have a local cache of use_reduce results, so that the cache can
be garbage collected along with the depgraph (a rough sketch of this
approach follows below).

2) Eliminate redundant calls. For example, redundant calls to catpkgsplit
can be avoided by constructing more _pkg_str instances, since catpkgsplit
is able to return early when its argument happens to be a _pkg_str
instance.

--
Thanks,
Zac
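As a rough illustration of strategy 1), the cache can live on the consumer
object so that it is collected together with that object. The class and
attribute names below are hypothetical stand-ins, not actual portage code:

class DepgraphLike:
    def __init__(self):
        # The cache is per-instance, so its memory is reclaimed when the
        # depgraph-like object goes away instead of growing for the life
        # of the process.
        self._use_reduce_cache = {}

    def _use_reduce_cached(self, depstring, uselist):
        key = (depstring, tuple(uselist))
        try:
            return self._use_reduce_cache[key]
        except KeyError:
            result = self._use_reduce(depstring, uselist)
            self._use_reduce_cache[key] = result
            return result

    def _use_reduce(self, depstring, uselist):
        # Stand-in for the real use_reduce() call.
        return (depstring, tuple(uselist))

Scoping the cache this way bounds its lifetime to a single dependency
calculation, which avoids the leak described above while still eliminating
repeated work within that calculation.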