Re: [gentoo-portage-dev] Add caching to a few commonly used functions
Dnia June 28, 2020 3:42:33 AM UTC, Zac Medico napisał(a):
> On 6/27/20 8:12 PM, Michał Górny wrote:
>> Dnia June 28, 2020 3:00:00 AM UTC, Zac Medico napisał(a):
>>> On 6/26/20 11:34 PM, Chun-Yu Shei wrote:
>>>> Hi,
>>>>
>>>> I was recently interested in whether portage could be sped up, since
>>>> dependency resolution can sometimes take a while on slower machines.
>>>> After generating some flame graphs with cProfile and vmprof, I found 3
>>>> functions which seem to be called extremely frequently with the same
>>>> arguments: catpkgsplit, use_reduce, and match_from_list. In the first
>>>> two cases, it was simple to cache the results in dicts, while
>>>> match_from_list was a bit trickier, since it seems to be a requirement
>>>> that it return actual entries from the input "candidate_list". I also
>>>> ran into some test failures if I did the caching after the
>>>> mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
>>>> function, so the caching is only done up to just before that point.
>>>>
>>>> The catpkgsplit change seems to definitely be safe, and I'm pretty sure
>>>> the use_reduce one is too, since anything that could possibly change the
>>>> result is hashed. I'm a bit less certain about the match_from_list one,
>>>> although all tests are passing.
>>>>
>>>> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world"
>>>> speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge
>>>> -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec
>>>> (2.5% improvement). Since the upgrade case is far more common, this
>>>> would really help in daily use, and it shaves about 30 seconds off
>>>> the time you have to wait to get to the [Yes/No] prompt (from ~90s to
>>>> 60s) on my old Sandy Bridge laptop when performing normal upgrades.
>>>>
>>>> Hopefully, at least some of these patches can be incorporated, and
>>>> please let me know if any changes are necessary.
>>>>
>>>> Thanks,
>>>> Chun-Yu
>>>
>>> Using global variables for caches like these causes a form of memory
>>> leak for use cases involving long-running processes that need to work
>>> with many different repositories (and perhaps multiple versions of
>>> those repositories).
>>>
>>> There are at least a couple of different strategies that we can use to
>>> avoid this form of memory leak:
>>>
>>> 1) Limit the scope of the caches so that they have some sort of garbage
>>> collection life cycle. For example, it would be natural for the depgraph
>>> class to have a local cache of use_reduce results, so that the cache
>>> can be garbage collected along with the depgraph.
>>>
>>> 2) Eliminate redundant calls. For example, redundant calls to
>>> catpkgsplit can be avoided by constructing more _pkg_str instances,
>>> since catpkgsplit is able to return early when its argument happens to
>>> be a _pkg_str instance.
>>
>> I think the weak stuff from the standard library might also be helpful.
>>
>> --
>> Best regards,
>> Michał Górny
>
> Hmm, maybe weak global caches are an option?

It would probably be necessary to add hit/miss counters and compare results before and after.

--
Best regards,
Michał Górny
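The "weak global cache with hit/miss counters" idea being discussed could be sketched roughly as below. This is a hypothetical illustration, not portage's actual code; note that values stored in a WeakValueDictionary must be weak-referenceable, so plain tuples or strings would need to be wrapped in a small class (portage's _pkg_str, a str subclass, already qualifies).

```python
import weakref

class WeakValueCache:
    """Global cache whose values are held only weakly, so an entry
    disappears once nothing else references its value.  Hypothetical
    sketch of the idea discussed above, not portage's implementation."""

    def __init__(self):
        self._data = weakref.WeakValueDictionary()
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key, compute):
        try:
            value = self._data[key]
            self.hits += 1
        except KeyError:
            # compute() must return a weak-referenceable object
            # (instances of ordinary classes are; bare tuples are not).
            value = compute()
            self._data[key] = value
            self.misses += 1
        return value
```

Comparing the hits/misses counters before and after a change, as suggested above, would show whether the weak cache still pays for itself once entries can be collected.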
Re: [gentoo-portage-dev] Add caching to a few commonly used functions
On 6/27/20 8:12 PM, Michał Górny wrote:
> Dnia June 28, 2020 3:00:00 AM UTC, Zac Medico napisał(a):
>> On 6/26/20 11:34 PM, Chun-Yu Shei wrote:
>>> Hi,
>>>
>>> I was recently interested in whether portage could be sped up, since
>>> dependency resolution can sometimes take a while on slower machines.
>>> After generating some flame graphs with cProfile and vmprof, I found 3
>>> functions which seem to be called extremely frequently with the same
>>> arguments: catpkgsplit, use_reduce, and match_from_list. In the first
>>> two cases, it was simple to cache the results in dicts, while
>>> match_from_list was a bit trickier, since it seems to be a requirement
>>> that it return actual entries from the input "candidate_list". I also
>>> ran into some test failures if I did the caching after the
>>> mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
>>> function, so the caching is only done up to just before that point.
>>>
>>> The catpkgsplit change seems to definitely be safe, and I'm pretty sure
>>> the use_reduce one is too, since anything that could possibly change the
>>> result is hashed. I'm a bit less certain about the match_from_list one,
>>> although all tests are passing.
>>>
>>> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world"
>>> speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge
>>> -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec
>>> (2.5% improvement). Since the upgrade case is far more common, this
>>> would really help in daily use, and it shaves about 30 seconds off
>>> the time you have to wait to get to the [Yes/No] prompt (from ~90s to
>>> 60s) on my old Sandy Bridge laptop when performing normal upgrades.
>>>
>>> Hopefully, at least some of these patches can be incorporated, and
>>> please let me know if any changes are necessary.
>>>
>>> Thanks,
>>> Chun-Yu
>>
>> Using global variables for caches like these causes a form of memory
>> leak for use cases involving long-running processes that need to work
>> with many different repositories (and perhaps multiple versions of
>> those repositories).
>>
>> There are at least a couple of different strategies that we can use to
>> avoid this form of memory leak:
>>
>> 1) Limit the scope of the caches so that they have some sort of garbage
>> collection life cycle. For example, it would be natural for the depgraph
>> class to have a local cache of use_reduce results, so that the cache
>> can be garbage collected along with the depgraph.
>>
>> 2) Eliminate redundant calls. For example, redundant calls to
>> catpkgsplit can be avoided by constructing more _pkg_str instances,
>> since catpkgsplit is able to return early when its argument happens to
>> be a _pkg_str instance.
>
> I think the weak stuff from the standard library might also be helpful.
>
> --
> Best regards,
> Michał Górny

Hmm, maybe weak global caches are an option?

--
Thanks,
Zac
Re: [gentoo-portage-dev] Add caching to a few commonly used functions
Dnia June 28, 2020 3:00:00 AM UTC, Zac Medico napisał(a):
> On 6/26/20 11:34 PM, Chun-Yu Shei wrote:
>> Hi,
>>
>> I was recently interested in whether portage could be sped up, since
>> dependency resolution can sometimes take a while on slower machines.
>> After generating some flame graphs with cProfile and vmprof, I found 3
>> functions which seem to be called extremely frequently with the same
>> arguments: catpkgsplit, use_reduce, and match_from_list. In the first
>> two cases, it was simple to cache the results in dicts, while
>> match_from_list was a bit trickier, since it seems to be a requirement
>> that it return actual entries from the input "candidate_list". I also
>> ran into some test failures if I did the caching after the
>> mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
>> function, so the caching is only done up to just before that point.
>>
>> The catpkgsplit change seems to definitely be safe, and I'm pretty sure
>> the use_reduce one is too, since anything that could possibly change the
>> result is hashed. I'm a bit less certain about the match_from_list one,
>> although all tests are passing.
>>
>> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world"
>> speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge
>> -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec
>> (2.5% improvement). Since the upgrade case is far more common, this
>> would really help in daily use, and it shaves about 30 seconds off
>> the time you have to wait to get to the [Yes/No] prompt (from ~90s to
>> 60s) on my old Sandy Bridge laptop when performing normal upgrades.
>>
>> Hopefully, at least some of these patches can be incorporated, and
>> please let me know if any changes are necessary.
>>
>> Thanks,
>> Chun-Yu
>
> Using global variables for caches like these causes a form of memory
> leak for use cases involving long-running processes that need to work
> with many different repositories (and perhaps multiple versions of
> those repositories).
>
> There are at least a couple of different strategies that we can use to
> avoid this form of memory leak:
>
> 1) Limit the scope of the caches so that they have some sort of garbage
> collection life cycle. For example, it would be natural for the depgraph
> class to have a local cache of use_reduce results, so that the cache
> can be garbage collected along with the depgraph.
>
> 2) Eliminate redundant calls. For example, redundant calls to
> catpkgsplit can be avoided by constructing more _pkg_str instances,
> since catpkgsplit is able to return early when its argument happens to
> be a _pkg_str instance.

I think the weak stuff from the standard library might also be helpful.

--
Best regards,
Michał Górny
Re: [gentoo-portage-dev] Add caching to a few commonly used functions
On 6/26/20 11:34 PM, Chun-Yu Shei wrote:
> Hi,
>
> I was recently interested in whether portage could be sped up, since
> dependency resolution can sometimes take a while on slower machines.
> After generating some flame graphs with cProfile and vmprof, I found 3
> functions which seem to be called extremely frequently with the same
> arguments: catpkgsplit, use_reduce, and match_from_list. In the first
> two cases, it was simple to cache the results in dicts, while
> match_from_list was a bit trickier, since it seems to be a requirement
> that it return actual entries from the input "candidate_list". I also
> ran into some test failures if I did the caching after the
> mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
> function, so the caching is only done up to just before that point.
>
> The catpkgsplit change seems to definitely be safe, and I'm pretty sure
> the use_reduce one is too, since anything that could possibly change the
> result is hashed. I'm a bit less certain about the match_from_list one,
> although all tests are passing.
>
> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world"
> speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge
> -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec
> (2.5% improvement). Since the upgrade case is far more common, this
> would really help in daily use, and it shaves about 30 seconds off
> the time you have to wait to get to the [Yes/No] prompt (from ~90s to
> 60s) on my old Sandy Bridge laptop when performing normal upgrades.
>
> Hopefully, at least some of these patches can be incorporated, and
> please let me know if any changes are necessary.
>
> Thanks,
> Chun-Yu

Using global variables for caches like these causes a form of memory
leak for use cases involving long-running processes that need to work
with many different repositories (and perhaps multiple versions of those
repositories).

There are at least a couple of different strategies that we can use to
avoid this form of memory leak:

1) Limit the scope of the caches so that they have some sort of garbage
collection life cycle. For example, it would be natural for the depgraph
class to have a local cache of use_reduce results, so that the cache can
be garbage collected along with the depgraph.

2) Eliminate redundant calls. For example, redundant calls to
catpkgsplit can be avoided by constructing more _pkg_str instances,
since catpkgsplit is able to return early when its argument happens to
be a _pkg_str instance.

--
Thanks,
Zac
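Strategy 1 above — tying the cache's lifetime to the depgraph's — could be sketched as follows. The class and method names are illustrative only (not portage's real API), and the stand-in _use_reduce just drops conditional tokens rather than expanding them against the USE list as the real use_reduce does.

```python
class Depgraph:
    """Sketch of strategy 1: the cache lives on the instance, so it is
    garbage-collected together with the depgraph.  Hypothetical names."""

    def __init__(self):
        self._use_reduce_cache = {}

    def _cached_use_reduce(self, depstr, uselist):
        # The key must be hashable; frozenset makes the USE flags
        # order-independent.
        key = (depstr, frozenset(uselist))
        try:
            return self._use_reduce_cache[key]
        except KeyError:
            result = self._use_reduce_cache[key] = self._use_reduce(
                depstr, uselist)
            return result

    def _use_reduce(self, depstr, uselist):
        # Toy stand-in for portage.dep.use_reduce(): simply strips
        # "flag?" conditional tokens instead of evaluating them.
        return [tok for tok in depstr.split() if not tok.endswith("?")]
```

When the depgraph is discarded after a resolution run, the whole cache goes with it, which sidesteps the long-running-process leak described above.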
Re: [gentoo-portage-dev] Add caching to a few commonly used functions
On Sat, 27 Jun 2020 at 19:35, Fabian Groffen wrote:
>
> Hi Chun-Yu,
>
> > arguments: catpkgsplit, use_reduce, and match_from_list. In the first
> > two cases, it was simple to cache the results in dicts, while
> > match_from_list was a bit trickier, since it seems to be a requirement
> > that it return actual entries from the input "candidate_list". I also
> > ran into some test failures if I did the caching after the
> > mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
> > function, so the caching is only done up to just before that point.

You may also want to investigate the version-parsing logic that converts
versions into a data structure, partly because the last time I tried
profiling portage, every sample seemed to turn up in there. And I'd
expect to see a lot of commonality in this:

# qlist -I --format "%{PV}" | wc -c
14678
# qlist -I --format "%{PV}" | sort -u | wc -c
8811

And given this version-parsing path is exercised even for packages *not*
installed, I suspect the real-world implications are worse:

# find /usr/portage/ -name "*.ebuild" | sed 's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF "%{PV}" | wc -l
32604
# find /usr/portage/ -name "*.ebuild" | sed 's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF "%{PVR}" | sort -u | wc -l
10362
# find /usr/portage/ -name "*.ebuild" | sed 's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF "%{PV}" | sort -u | wc -l
7515

Obviously this is very crude analysis, but you can see there's room to
potentially no-op half of all version parses, though the speed/memory
tradeoff may not be worth it.

Note that this is not just "parse the version of the ebuild", which is
fast: my sampling seemed to indicate portage was parsing the version
afresh for every version comparison, which means internally it was
parsing the same version dozens of times over, which is much slower!

--
Kent

KENTNL - https://metacpan.org/author/KENTNL
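The "parse each version once, compare the parsed form many times" idea above can be sketched with a memoized parser. This is a deliberately simplified stand-in: the real parser (portage.versions) handles letter suffixes, _alpha/_beta/_rc/_p components and -rN revisions, which this toy regex ignores.

```python
import re
from functools import lru_cache

# Hypothetical simplified version parser; portage's real logic lives in
# portage.versions and is considerably more involved.
_ver_component = re.compile(r"\d+")

@lru_cache(maxsize=None)
def parse_version(ver):
    # Parse "1.2.10" into the comparable tuple (1, 2, 10).  The
    # lru_cache means repeated comparisons of the same version string
    # reuse the parsed tuple instead of re-parsing every time.
    return tuple(int(m) for m in _ver_component.findall(ver))

def vercmp(a, b):
    # Classic three-way compare: -1, 0, or 1.
    pa, pb = parse_version(a), parse_version(b)
    return (pa > pb) - (pa < pb)
```

parse_version.cache_info() exposes exactly the hit/miss counters mentioned elsewhere in the thread, which makes it easy to check how much re-parsing a given emerge run actually avoids.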
Re: [gentoo-portage-dev] Add caching to a few commonly used functions
Hi Fabian,

Just eyeballing htop's RES column while emerge is running, the max value
I see during "emerge -uDvpU --with-bdeps=y @world" increases from 272 MB
to 380 MB, so it looks to be around 110 MB of extra memory usage on my
system (with 1,094 total packages installed).

Chun-Yu

On Sat, Jun 27, 2020, 12:35 AM Fabian Groffen wrote:
>
> Hi Chun-Yu,
>
> On 26-06-2020 23:34:12 -0700, Chun-Yu Shei wrote:
> > Hi,
> >
> > I was recently interested in whether portage could be sped up, since
> > dependency resolution can sometimes take a while on slower machines.
> > After generating some flame graphs with cProfile and vmprof, I found 3
> > functions which seem to be called extremely frequently with the same
> > arguments: catpkgsplit, use_reduce, and match_from_list. In the first
> > two cases, it was simple to cache the results in dicts, while
> > match_from_list was a bit trickier, since it seems to be a requirement
> > that it return actual entries from the input "candidate_list". I also
> > ran into some test failures if I did the caching after the
> > mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
> > function, so the caching is only done up to just before that point.
> >
> > The catpkgsplit change seems to definitely be safe, and I'm pretty sure
> > the use_reduce one is too, since anything that could possibly change the
> > result is hashed. I'm a bit less certain about the match_from_list one,
> > although all tests are passing.
> >
> > With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world"
> > speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge
> > -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec
> > (2.5% improvement). Since the upgrade case is far more common, this
> > would really help in daily use, and it shaves about 30 seconds off
> > the time you have to wait to get to the [Yes/No] prompt (from ~90s to
> > 60s) on my old Sandy Bridge laptop when performing normal upgrades.
> >
> > Hopefully, at least some of these patches can be incorporated, and
> > please let me know if any changes are necessary.
>
> This sounds like a good job to me! Do you have any idea what the added
> memory pressure for these changes is?
>
> Thanks,
> Fabian
>
> > Thanks,
> > Chun-Yu
>
> --
> Fabian Groffen
> Gentoo on a different level
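As a more repeatable alternative to eyeballing htop's RES column, peak memory growth can be read from the kernel via the standard library. This is a generic sketch (Unix-only; resource is unavailable on Windows), not something from the patches under discussion; note ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.

```python
import resource
import sys

def peak_rss_mib():
    """Peak resident set size of the current process, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        peak //= 1024  # macOS reports bytes, Linux reports kilobytes
    return peak // 1024

# Example: compare peak RSS before and after growing a cache-like
# structure (a stand-in for the memoization dicts discussed above).
before = peak_rss_mib()
cache = {i: str(i) * 10 for i in range(200000)}
after = peak_rss_mib()
print(f"peak RSS grew by ~{after - before} MiB")
```

Capturing the number this way before and after a caching patch gives a comparable figure across runs, instead of a value that depends on when one happens to glance at htop.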
Re: [gentoo-portage-dev] Add caching to a few commonly used functions
Hi Chun-Yu,

On 26-06-2020 23:34:12 -0700, Chun-Yu Shei wrote:
> Hi,
>
> I was recently interested in whether portage could be sped up, since
> dependency resolution can sometimes take a while on slower machines.
> After generating some flame graphs with cProfile and vmprof, I found 3
> functions which seem to be called extremely frequently with the same
> arguments: catpkgsplit, use_reduce, and match_from_list. In the first
> two cases, it was simple to cache the results in dicts, while
> match_from_list was a bit trickier, since it seems to be a requirement
> that it return actual entries from the input "candidate_list". I also
> ran into some test failures if I did the caching after the
> mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
> function, so the caching is only done up to just before that point.
>
> The catpkgsplit change seems to definitely be safe, and I'm pretty sure
> the use_reduce one is too, since anything that could possibly change the
> result is hashed. I'm a bit less certain about the match_from_list one,
> although all tests are passing.
>
> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world"
> speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge
> -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec
> (2.5% improvement). Since the upgrade case is far more common, this
> would really help in daily use, and it shaves about 30 seconds off
> the time you have to wait to get to the [Yes/No] prompt (from ~90s to
> 60s) on my old Sandy Bridge laptop when performing normal upgrades.
>
> Hopefully, at least some of these patches can be incorporated, and
> please let me know if any changes are necessary.

This sounds like a good job to me! Do you have any idea what the added
memory pressure for these changes is?

Thanks,
Fabian

> Thanks,
> Chun-Yu

--
Fabian Groffen
Gentoo on a different level
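The dict-based caching Chun-Yu describes for catpkgsplit could look roughly like the sketch below. This is a stand-in, not the actual patch: the real catpkgsplit in portage.versions validates the category and version syntax, whereas this toy splitter only handles the common "category/package-version[-rN]" shape. It is also exactly the global-cache pattern whose memory-lifetime downside Zac discusses elsewhere in the thread.

```python
# Hypothetical illustration of caching catpkgsplit-style results in a
# module-level dict, keyed by the input string.
_catpkgsplit_cache = {}

def catpkgsplit_cached(cpv):
    """Split "cat/pkg-ver[-rN]" into (cat, pkg, ver, rev), memoized."""
    try:
        return _catpkgsplit_cache[cpv]
    except KeyError:
        cat, _, pv = cpv.partition("/")
        # Peel off a trailing -rN revision if present, else default to r0.
        if "-r" in pv and pv.rsplit("-r", 1)[1].isdigit():
            pv, rev_num = pv.rsplit("-r", 1)
            rev = "r" + rev_num
        else:
            rev = "r0"
        pkg, _, ver = pv.rpartition("-")
        result = _catpkgsplit_cache[cpv] = (cat, pkg, ver, rev)
        return result
```

Because the cache stores the result object itself, repeated calls with the same string return the identical tuple, which is what makes this kind of memoization cheap on the hot path.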