Re: [gentoo-portage-dev] Add caching to a few commonly used functions
Dnia June 28, 2020 3:42:33 AM UTC, Zac Medico napisał(a):
> On 6/27/20 8:12 PM, Michał Górny wrote:
>> Dnia June 28, 2020 3:00:00 AM UTC, Zac Medico napisał(a):
>>> On 6/26/20 11:34 PM, Chun-Yu Shei wrote:
>>>> Hi,
>>>>
>>>> I was recently interested in whether portage could be sped up, since dependency resolution can sometimes take a while on slower machines. After generating some flame graphs with cProfile and vmprof, I found 3 functions which seem to be called extremely frequently with the same arguments: catpkgsplit, use_reduce, and match_from_list. In the first two cases, it was simple to cache the results in dicts, while match_from_list was a bit trickier, since it seems to be a requirement that it return actual entries from the input "candidate_list". I also ran into some test failures if I did the caching after the mydep.unevaluated_atom.use and mydep.repo checks towards the end of the function, so the caching is only done up to just before that point.
>>>>
>>>> The catpkgsplit change seems to definitely be safe, and I'm pretty sure the use_reduce one is too, since anything that could possibly change the result is hashed. I'm a bit less certain about the match_from_list one, although all tests are passing.
>>>>
>>>> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world" speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec (2.5% improvement). Since the upgrade case is far more common, this would really help in daily use, and it shaves about 30 seconds off the time you have to wait to get to the [Yes/No] prompt (from ~90s to ~60s) on my old Sandy Bridge laptop when performing normal upgrades.
>>>>
>>>> Hopefully, at least some of these patches can be incorporated, and please let me know if any changes are necessary.
>>>>
>>>> Thanks,
>>>> Chun-Yu
>>>
>>> Using global variables for caches like these causes a form of memory leak for use cases involving long-running processes that need to work with many different repositories (and perhaps multiple versions of those repositories).
>>>
>>> There are at least a couple of different strategies that we can use to avoid this form of memory leak:
>>>
>>> 1) Limit the scope of the caches so that they have some sort of garbage collection life cycle. For example, it would be natural for the depgraph class to have a local cache of use_reduce results, so that the cache can be garbage collected along with the depgraph.
>>>
>>> 2) Eliminate redundant calls. For example, redundant calls to catpkgsplit can be avoided by constructing more _pkg_str instances, since catpkgsplit is able to return early when its argument happens to be a _pkg_str instance.
>>
>> I think the weak stuff from the standard library might also be helpful.
>>
>> --
>> Best regards,
>> Michał Górny
>
> Hmm, maybe weak global caches are an option?

It would probably be necessary to add hit/miss counters and compare the results before and after.

--
Best regards,
Michał Górny
Re: [gentoo-portage-dev] Add caching to a few commonly used functions
On 6/27/20 8:12 PM, Michał Górny wrote:
> Dnia June 28, 2020 3:00:00 AM UTC, Zac Medico napisał(a):
>> On 6/26/20 11:34 PM, Chun-Yu Shei wrote:
>>> Hi,
>>>
>>> I was recently interested in whether portage could be sped up, since dependency resolution can sometimes take a while on slower machines. After generating some flame graphs with cProfile and vmprof, I found 3 functions which seem to be called extremely frequently with the same arguments: catpkgsplit, use_reduce, and match_from_list. In the first two cases, it was simple to cache the results in dicts, while match_from_list was a bit trickier, since it seems to be a requirement that it return actual entries from the input "candidate_list". I also ran into some test failures if I did the caching after the mydep.unevaluated_atom.use and mydep.repo checks towards the end of the function, so the caching is only done up to just before that point.
>>>
>>> The catpkgsplit change seems to definitely be safe, and I'm pretty sure the use_reduce one is too, since anything that could possibly change the result is hashed. I'm a bit less certain about the match_from_list one, although all tests are passing.
>>>
>>> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world" speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec (2.5% improvement). Since the upgrade case is far more common, this would really help in daily use, and it shaves about 30 seconds off the time you have to wait to get to the [Yes/No] prompt (from ~90s to ~60s) on my old Sandy Bridge laptop when performing normal upgrades.
>>>
>>> Hopefully, at least some of these patches can be incorporated, and please let me know if any changes are necessary.
>>>
>>> Thanks,
>>> Chun-Yu
>>
>> Using global variables for caches like these causes a form of memory leak for use cases involving long-running processes that need to work with many different repositories (and perhaps multiple versions of those repositories).
>>
>> There are at least a couple of different strategies that we can use to avoid this form of memory leak:
>>
>> 1) Limit the scope of the caches so that they have some sort of garbage collection life cycle. For example, it would be natural for the depgraph class to have a local cache of use_reduce results, so that the cache can be garbage collected along with the depgraph.
>>
>> 2) Eliminate redundant calls. For example, redundant calls to catpkgsplit can be avoided by constructing more _pkg_str instances, since catpkgsplit is able to return early when its argument happens to be a _pkg_str instance.
>
> I think the weak stuff from the standard library might also be helpful.
>
> --
> Best regards,
> Michał Górny

Hmm, maybe weak global caches are an option?

--
Thanks,
Zac
Re: [gentoo-portage-dev] Add caching to a few commonly used functions
Dnia June 28, 2020 3:00:00 AM UTC, Zac Medico napisał(a):
> On 6/26/20 11:34 PM, Chun-Yu Shei wrote:
>> Hi,
>>
>> I was recently interested in whether portage could be sped up, since dependency resolution can sometimes take a while on slower machines. After generating some flame graphs with cProfile and vmprof, I found 3 functions which seem to be called extremely frequently with the same arguments: catpkgsplit, use_reduce, and match_from_list. In the first two cases, it was simple to cache the results in dicts, while match_from_list was a bit trickier, since it seems to be a requirement that it return actual entries from the input "candidate_list". I also ran into some test failures if I did the caching after the mydep.unevaluated_atom.use and mydep.repo checks towards the end of the function, so the caching is only done up to just before that point.
>>
>> The catpkgsplit change seems to definitely be safe, and I'm pretty sure the use_reduce one is too, since anything that could possibly change the result is hashed. I'm a bit less certain about the match_from_list one, although all tests are passing.
>>
>> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world" speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec (2.5% improvement). Since the upgrade case is far more common, this would really help in daily use, and it shaves about 30 seconds off the time you have to wait to get to the [Yes/No] prompt (from ~90s to ~60s) on my old Sandy Bridge laptop when performing normal upgrades.
>>
>> Hopefully, at least some of these patches can be incorporated, and please let me know if any changes are necessary.
>>
>> Thanks,
>> Chun-Yu
>
> Using global variables for caches like these causes a form of memory leak for use cases involving long-running processes that need to work with many different repositories (and perhaps multiple versions of those repositories).
>
> There are at least a couple of different strategies that we can use to avoid this form of memory leak:
>
> 1) Limit the scope of the caches so that they have some sort of garbage collection life cycle. For example, it would be natural for the depgraph class to have a local cache of use_reduce results, so that the cache can be garbage collected along with the depgraph.
>
> 2) Eliminate redundant calls. For example, redundant calls to catpkgsplit can be avoided by constructing more _pkg_str instances, since catpkgsplit is able to return early when its argument happens to be a _pkg_str instance.

I think the weak stuff from the standard library might also be helpful.

--
Best regards,
Michał Górny
Re: [gentoo-portage-dev] Add caching to a few commonly used functions
On 6/26/20 11:34 PM, Chun-Yu Shei wrote:
> Hi,
>
> I was recently interested in whether portage could be sped up, since dependency resolution can sometimes take a while on slower machines. After generating some flame graphs with cProfile and vmprof, I found 3 functions which seem to be called extremely frequently with the same arguments: catpkgsplit, use_reduce, and match_from_list. In the first two cases, it was simple to cache the results in dicts, while match_from_list was a bit trickier, since it seems to be a requirement that it return actual entries from the input "candidate_list". I also ran into some test failures if I did the caching after the mydep.unevaluated_atom.use and mydep.repo checks towards the end of the function, so the caching is only done up to just before that point.
>
> The catpkgsplit change seems to definitely be safe, and I'm pretty sure the use_reduce one is too, since anything that could possibly change the result is hashed. I'm a bit less certain about the match_from_list one, although all tests are passing.
>
> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world" speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec (2.5% improvement). Since the upgrade case is far more common, this would really help in daily use, and it shaves about 30 seconds off the time you have to wait to get to the [Yes/No] prompt (from ~90s to ~60s) on my old Sandy Bridge laptop when performing normal upgrades.
>
> Hopefully, at least some of these patches can be incorporated, and please let me know if any changes are necessary.
>
> Thanks,
> Chun-Yu

Using global variables for caches like these causes a form of memory leak for use cases involving long-running processes that need to work with many different repositories (and perhaps multiple versions of those repositories).

There are at least a couple of different strategies that we can use to avoid this form of memory leak:

1) Limit the scope of the caches so that they have some sort of garbage collection life cycle. For example, it would be natural for the depgraph class to have a local cache of use_reduce results, so that the cache can be garbage collected along with the depgraph.

2) Eliminate redundant calls. For example, redundant calls to catpkgsplit can be avoided by constructing more _pkg_str instances, since catpkgsplit is able to return early when its argument happens to be a _pkg_str instance.

--
Thanks,
Zac
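Strategy 1 can be sketched as a cache whose lifetime is tied to the object that owns it -- here a simplified stand-in for the depgraph class, not actual portage code:

```python
class Depgraph:
    """Simplified stand-in: the use_reduce cache is an instance attribute,
    so it is garbage collected together with the depgraph instead of
    accumulating forever in module scope."""

    def __init__(self):
        self._use_reduce_cache = {}

    def use_reduce(self, depstr):
        # Placeholder for the real dependency-string reduction; only the
        # cache life-cycle is the point of this sketch.
        try:
            return self._use_reduce_cache[depstr]
        except KeyError:
            result = self._use_reduce_cache[depstr] = depstr.split()
            return result
```

When the depgraph instance becomes unreachable, every cached result goes with it, so no global state outlives the resolution run.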
Re: [gentoo-portage-dev] Passing CFLAGS="-fdebug-prefix-map=..=$(readlink -f ..)"
On Sat, 2020-06-27 at 14:00, Joakim Tjernlund wrote:
> I am trying to add -fdebug-prefix-map to my CFLAGS but cannot get past
> "$(readlink -f ..)". Portage will not expand $(anything).
>
> Any way to make portage expand "$(readlink -f ..)"?
>
> Jocke

Found it. In /etc/portage/bashrc add:

if [ "$EBUILD_PHASE" = configure ]; then
	echo "Adding -fdebug-prefix-map to CFLAGS/CXXFLAGS"
	CFLAGS="$CFLAGS -fdebug-prefix-map=..=$(readlink -f ..)"
	CXXFLAGS="$CXXFLAGS -fdebug-prefix-map=..=$(readlink -f ..)"
fi
[gentoo-portage-dev] Passing CFLAGS="-fdebug-prefix-map=..=$(readlink -f ..)"
I am trying to add -fdebug-prefix-map to my CFLAGS but cannot get past
"$(readlink -f ..)". Portage will not expand $(anything).

Any way to make portage expand "$(readlink -f ..)"?

Jocke
Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
Dnia June 27, 2020 6:34:13 AM UTC, Chun-Yu Shei napisał(a):
> According to cProfile, catpkgsplit is called up to 1-5.5 million times
> during "emerge -uDvpU --with-bdeps=y @world". Adding a dict to cache its
> results reduces the time for this command from 43.53 -> 41.53 seconds --
> a 4.8% speedup.

Not saying caching is wrong for an interim solution but this is the kind of function where refactoring may yield even more gain.

> ---
>  lib/portage/versions.py | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/lib/portage/versions.py b/lib/portage/versions.py
> index 0c21373cc..ffec316ce 100644
> --- a/lib/portage/versions.py
> +++ b/lib/portage/versions.py
> @@ -312,6 +312,7 @@ def _pkgsplit(mypkg, eapi=None):
>
>  _cat_re = re.compile('^%s$' % _cat, re.UNICODE)
>  _missing_cat = 'null'
> +_catpkgsplit_cache = {}
>
>  def catpkgsplit(mydata, silent=1, eapi=None):
>  	"""
> @@ -331,6 +332,11 @@ def catpkgsplit(mydata, silent=1, eapi=None):
>  			return mydata.cpv_split
>  		except AttributeError:
>  			pass
> +
> +	cache_entry = _catpkgsplit_cache.get(mydata)
> +	if cache_entry is not None:
> +		return cache_entry
> +
>  	mysplit = mydata.split('/', 1)
>  	p_split = None
>  	if len(mysplit) == 1:
> @@ -343,6 +349,7 @@ def catpkgsplit(mydata, silent=1, eapi=None):
>  	if not p_split:
>  		return None
>  	retval = (cat, p_split[0], p_split[1], p_split[2])
> +	_catpkgsplit_cache[mydata] = retval
>  	return retval
>
>  class _pkg_str(_unicode):

--
Best regards,
Michał Górny
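Pending such a refactoring, one way to keep an interim cache from growing without bound is `functools.lru_cache`, since catpkgsplit takes a single hashable argument. A sketch only -- the split logic below is heavily simplified, and the real catpkgsplit also validates the category and parses version/revision:

```python
import functools

@functools.lru_cache(maxsize=10000)
def catpkgsplit_cached(mydata):
    # Simplified: split "cat/pkg-ver" on the first '/', substituting
    # 'null' when no category is present, as catpkgsplit does.
    cat, sep, rest = mydata.partition('/')
    if not sep:
        cat, rest = 'null', mydata
    return (cat, rest)
```

Unlike a bare module-level dict, the least recently used entries are discarded once maxsize is reached, which bounds the memory cost for long-running processes.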
[gentoo-portage-dev] [PATCH 2/3] Add caching to use_reduce function
This function is called extremely frequently with similar arguments, so this optimization reduces "emerge -uDvpU --with-bdeps=y @world" runtime from 43.5 -> 34.5s -- a 25.8% speedup.
---
 lib/portage/dep/__init__.py | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/lib/portage/dep/__init__.py b/lib/portage/dep/__init__.py
index 72988357a..df296dd81 100644
--- a/lib/portage/dep/__init__.py
+++ b/lib/portage/dep/__init__.py
@@ -404,6 +404,8 @@ def paren_enclose(mylist, unevaluated_atom=False, opconvert=False):
 			mystrparts.append(x)
 	return " ".join(mystrparts)

+_use_reduce_cache = {}
+
 def use_reduce(depstr, uselist=(), masklist=(), matchall=False, excludeall=(), is_src_uri=False, \
 	eapi=None, opconvert=False, flat=False, is_valid_flag=None, token_class=None, matchnone=False,
 	subset=None):
@@ -440,6 +442,27 @@ def use_reduce(depstr, uselist=(), masklist=(), matchall=False, excludeall=(), i
 	@rtype: List
 	@return: The use reduced depend array
 	"""
+	uselist_key = None
+	masklist_key = None
+	excludeall_key = None
+	subset_key = None
+	if uselist is not None:
+		uselist_key = tuple(uselist)
+	if masklist is not None:
+		masklist_key = tuple(masklist)
+	if excludeall is not None:
+		excludeall_key = tuple(excludeall)
+	if subset is not None:
+		subset_key = tuple(subset)
+	cache_key = (depstr, uselist_key, masklist_key, matchall, excludeall_key, \
+		is_src_uri, eapi, opconvert, flat, is_valid_flag, token_class, \
+		matchnone, subset_key)
+
+	cache_entry = _use_reduce_cache.get(cache_key)
+	if cache_entry is not None:
+		# The list returned by this function may be modified, so return a copy.
+		return cache_entry[:]
+
 	if isinstance(depstr, list):
 		if portage._internal_caller:
 			warnings.warn(_("Passing paren_reduced dep arrays to %s is deprecated. " + \
@@ -767,6 +790,9 @@ def use_reduce(depstr, uselist=(), masklist=(), matchall=False, excludeall=(), i
 			raise InvalidDependString(
 				_("Missing file name at end of string"))

+	# The list returned by this function may be modified, so store a copy.
+	_use_reduce_cache[cache_key] = stack[0][:]
+
 	return stack[0]

 def dep_opconvert(deplist):
--
2.27.0.212.ge8ba1cc988-goog
[gentoo-portage-dev] Add caching to a few commonly used functions
Hi,

I was recently interested in whether portage could be sped up, since dependency resolution can sometimes take a while on slower machines. After generating some flame graphs with cProfile and vmprof, I found 3 functions which seem to be called extremely frequently with the same arguments: catpkgsplit, use_reduce, and match_from_list. In the first two cases, it was simple to cache the results in dicts, while match_from_list was a bit trickier, since it seems to be a requirement that it return actual entries from the input "candidate_list". I also ran into some test failures if I did the caching after the mydep.unevaluated_atom.use and mydep.repo checks towards the end of the function, so the caching is only done up to just before that point.

The catpkgsplit change seems to definitely be safe, and I'm pretty sure the use_reduce one is too, since anything that could possibly change the result is hashed. I'm a bit less certain about the match_from_list one, although all tests are passing.

With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world" speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec (2.5% improvement). Since the upgrade case is far more common, this would really help in daily use, and it shaves about 30 seconds off the time you have to wait to get to the [Yes/No] prompt (from ~90s to ~60s) on my old Sandy Bridge laptop when performing normal upgrades.

Hopefully, at least some of these patches can be incorporated, and please let me know if any changes are necessary.

Thanks,
Chun-Yu
[gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
According to cProfile, catpkgsplit is called up to 1-5.5 million times during "emerge -uDvpU --with-bdeps=y @world". Adding a dict to cache its results reduces the time for this command from 43.53 -> 41.53 seconds -- a 4.8% speedup.
---
 lib/portage/versions.py | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/lib/portage/versions.py b/lib/portage/versions.py
index 0c21373cc..ffec316ce 100644
--- a/lib/portage/versions.py
+++ b/lib/portage/versions.py
@@ -312,6 +312,7 @@ def _pkgsplit(mypkg, eapi=None):

 _cat_re = re.compile('^%s$' % _cat, re.UNICODE)
 _missing_cat = 'null'
+_catpkgsplit_cache = {}

 def catpkgsplit(mydata, silent=1, eapi=None):
 	"""
@@ -331,6 +332,11 @@ def catpkgsplit(mydata, silent=1, eapi=None):
 			return mydata.cpv_split
 		except AttributeError:
 			pass
+
+	cache_entry = _catpkgsplit_cache.get(mydata)
+	if cache_entry is not None:
+		return cache_entry
+
 	mysplit = mydata.split('/', 1)
 	p_split = None
 	if len(mysplit) == 1:
@@ -343,6 +349,7 @@ def catpkgsplit(mydata, silent=1, eapi=None):
 	if not p_split:
 		return None
 	retval = (cat, p_split[0], p_split[1], p_split[2])
+	_catpkgsplit_cache[mydata] = retval
 	return retval

 class _pkg_str(_unicode):
--
2.27.0.212.ge8ba1cc988-goog
[gentoo-portage-dev] [PATCH 3/3] Add partial caching to match_from_list
This function is called frequently with similar arguments, so cache as much of the partial results as possible. It seems that "match_from_list" must return a list containing actual entries from the "candidate_list" argument, so we store frozensets in "_match_from_list_cache" and test items from "candidate_list" for membership in these sets. The filtering performed by the mydep.unevaluated_atom.use and mydep.repo checks towards the end of the function is also not cached, since this causes some test failures.

This results in a reduction of "emerge -uDvpU --with-bdeps=y @world" runtime from 43.53 -> 40.15 sec -- an 8.4% speedup.
---
 lib/portage/dep/__init__.py | 359 +++-
 1 file changed, 189 insertions(+), 170 deletions(-)

diff --git a/lib/portage/dep/__init__.py b/lib/portage/dep/__init__.py
index df296dd81..dbd23bb23 100644
--- a/lib/portage/dep/__init__.py
+++ b/lib/portage/dep/__init__.py
@@ -2174,6 +2174,8 @@ def best_match_to_list(mypkg, mylist):

 	return bestm

+_match_from_list_cache = {}
+
 def match_from_list(mydep, candidate_list):
 	"""
 	Searches list for entries that matches the package.
@@ -2197,209 +2199,226 @@ def match_from_list(mydep, candidate_list):

 	if not isinstance(mydep, Atom):
 		mydep = Atom(mydep, allow_wildcard=True, allow_repo=True)

-	mycpv = mydep.cpv
-	mycpv_cps = catpkgsplit(mycpv) # Can be None if not specific
-	build_id = mydep.build_id
+	cache_key = (mydep, tuple(candidate_list))
+	key_has_hash = True
+	cache_entry = None
+	if mydep.build_id is None and key_has_hash:
+		try:
+			cache_entry = _match_from_list_cache.get(cache_key)
+		except TypeError:
+			key_has_hash = False

-	if not mycpv_cps:
-		ver = None
-		rev = None
-	else:
-		cat, pkg, ver, rev = mycpv_cps
-		if mydep == mycpv:
-			raise KeyError(_("Specific key requires an operator"
-				" (%s) (try adding an '=')") % (mydep))
-
-	if ver and rev:
-		operator = mydep.operator
-		if not operator:
-			writemsg(_("!!! Invalid atom: %s\n") % mydep, noiselevel=-1)
-			return []
+	if cache_entry is not None:
+		# Note: the list returned by this function must contain actual entries
+		# from "candidate_list", so store frozensets in "_match_from_list_cache"
+		# and test items from "candidate_list" for membership in these sets.
+		mylist = [x for x in candidate_list if x in cache_entry]
 	else:
-		operator = None
-
-	mylist = []
+		mycpv = mydep.cpv
+		mycpv_cps = catpkgsplit(mycpv) # Can be None if not specific
+		build_id = mydep.build_id

-	if mydep.extended_syntax:
+		if not mycpv_cps:
+			ver = None
+			rev = None
+		else:
+			cat, pkg, ver, rev = mycpv_cps
+			if mydep == mycpv:
+				raise KeyError(_("Specific key requires an operator"
+					" (%s) (try adding an '=')") % (mydep))

-		for x in candidate_list:
-			cp = getattr(x, "cp", None)
-			if cp is None:
-				mysplit = catpkgsplit(remove_slot(x))
-				if mysplit is not None:
-					cp = mysplit[0] + '/' + mysplit[1]
+		if ver and rev:
+			operator = mydep.operator
+			if not operator:
+				writemsg(_("!!! Invalid atom: %s\n") % mydep, noiselevel=-1)
+				return []
+		else:
+			operator = None

-			if cp is None:
-				continue
+		mylist = []

-			if cp == mycpv or extended_cp_match(mydep.cp, cp):
-				mylist.append(x)
+		if mydep.extended_syntax:

-		if mylist and mydep.operator == "=*":
+			for x in candidate_list:
+				cp = getattr(x, "cp", None)
+				if cp is None:
+					mysplit = catpkgsplit(remove_slot(x))
+					if mysplit is not None:
+						cp = mysplit[0] + '/' + mysplit[1]

-			candidate_list = mylist
-			mylist = []
-			# Currently, only \*\w+\* is supported.
-			ver = mydep.version[1:-1]
+
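The identity-preserving trick described in this patch's commit message can be shown in isolation: cache a frozenset of matches, then rebuild each result by filtering candidate_list through it, so the returned items are always the caller's own objects. A sketch with a hypothetical match function, not the actual match_from_list logic:

```python
_match_cache = {}

def cached_match(atom, candidate_list, match_fn):
    # Store matches as a frozenset, then filter candidate_list through it,
    # so the returned list contains actual entries from candidate_list
    # rather than objects resurrected from the cache.
    key = (atom, tuple(candidate_list))
    matches = _match_cache.get(key)
    if matches is None:
        matches = frozenset(match_fn(atom, candidate_list))
        _match_cache[key] = matches
    return [x for x in candidate_list if x in matches]
```

On a cache hit the expensive match function is skipped entirely; only a cheap membership filter runs.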
Re: [gentoo-portage-dev] Add caching to a few commonly used functions
On Sat, 27 Jun 2020 at 19:35, Fabian Groffen wrote:
>
> Hi Chun-Yu,
>
> > arguments: catpkgsplit, use_reduce, and match_from_list. In the first two cases, it was simple to cache the results in dicts, while match_from_list was a bit trickier, since it seems to be a requirement that it return actual entries from the input "candidate_list". I also ran into some test failures if I did the caching after the mydep.unevaluated_atom.use and mydep.repo checks towards the end of the function, so the caching is only done up to just before that point.

You may also want to investigate the version-parsing logic that converts versions into a data structure, partly because the last time I tried profiling portage, every sample seemed to turn up in there. And I'd expect to see a lot of commonality in the versions themselves:

# qlist -I --format "%{PV}" | wc -c
14678
# qlist -I --format "%{PV}" | sort -u | wc -c
8811

And given this version-parsing path is exercised even for stuff *not* installed, I suspect the real-world implications are worse:

# find /usr/portage/ -name "*.ebuild" | sed 's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF "%{PV}" | wc -l
32604
# find /usr/portage/ -name "*.ebuild" | sed 's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF "%{PVR}" | sort -u | wc -l
10362
katipo2 ~ # find /usr/portage/ -name "*.ebuild" | sed 's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF "%{PV}" | sort -u | wc -l
7515

Obviously this is very crude analysis, but you can see there's room to potentially no-op half of all version parses. Though the speed/memory tradeoff may not be worth it.

Note that this is not just "parse the version of the ebuild", which is fast: my sampling seemed to indicate portage was parsing the version afresh for every version comparison, which means internally it was parsing the same version dozens of times over, which is much slower!

--
Kent

KENTNL - https://metacpan.org/author/KENTNL
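Kent's numbers suggest roughly half of all version parses are repeats, so memoizing the string-to-structure conversion could no-op them. A sketch only, with deliberately simplified numeric-only parsing rather than portage's real version grammar:

```python
import functools
import re

@functools.lru_cache(maxsize=None)
def parse_version(pv):
    # Convert the leading dotted-numeric part of a version string into a
    # tuple of ints once per distinct version, instead of re-parsing it
    # on every comparison; repeated comparisons then hit the cache.
    m = re.match(r'\d+(?:\.\d+)*', pv)
    return tuple(int(p) for p in m.group(0).split('.')) if m else ()
```

Comparing the resulting tuples gives correct numeric ordering (so "1.10.2" sorts after "1.9.9"), and each distinct version string is parsed exactly once per process.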
Re: [gentoo-portage-dev] Add caching to a few commonly used functions
Hi Fabian,

Just eyeballing htop's RES column while emerge is running, the max value I see during "emerge -uDvpU --with-bdeps=y @world" increases from 272 MB to 380 MB, so it looks to be around 110 MB of extra memory usage on my system (with 1,094 total packages installed).

Chun-Yu

On Sat, Jun 27, 2020, 12:35 AM Fabian Groffen wrote:
>
> Hi Chun-Yu,
>
> On 26-06-2020 23:34:12 -0700, Chun-Yu Shei wrote:
>> Hi,
>>
>> I was recently interested in whether portage could be sped up, since dependency resolution can sometimes take a while on slower machines. After generating some flame graphs with cProfile and vmprof, I found 3 functions which seem to be called extremely frequently with the same arguments: catpkgsplit, use_reduce, and match_from_list. In the first two cases, it was simple to cache the results in dicts, while match_from_list was a bit trickier, since it seems to be a requirement that it return actual entries from the input "candidate_list". I also ran into some test failures if I did the caching after the mydep.unevaluated_atom.use and mydep.repo checks towards the end of the function, so the caching is only done up to just before that point.
>>
>> The catpkgsplit change seems to definitely be safe, and I'm pretty sure the use_reduce one is too, since anything that could possibly change the result is hashed. I'm a bit less certain about the match_from_list one, although all tests are passing.
>>
>> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world" speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec (2.5% improvement). Since the upgrade case is far more common, this would really help in daily use, and it shaves about 30 seconds off the time you have to wait to get to the [Yes/No] prompt (from ~90s to ~60s) on my old Sandy Bridge laptop when performing normal upgrades.
>>
>> Hopefully, at least some of these patches can be incorporated, and please let me know if any changes are necessary.
>
> This sounds like a good job to me! Do you have any idea what the added memory pressure of these changes is?
>
> Thanks,
> Fabian
>
>> Thanks,
>> Chun-Yu
>
> --
> Fabian Groffen
> Gentoo on a different level
Re: [gentoo-portage-dev] Add caching to a few commonly used functions
Hi Chun-Yu,

On 26-06-2020 23:34:12 -0700, Chun-Yu Shei wrote:
> Hi,
>
> I was recently interested in whether portage could be sped up, since dependency resolution can sometimes take a while on slower machines. After generating some flame graphs with cProfile and vmprof, I found 3 functions which seem to be called extremely frequently with the same arguments: catpkgsplit, use_reduce, and match_from_list. In the first two cases, it was simple to cache the results in dicts, while match_from_list was a bit trickier, since it seems to be a requirement that it return actual entries from the input "candidate_list". I also ran into some test failures if I did the caching after the mydep.unevaluated_atom.use and mydep.repo checks towards the end of the function, so the caching is only done up to just before that point.
>
> The catpkgsplit change seems to definitely be safe, and I'm pretty sure the use_reduce one is too, since anything that could possibly change the result is hashed. I'm a bit less certain about the match_from_list one, although all tests are passing.
>
> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world" speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec (2.5% improvement). Since the upgrade case is far more common, this would really help in daily use, and it shaves about 30 seconds off the time you have to wait to get to the [Yes/No] prompt (from ~90s to ~60s) on my old Sandy Bridge laptop when performing normal upgrades.
>
> Hopefully, at least some of these patches can be incorporated, and please let me know if any changes are necessary.

This sounds like a good job to me! Do you have any idea what the added memory pressure of these changes is?

Thanks,
Fabian

> Thanks,
> Chun-Yu

--
Fabian Groffen
Gentoo on a different level