Re: [gentoo-portage-dev] Add caching to a few commonly used functions

2020-06-27 Thread Michał Górny
On June 28, 2020 3:42:33 AM UTC, Zac Medico wrote:
>On 6/27/20 8:12 PM, Michał Górny wrote:
>> On June 28, 2020 3:00:00 AM UTC, Zac Medico wrote:
>>> On 6/26/20 11:34 PM, Chun-Yu Shei wrote:
>>>> Hi,
>>>>
>>>> I was recently interested in whether portage could be sped up, since
>>>> dependency resolution can sometimes take a while on slower machines.
>>>> After generating some flame graphs with cProfile and vmprof, I found 3
>>>> functions which seem to be called extremely frequently with the same
>>>> arguments: catpkgsplit, use_reduce, and match_from_list.  In the first
>>>> two cases, it was simple to cache the results in dicts, while
>>>> match_from_list was a bit trickier, since it seems to be a requirement
>>>> that it return actual entries from the input "candidate_list".  I also
>>>> ran into some test failures if I did the caching after the
>>>> mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
>>>> function, so the caching is only done up to just before that point.
>>>>
>>>> The catpkgsplit change seems to definitely be safe, and I'm pretty sure
>>>> the use_reduce one is too, since anything that could possibly change the
>>>> result is hashed.  I'm a bit less certain about the match_from_list one,
>>>> although all tests are passing.
>>>>
>>>> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world"
>>>> speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup.  "emerge
>>>> -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec
>>>> (2.5% improvement).  Since the upgrade case is far more common, this
>>>> would really help in daily use, and it shaves about 30 seconds off
>>>> the time you have to wait to get to the [Yes/No] prompt (from ~90s to
>>>> 60s) on my old Sandy Bridge laptop when performing normal upgrades.
>>>>
>>>> Hopefully, at least some of these patches can be incorporated, and please
>>>> let me know if any changes are necessary.
>>>>
>>>> Thanks,
>>>> Chun-Yu
>>>
>>> Using global variables for caches like these causes a form of memory
>>> leak for use cases involving long-running processes that need to work
>>> with many different repositories (and perhaps multiple versions of those
>>> repositories).
>>>
>>> There are at least a couple of different strategies that we can use to
>>> avoid this form of memory leak:
>>>
>>> 1) Limit the scope of the caches so that they have some sort of garbage
>>> collection life cycle. For example, it would be natural for the depgraph
>>> class to have a local cache of use_reduce results, so that the cache can
>>> be garbage collected along with the depgraph.
>>>
>>> 2) Eliminate redundant calls. For example, redundant calls to catpkgsplit
>>> can be avoided by constructing more _pkg_str instances, since
>>> catpkgsplit is able to return early when its argument happens to be a
>>> _pkg_str instance.
>> 
>> I think the weak stuff from the standard library might also be helpful.
>> 
>> --
>> Best regards, 
>> Michał Górny
>
>Hmm, maybe weak global caches are an option?

It would probably be necessary to add hit/miss counters and compare results
before and after.

--
Best regards, 
Michał Górny
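The hit/miss instrumentation suggested above could look roughly like the following. This is a hypothetical sketch, not portage code; the decorator name and the plain dict keyed on positional arguments are invented for illustration:

```python
import functools

def counting_cache(func):
    """Memoize func and track cache hits/misses so results can be
    compared before and after enabling the cache."""
    cache = {}
    stats = {"hits": 0, "misses": 0}

    @functools.wraps(func)
    def wrapper(*args):
        if args in cache:
            stats["hits"] += 1
        else:
            stats["misses"] += 1
            cache[args] = func(*args)
        return cache[args]

    wrapper.stats = stats  # expose counters for inspection
    return wrapper

@counting_cache
def slow_square(x):
    return x * x

slow_square(3)
slow_square(3)
print(slow_square.stats)  # {'hits': 1, 'misses': 1}
```

Comparing `func(*args)` against the cached value for a sample of calls would additionally verify that the cache never changes results.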



Re: [gentoo-portage-dev] Add caching to a few commonly used functions

2020-06-27 Thread Zac Medico
On 6/27/20 8:12 PM, Michał Górny wrote:
> On June 28, 2020 3:00:00 AM UTC, Zac Medico wrote:
>> On 6/26/20 11:34 PM, Chun-Yu Shei wrote:
>>> Hi,
>>>
>>> I was recently interested in whether portage could be sped up, since
>>> dependency resolution can sometimes take a while on slower machines.
>>> After generating some flame graphs with cProfile and vmprof, I found 3
>>> functions which seem to be called extremely frequently with the same
>>> arguments: catpkgsplit, use_reduce, and match_from_list.  In the first
>>> two cases, it was simple to cache the results in dicts, while
>>> match_from_list was a bit trickier, since it seems to be a requirement
>>> that it return actual entries from the input "candidate_list".  I also
>>> ran into some test failures if I did the caching after the
>>> mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
>>> function, so the caching is only done up to just before that point.
>>>
>>> The catpkgsplit change seems to definitely be safe, and I'm pretty sure
>>> the use_reduce one is too, since anything that could possibly change the
>>> result is hashed.  I'm a bit less certain about the match_from_list one,
>>> although all tests are passing.
>>>
>>> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world"
>>> speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup.  "emerge
>>> -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec
>>> (2.5% improvement).  Since the upgrade case is far more common, this
>>> would really help in daily use, and it shaves about 30 seconds off
>>> the time you have to wait to get to the [Yes/No] prompt (from ~90s to
>>> 60s) on my old Sandy Bridge laptop when performing normal upgrades.
>>>
>>> Hopefully, at least some of these patches can be incorporated, and please
>>> let me know if any changes are necessary.
>>>
>>> Thanks,
>>> Chun-Yu
>>
>> Using global variables for caches like these causes a form of memory
>> leak for use cases involving long-running processes that need to work
>> with many different repositories (and perhaps multiple versions of those
>> repositories).
>>
>> There are at least a couple of different strategies that we can use to
>> avoid this form of memory leak:
>>
>> 1) Limit the scope of the caches so that they have some sort of garbage
>> collection life cycle. For example, it would be natural for the depgraph
>> class to have a local cache of use_reduce results, so that the cache can
>> be garbage collected along with the depgraph.
>>
>> 2) Eliminate redundant calls. For example, redundant calls to catpkgsplit
>> can be avoided by constructing more _pkg_str instances, since
>> catpkgsplit is able to return early when its argument happens to be a
>> _pkg_str instance.
> 
> I think the weak stuff from the standard library might also be helpful.
> 
> --
> Best regards, 
> Michał Górny
> 

Hmm, maybe weak global caches are an option?
-- 
Thanks,
Zac





Re: [gentoo-portage-dev] Add caching to a few commonly used functions

2020-06-27 Thread Michał Górny
On June 28, 2020 3:00:00 AM UTC, Zac Medico wrote:
>On 6/26/20 11:34 PM, Chun-Yu Shei wrote:
>> Hi,
>> 
>> I was recently interested in whether portage could be sped up, since
>> dependency resolution can sometimes take a while on slower machines.
>> After generating some flame graphs with cProfile and vmprof, I found 3
>> functions which seem to be called extremely frequently with the same
>> arguments: catpkgsplit, use_reduce, and match_from_list.  In the first
>> two cases, it was simple to cache the results in dicts, while
>> match_from_list was a bit trickier, since it seems to be a requirement
>> that it return actual entries from the input "candidate_list".  I also
>> ran into some test failures if I did the caching after the
>> mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
>> function, so the caching is only done up to just before that point.
>> 
>> The catpkgsplit change seems to definitely be safe, and I'm pretty sure
>> the use_reduce one is too, since anything that could possibly change the
>> result is hashed.  I'm a bit less certain about the match_from_list one,
>> although all tests are passing.
>> 
>> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world"
>> speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup.  "emerge
>> -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec
>> (2.5% improvement).  Since the upgrade case is far more common, this
>> would really help in daily use, and it shaves about 30 seconds off
>> the time you have to wait to get to the [Yes/No] prompt (from ~90s to
>> 60s) on my old Sandy Bridge laptop when performing normal upgrades.
>> 
>> Hopefully, at least some of these patches can be incorporated, and please
>> let me know if any changes are necessary.
>> 
>> Thanks,
>> Chun-Yu
>
>Using global variables for caches like these causes a form of memory
>leak for use cases involving long-running processes that need to work
>with many different repositories (and perhaps multiple versions of those
>repositories).
>
>There are at least a couple of different strategies that we can use to
>avoid this form of memory leak:
>
>1) Limit the scope of the caches so that they have some sort of garbage
>collection life cycle. For example, it would be natural for the depgraph
>class to have a local cache of use_reduce results, so that the cache can
>be garbage collected along with the depgraph.
>
>2) Eliminate redundant calls. For example, redundant calls to catpkgsplit
>can be avoided by constructing more _pkg_str instances, since
>catpkgsplit is able to return early when its argument happens to be a
>_pkg_str instance.

I think the weak stuff from the standard library might also be helpful.

--
Best regards, 
Michał Górny



Re: [gentoo-portage-dev] Add caching to a few commonly used functions

2020-06-27 Thread Zac Medico
On 6/26/20 11:34 PM, Chun-Yu Shei wrote:
> Hi,
> 
> I was recently interested in whether portage could be sped up, since
> dependency resolution can sometimes take a while on slower machines.
> After generating some flame graphs with cProfile and vmprof, I found 3
> functions which seem to be called extremely frequently with the same
> arguments: catpkgsplit, use_reduce, and match_from_list.  In the first
> two cases, it was simple to cache the results in dicts, while
> match_from_list was a bit trickier, since it seems to be a requirement
> that it return actual entries from the input "candidate_list".  I also
> ran into some test failures if I did the caching after the
> mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
> function, so the caching is only done up to just before that point.
> 
> The catpkgsplit change seems to definitely be safe, and I'm pretty sure
> the use_reduce one is too, since anything that could possibly change the
> result is hashed.  I'm a bit less certain about the match_from_list one,
> although all tests are passing.
> 
> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world"
> speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup.  "emerge
> -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec
> (2.5% improvement).  Since the upgrade case is far more common, this
> would really help in daily use, and it shaves about 30 seconds off
> the time you have to wait to get to the [Yes/No] prompt (from ~90s to
> 60s) on my old Sandy Bridge laptop when performing normal upgrades.
> 
> Hopefully, at least some of these patches can be incorporated, and please
> let me know if any changes are necessary.
> 
> Thanks,
> Chun-Yu

Using global variables for caches like these causes a form of memory
leak for use cases involving long-running processes that need to work
with many different repositories (and perhaps multiple versions of those
repositories).

There are at least a couple of different strategies that we can use to
avoid this form of memory leak:

1) Limit the scope of the caches so that they have some sort of garbage
collection life cycle. For example, it would be natural for the depgraph
class to have a local cache of use_reduce results, so that the cache can
be garbage collected along with the depgraph.

2) Eliminate redundant calls. For example, redundant calls to catpkgsplit
can be avoided by constructing more _pkg_str instances, since
catpkgsplit is able to return early when its argument happens to be a
_pkg_str instance.
-- 
Thanks,
Zac
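The early-return idea in (2) can be sketched in isolation: carry the parsed fields on a str subclass, so splitting an already-parsed value is free. The `PkgStr` class and `catpkgsplit_sketch` below are hypothetical simplifications of portage's `_pkg_str` and `catpkgsplit`, and the fallback parser deliberately ignores revisions and validation:

```python
class PkgStr(str):
    """A string that remembers its already-split
    (category, package, version, revision) fields."""
    def __new__(cls, cpv, split):
        self = str.__new__(cls, cpv)
        self.cpv_split = split
        return self

def catpkgsplit_sketch(mydata):
    # Early return: the split was computed when the PkgStr was built,
    # so repeated calls on the same object do no parsing at all.
    if isinstance(mydata, PkgStr):
        return mydata.cpv_split
    # Naive fallback for plain strings (illustrative only).
    cat, rest = mydata.split("/", 1)
    pkg, ver = rest.rsplit("-", 1)
    return (cat, pkg, ver, "r0")

p = PkgStr("dev-lang/python-3.8", ("dev-lang", "python", "3.8", "r0"))
print(catpkgsplit_sketch(p))  # ('dev-lang', 'python', '3.8', 'r0')
```

Constructing `PkgStr`-style objects at the source thus turns every downstream split into an attribute lookup.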





[gentoo-portage-dev] Add caching to a few commonly used functions

2020-06-27 Thread Chun-Yu Shei
Hi,

I was recently interested in whether portage could be sped up, since
dependency resolution can sometimes take a while on slower machines.
After generating some flame graphs with cProfile and vmprof, I found 3
functions which seem to be called extremely frequently with the same
arguments: catpkgsplit, use_reduce, and match_from_list.  In the first
two cases, it was simple to cache the results in dicts, while
match_from_list was a bit trickier, since it seems to be a requirement
that it return actual entries from the input "candidate_list".  I also
ran into some test failures if I did the caching after the
mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
function, so the caching is only done up to just before that point.

The catpkgsplit change seems to definitely be safe, and I'm pretty sure
the use_reduce one is too, since anything that could possibly change the
result is hashed.  I'm a bit less certain about the match_from_list one,
although all tests are passing.

With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world"
speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup.  "emerge
-ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec
(2.5% improvement).  Since the upgrade case is far more common, this
would really help in daily use, and it shaves about 30 seconds off
the time you have to wait to get to the [Yes/No] prompt (from ~90s to
60s) on my old Sandy Bridge laptop when performing normal upgrades.

Hopefully, at least some of these patches can be incorporated, and please
let me know if any changes are necessary.

Thanks,
Chun-Yu





Re: [gentoo-portage-dev] Add caching to a few commonly used functions

2020-06-27 Thread Kent Fredric
On Sat, 27 Jun 2020 at 19:35, Fabian Groffen wrote:
>
> Hi Chun-Yu,

> > arguments: catpkgsplit, use_reduce, and match_from_list.  In the first
> > two cases, it was simple to cache the results in dicts, while
> > match_from_list was a bit trickier, since it seems to be a requirement
> > that it return actual entries from the input "candidate_list".  I also
> > ran into some test failures if I did the caching after the
> > mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
> > function, so the caching is only done up to just before that point.

You may also want to investigate the version aspect parsing logic
where it converts versions into a data structure, partly because the
last time I tried profiling portage, every sample seemed to turn up in
there.

And I'd expect to see a lot of commonality in this.


# qlist -I --format "%{PV}" | wc -c
14678
# qlist -I --format "%{PV}" | sort -u | wc -c
8811

And given that this version-parsing path is exercised even for stuff *not*
installed, I suspect the real-world implications are worse.
# find /usr/portage/ -name "*.ebuild"  | sed
's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF
"%{PV}" | wc -l
32604
# find /usr/portage/ -name "*.ebuild"  | sed
's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF
"%{PVR}" | sort -u | wc -l
10362
katipo2 ~ # find /usr/portage/ -name "*.ebuild"  | sed
's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF
"%{PV}" | sort -u | wc -l
7515

Obviously this is very crude analysis, but you see there's room to
potentially no-op half of all version parses. Though the speed/memory
tradeoff may not be worth it.

Note that this is not just "parse the version on the ebuild", which
is fast, but my sampling seemed to indicate it was parsing the version
afresh for every version comparison, which means internally, it was
parsing the same version dozens of times over, which is much slower!




-- 
Kent

KENTNL - https://metacpan.org/author/KENTNL



Re: [gentoo-portage-dev] Add caching to a few commonly used functions

2020-06-27 Thread Chun-Yu Shei
Hi Fabian,

Just eyeballing htop's RES column while emerge is running, the max
value I see during "emerge -uDvpU --with-bdeps=y @world" increases
from 272 MB to 380 MB, so it looks to be around 110 MB of extra memory
usage on my system (with 1,094 total packages installed).

Chun-Yu

On Sat, Jun 27, 2020, 12:35 AM Fabian Groffen wrote:
>
> Hi Chun-Yu,
>
> On 26-06-2020 23:34:12 -0700, Chun-Yu Shei wrote:
> > Hi,
> >
> > I was recently interested in whether portage could be sped up, since
> > dependency resolution can sometimes take a while on slower machines.
> > After generating some flame graphs with cProfile and vmprof, I found 3
> > functions which seem to be called extremely frequently with the same
> > arguments: catpkgsplit, use_reduce, and match_from_list.  In the first
> > two cases, it was simple to cache the results in dicts, while
> > match_from_list was a bit trickier, since it seems to be a requirement
> > that it return actual entries from the input "candidate_list".  I also
> > ran into some test failures if I did the caching after the
> > mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
> > function, so the caching is only done up to just before that point.
> >
> > The catpkgsplit change seems to definitely be safe, and I'm pretty sure
> > the use_reduce one is too, since anything that could possibly change the
> > result is hashed.  I'm a bit less certain about the match_from_list one,
> > although all tests are passing.
> >
> > With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world"
> > speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup.  "emerge
> > -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec
> > (2.5% improvement).  Since the upgrade case is far more common, this
> > would really help in daily use, and it shaves about 30 seconds off
> > the time you have to wait to get to the [Yes/No] prompt (from ~90s to
> > 60s) on my old Sandy Bridge laptop when performing normal upgrades.
> >
> > Hopefully, at least some of these patches can be incorporated, and please
> > let me know if any changes are necessary.
>
> This sounds like a good job to me!  Do you have any idea what the added
> memory pressure for these changes is?
>
> Thanks,
> Fabian
> >
> > Thanks,
> > Chun-Yu
> >
> >
> >
>
> --
> Fabian Groffen
> Gentoo on a different level
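Eyeballing htop's RES column works, but the peak resident set size can also be read programmatically for a repeatable before/after number. A small sketch using only the standard library (note the unit caveat: Linux reports ru_maxrss in KiB, macOS in bytes):

```python
import resource

def peak_rss_mib():
    """Peak resident set size of this process in MiB, assuming the
    Linux convention that ru_maxrss is reported in KiB."""
    kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return kib / 1024

print(f"peak RSS: {peak_rss_mib():.1f} MiB")
```

Running the same emerge command with and without the patches and printing this at exit would quantify the roughly 110 MB delta reported above.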



Re: [gentoo-portage-dev] Add caching to a few commonly used functions

2020-06-27 Thread Fabian Groffen
Hi Chun-Yu,

On 26-06-2020 23:34:12 -0700, Chun-Yu Shei wrote:
> Hi,
> 
> I was recently interested in whether portage could be sped up, since
> dependency resolution can sometimes take a while on slower machines.
> After generating some flame graphs with cProfile and vmprof, I found 3
> functions which seem to be called extremely frequently with the same
> arguments: catpkgsplit, use_reduce, and match_from_list.  In the first
> two cases, it was simple to cache the results in dicts, while
> match_from_list was a bit trickier, since it seems to be a requirement
> that it return actual entries from the input "candidate_list".  I also
> ran into some test failures if I did the caching after the
> mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
> function, so the caching is only done up to just before that point.
> 
> The catpkgsplit change seems to definitely be safe, and I'm pretty sure
> the use_reduce one is too, since anything that could possibly change the
> result is hashed.  I'm a bit less certain about the match_from_list one,
> although all tests are passing.
> 
> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world"
> speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup.  "emerge
> -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec
> (2.5% improvement).  Since the upgrade case is far more common, this
> would really help in daily use, and it shaves about 30 seconds off
> the time you have to wait to get to the [Yes/No] prompt (from ~90s to
> 60s) on my old Sandy Bridge laptop when performing normal upgrades.
> 
> Hopefully, at least some of these patches can be incorporated, and please
> let me know if any changes are necessary.

This sounds like a good job to me!  Do you have any idea what the added
memory pressure for these changes is?

Thanks,
Fabian
> 
> Thanks,
> Chun-Yu
> 
> 
> 

-- 
Fabian Groffen
Gentoo on a different level

