Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
On 7/6/20 11:03 AM, Zac Medico wrote:
> On 7/6/20 10:30 AM, Chun-Yu Shei wrote:
>> I finally got a chance to try Sid's lru_cache suggestion, and the
>> results were really good. Simply adding it on catpkgsplit and moving
>> the body of use_reduce into a separate function (that accepts tuples
>> instead of unhashable lists/sets) and decorating it with lru_cache
>> gets a similar 40% overall speedup for the upgrade case I tested. It
>> seems like even a relatively small cache size (1000 entries) gives
>> quite a speedup, even though in the use_reduce case, the cache size
>> eventually reaches almost 20,000 entries if no limit is set. With
>> these two changes, adding caching to match_from_list didn't seem to
>> make much/any difference.
>
> That's great!
>
>> The catch is that lru_cache is only available in Python 3.2 and
>> later, so would it make sense to add a dummy lru_cache implementation
>> for Python < 3.2 that does nothing? There is also a
>> backports-functools-lru-cache package that's already available in the
>> Portage tree, but that would add an additional external dependency.
>>
>> I agree that refactoring could yield an even bigger gain, but
>> hopefully this can be implemented as an interim solution to speed up
>> the common emerge case of resolving upgrades. I'm happy to submit new
>> patches for this, if someone can suggest how to best handle the Python
>> < 3.2 case. :)
>>
>> Thanks,
>> Chun-Yu
>
> We can safely drop support for < Python 3.6 at this point. Alternatively
> we could add a compatibility shim for Python 2.7 that does not perform
> any caching, but I really don't think it's worth the trouble to support
> it any longer.

We've dropped Python 2.7, so now the minimum version is Python 3.6.
--
Thanks,
Zac
Re: [gentoo-portage-dev] [PATCH] travis.yml: drop python 2.7 (bug 731114)
On 7/6/20 12:07 PM, Michał Górny wrote:
> On Mon, 2020-07-06 at 11:42 -0700, Zac Medico wrote:
>> It should be pretty safe to drop support for python2.7 at this point.
>>
>
> We should probably also change the trove classifier to ... Python :: 3 :: Only
>

Updated the classifier, and merged:

https://gitweb.gentoo.org/proj/portage.git/commit/?id=e59ec1924d6db957a01c828ce294a7675be5b27c
--
Thanks,
Zac
Re: [gentoo-portage-dev] [PATCH] travis.yml: drop python 2.7 (bug 731114)
On Mon, 2020-07-06 at 11:42 -0700, Zac Medico wrote:
> It should be pretty safe to drop support for python2.7 at this point.
>

We should probably also change the trove classifier to ... Python :: 3 :: Only

--
Best regards,
Michał Górny
Re: [gentoo-portage-dev] [PATCH] travis.yml: drop python 2.7 (bug 731114)
On Mon, 6 Jul 2020 11:42:06 -0700 Zac Medico wrote:
> It should be pretty safe to drop support for python2.7 at this point.
>
> Bug: https://bugs.gentoo.org/731114
> Signed-off-by: Zac Medico
> ---
>  .travis.yml | 1 -
>  tox.ini     | 6 ++----
>  2 files changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/.travis.yml b/.travis.yml
> index 2132c8c87..d2935fdab 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -1,7 +1,6 @@
>  dist: bionic
>  language: python
>  python:
> -- 2.7
>  - 3.6
>  - 3.7
>  - 3.8
> diff --git a/tox.ini b/tox.ini
> index 79b5b45cb..050a2c455 100644
> --- a/tox.ini
> +++ b/tox.ini
> @@ -1,14 +1,12 @@
>  [tox]
> -envlist = py27,py36,py37,py38,py39,pypy3
> +envlist = py36,py37,py38,py39,pypy3
>  skipsdist = True
>
>  [testenv]
>  deps =
>      pygost
>      pyyaml
> -    py27,py36,py37,py38,py39,pypy3: lxml!=4.2.0
> -    py27: pyblake2
> -    py27: pysha3
> +    py36,py37,py38,py39,pypy3: lxml!=4.2.0
>  setenv =
>      PYTHONPATH={toxinidir}/lib
>  commands =

Go for it!
[gentoo-portage-dev] [PATCH] travis.yml: drop python 2.7 (bug 731114)
It should be pretty safe to drop support for python2.7 at this point.

Bug: https://bugs.gentoo.org/731114
Signed-off-by: Zac Medico
---
 .travis.yml | 1 -
 tox.ini     | 6 ++----
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 2132c8c87..d2935fdab 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -1,7 +1,6 @@
 dist: bionic
 language: python
 python:
-- 2.7
 - 3.6
 - 3.7
 - 3.8
diff --git a/tox.ini b/tox.ini
index 79b5b45cb..050a2c455 100644
--- a/tox.ini
+++ b/tox.ini
@@ -1,14 +1,12 @@
 [tox]
-envlist = py27,py36,py37,py38,py39,pypy3
+envlist = py36,py37,py38,py39,pypy3
 skipsdist = True

 [testenv]
 deps =
     pygost
     pyyaml
-    py27,py36,py37,py38,py39,pypy3: lxml!=4.2.0
-    py27: pyblake2
-    py27: pysha3
+    py36,py37,py38,py39,pypy3: lxml!=4.2.0
 setenv =
     PYTHONPATH={toxinidir}/lib
 commands =
--
2.25.3
Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
On 7/6/20 10:30 AM, Chun-Yu Shei wrote:
> I finally got a chance to try Sid's lru_cache suggestion, and the
> results were really good. Simply adding it on catpkgsplit and moving
> the body of use_reduce into a separate function (that accepts tuples
> instead of unhashable lists/sets) and decorating it with lru_cache
> gets a similar 40% overall speedup for the upgrade case I tested. It
> seems like even a relatively small cache size (1000 entries) gives
> quite a speedup, even though in the use_reduce case, the cache size
> eventually reaches almost 20,000 entries if no limit is set. With
> these two changes, adding caching to match_from_list didn't seem to
> make much/any difference.

That's great!

> The catch is that lru_cache is only available in Python 3.2 and
> later, so would it make sense to add a dummy lru_cache implementation
> for Python < 3.2 that does nothing? There is also a
> backports-functools-lru-cache package that's already available in the
> Portage tree, but that would add an additional external dependency.
>
> I agree that refactoring could yield an even bigger gain, but
> hopefully this can be implemented as an interim solution to speed up
> the common emerge case of resolving upgrades. I'm happy to submit new
> patches for this, if someone can suggest how to best handle the Python
> < 3.2 case. :)
>
> Thanks,
> Chun-Yu

We can safely drop support for < Python 3.6 at this point. Alternatively
we could add a compatibility shim for Python 2.7 that does not perform
any caching, but I really don't think it's worth the trouble to support
it any longer.
--
Thanks,
Zac
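
For readers wondering what the "compatibility shim that does not perform any caching" would have looked like, here is a minimal sketch. It is not code from Portage, and the reply above concludes it is not worth carrying; the function `normalize` is a hypothetical stand-in used only to show that decorated code keeps working whether or not the real decorator is available.

```python
try:
    # Available in the standard library since Python 3.2.
    from functools import lru_cache
except ImportError:
    # Fallback for older interpreters: same call signature, no caching.
    def lru_cache(maxsize=128, typed=False):
        def decorator(func):
            # Return the function unchanged, so every call is recomputed.
            return func
        return decorator


@lru_cache(maxsize=1000)
def normalize(name):
    # Hypothetical example function, not part of Portage.
    return name.strip().lower()


result = normalize("  Dev-Lang/Python ")
```

With the fallback in place, callers never need to know which branch was taken; the only difference is speed.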
Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
I finally got a chance to try Sid's lru_cache suggestion, and the
results were really good. Simply adding it on catpkgsplit and moving
the body of use_reduce into a separate function (that accepts tuples
instead of unhashable lists/sets) and decorating it with lru_cache
gets a similar 40% overall speedup for the upgrade case I tested. It
seems like even a relatively small cache size (1000 entries) gives
quite a speedup, even though in the use_reduce case, the cache size
eventually reaches almost 20,000 entries if no limit is set. With
these two changes, adding caching to match_from_list didn't seem to
make much/any difference.

The catch is that lru_cache is only available in Python 3.2 and
later, so would it make sense to add a dummy lru_cache implementation
for Python < 3.2 that does nothing? There is also a
backports-functools-lru-cache package that's already available in the
Portage tree, but that would add an additional external dependency.

I agree that refactoring could yield an even bigger gain, but
hopefully this can be implemented as an interim solution to speed up
the common emerge case of resolving upgrades. I'm happy to submit new
patches for this, if someone can suggest how to best handle the Python
< 3.2 case. :)

Thanks,
Chun-Yu

On Mon, Jul 6, 2020 at 9:10 AM Francesco Riosa wrote:
>
> On 06/07/20 17:50, Michael 'veremitz' Everitt wrote:
> > On 06/07/20 16:26, Francesco Riosa wrote:
> >> On 29/06/20 03:58, Sid Spry wrote:
> >>> There are libraries that provide decorators, etc, for caching and
> >>> memoization.
> >>> Have you evaluated any of those? One is available in the standard library:
> >>> https://docs.python.org/dev/library/functools.html#functools.lru_cache
> >>>
> >>> I comment as this would increase code clarity.
> >>>
> >> I think portage developers try hard to avoid external dependencies
> >> I hope hard they do
> >>
> >>
> > I think the key word here is 'external' - anything which is part of the
> > python standard library is game for inclusion in portage, and has/does
> > provide much needed optimisation. Many of the issues in portage are
> > so-called "solved problems" in computing terms, and as such, we should take
> > advantage of these to improve performance at every available opportunity.
> > Of course, there are presently only one, two or three key developers able
> > to make/test these changes (indeed at scale) so progress is often slower
> > than desirable in current circumstances...
> >
> > [sent direct due to posting restrictions...]
> yes I've replied too fast and didn't notice Sid was referring to
> _standard_ libraries (not even recent additions)
>
> sorry for the noise
>
> - Francesco
>
>
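
The "separate function that accepts tuples instead of unhashable lists/sets" pattern mentioned in this message can be sketched roughly as follows. This is an illustration only, not the actual patch: `filter_deps`, `_filter_deps_cached`, and the `(flag, atom)` data layout are hypothetical and far simpler than Portage's real use_reduce. The point is that lru_cache needs hashable arguments, so a thin wrapper converts them before calling the cached core.

```python
from functools import lru_cache


@lru_cache(maxsize=1000)
def _filter_deps_cached(deps, enabled_flags):
    # Cached core: deps is a tuple of (flag, atom) pairs and
    # enabled_flags is a frozenset, so both arguments are hashable.
    return tuple(
        atom for flag, atom in deps
        if flag is None or flag in enabled_flags
    )


def filter_deps(deps, enabled_flags):
    # Public wrapper: accepts a list and a set, converts them to
    # hashable types, and returns a fresh list so callers cannot
    # mutate the cached result.
    return list(_filter_deps_cached(tuple(deps), frozenset(enabled_flags)))


deps = [
    (None, "dev-lang/python"),    # unconditional dependency
    ("ssl", "dev-libs/openssl"),  # only if the "ssl" flag is enabled
    ("gtk", "x11-libs/gtk+"),     # only if the "gtk" flag is enabled
]
first = filter_deps(deps, {"ssl"})
second = filter_deps(deps, {"ssl"})  # served from the cache
info = _filter_deps_cached.cache_info()
```

Returning a tuple from the cached function and copying it into a list in the wrapper is a deliberate choice in this sketch: a cached mutable object would be shared between all callers.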
Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
On 06/07/20 17:50, Michael 'veremitz' Everitt wrote:
> On 06/07/20 16:26, Francesco Riosa wrote:
>> On 29/06/20 03:58, Sid Spry wrote:
>>> There are libraries that provide decorators, etc, for caching and
>>> memoization.
>>> Have you evaluated any of those? One is available in the standard library:
>>> https://docs.python.org/dev/library/functools.html#functools.lru_cache
>>>
>>> I comment as this would increase code clarity.
>>>
>> I think portage developers try hard to avoid external dependencies
>> I hope hard they do
>>
>>
> I think the key word here is 'external' - anything which is part of the
> python standard library is game for inclusion in portage, and has/does
> provide much needed optimisation. Many of the issues in portage are
> so-called "solved problems" in computing terms, and as such, we should take
> advantage of these to improve performance at every available opportunity.
> Of course, there are presently only one, two or three key developers able
> to make/test these changes (indeed at scale) so progress is often slower
> than desirable in current circumstances...
>
> [sent direct due to posting restrictions...]

yes I've replied too fast and didn't notice Sid was referring to
_standard_ libraries (not even recent additions)

sorry for the noise

- Francesco
Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
On 29/06/20 03:58, Sid Spry wrote:
> There are libraries that provide decorators, etc, for caching and
> memoization.
> Have you evaluated any of those? One is available in the standard library:
> https://docs.python.org/dev/library/functools.html#functools.lru_cache
>
> I comment as this would increase code clarity.
>

I think portage developers try hard to avoid external dependencies
I hope hard they do
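
To illustrate the code-clarity point in Sid's quoted comment, the sketch below puts a hand-rolled dictionary cache next to the standard-library decorator. The parser `_split` is a hypothetical simplification and does not reproduce the rules of Portage's real catpkgsplit (it ignores revisions, for instance); only `functools.lru_cache` is a real API here.

```python
from functools import lru_cache


def _split(cpv):
    # Hypothetical parser: "cat/name-1.0" -> ("cat", "name", "1.0").
    category, _, rest = cpv.partition("/")
    name, _, version = rest.rpartition("-")
    return category, name, version


# Hand-rolled memoization: the cache and its lookup logic are spelled out.
_manual_cache = {}


def split_manual(cpv):
    try:
        return _manual_cache[cpv]
    except KeyError:
        result = _manual_cache[cpv] = _split(cpv)
        return result


# The same behaviour with the decorator, plus a bounded size and
# LRU eviction that the dictionary version does not have.
@lru_cache(maxsize=1000)
def split_cached(cpv):
    return _split(cpv)


manual = split_manual("dev-lang/python-3.8.3")
cached = split_cached("dev-lang/python-3.8.3")
```

Both variants return the same tuple; the decorator version simply moves the bookkeeping out of the function body.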