On Sat, 27 Jun 2020 at 19:35, Fabian Groffen <grob...@gentoo.org> wrote:
>
> Hi Chun-Yu,

> > arguments: catpkgsplit, use_reduce, and match_from_list.  In the first
> > two cases, it was simple to cache the results in dicts, while
> > match_from_list was a bit trickier, since it seems to be a requirement
> > that it return actual entries from the input "candidate_list".  I also
> > ran into some test failures if I did the caching after the
> > mydep.unevaluated_atom.use and mydep.repo checks towards the end of the
> > function, so the caching is only done up to just before that point.

You may also want to investigate the version aspect parsing logic
where it converts versions into a data structure, partly because the
last time I tried profiling portage, every sample seemed to turn up in
there.

And I'd expect to see a lot of commonality in this.


# qlist -I --format "%{PV}" | wc -c
14678
# qlist -I --format "%{PV}" | sort -u | wc -c
8811

And given this version-parsing path is even handled for stuff *not*
installed, I suspect the real-world implications are worse

# find /usr/portage/ -name "*.ebuild"  | sed
's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF
"%{PV}" | wc -l
32604
# find /usr/portage/ -name "*.ebuild"  | sed
's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF
"%{PVR}" | sort -u | wc -l
10362
katipo2 ~ # find /usr/portage/ -name "*.ebuild"  | sed
's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF
"%{PV}" | sort -u | wc -l
7515

Obviously this is very crude analysis, but you see there's room to
potentially no-op half of all version parses. Though the speed/memory
tradeoff may not be worth it.

Note, that this is not just "parse the version on the ebuild", which
is fast, but my sampling seemed to indicate it was parsing the version
afresh for every version comparison, which means internally, it was
parsing the same version dozens of times over, which is much slower!




-- 
Kent

KENTNL - https://metacpan.org/author/KENTNL

Reply via email to