Ross Patterson <m...@rpatterson.net> writes:

> Ross Patterson <m...@rpatterson.net> writes:
>
>> Marius Gedminas <mar...@pov.lt> writes:
>>
>>> On Sat, Jan 21, 2012 at 02:19:03AM -0800, Ross Patterson wrote:
>>>> I moved this patch to a branch of zc.buildout:
>>>>
>>>> svn+ssh://svn.zope.org/repos/main/zc.buildout/branches/env-cache

[...snip...]

> With this procedure, here are my timings with two different buildouts,
> one that is development focused with lots of different parts with
> slightly different to wildly different required dists, and one that is a
> production buildout with many identical parts. I did all this after
> completely clearing out my egg and download caches and running each
> buildout once first to re-download all eggs so that results aren't
> skewed by a large, out-of-date egg cache:

I re-timed everything, comparing zc.buildout 1.4.4, 1.5.2 and the
env-cache branch. All of these runs use the trunk version of
buildout.dumppickedversions, which includes the optimizations made
there. They are all against the production buildout from before, but
with a large egg cache.

+------------------+-----------+-----------+-----------+
|                  | 1.4.4     | 1.5.2     | env-cache |
+------------------+-----------+-----------+-----------+
| bin/buildout -N  | 1m28.394s | 0m34.441s | 0m20.468s |
+------------------+-----------+-----------+-----------+
| bin/buildout     | 1m51.173s | 0m57.524s | 0m42.124s |
+------------------+-----------+-----------+-----------+
| bin/buildout -vN | 2m25.618s | 1m18.324s | 1m4.896s  |
+------------------+-----------+-----------+-----------+
| bin/buildout -v  | 2m36.758s | 1m32.049s | 1m27.575s |
+------------------+-----------+-----------+-----------+

So 1.5 has a lot of improvements over 1.4, and these changes offer some
incremental improvement over that. The relative improvement is bigger
when running with "-N" (env-cache takes roughly 41% less time than 1.5.2
for "bin/buildout -N", versus roughly 27% less for plain
"bin/buildout"), which I think is the more important case. Whatever 1.5
does differently also obviates the _log_requirements optimization; it
only applies to 1.4, so I've reverted it on trunk.

Now that I understand what's going on with zc.buildout.easy_install and
pkg_resources, it seems like the right way to handle this is not to scan
all the paths involved in a working set for all possible dists,
including those not required for the working set.
Instead, it might be better to scan the paths only for dists we know we
need. Maybe we can write a subclass of pkg_resources.Environment that
only stores the paths in scan() and then scans those paths for specific
project names in __getitem__(). With the current caching, the profiling
data indicates very little time spent scanning, so there's only
incremental benefit to doing this in addition to the caching. It may be
a more correct solution to do this *instead* of the caching.

Also, with this approach, it may be possible to do global dist caching
which can be updated as things are changed by hooking into
zc.buildout.easy_install's develop(), install(), and build().

Thoughts?

Ross
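
P.S. To make the lazy-scanning idea a bit more concrete, here is a
minimal stand-alone sketch. It is not the code on the branch, and it
deliberately does not import pkg_resources: LazyEnvironment, _key() and
the file-name matching are hypothetical stand-ins. A real version would
presumably subclass pkg_resources.Environment and build Distribution
objects (for example via pkg_resources.find_distributions()) instead of
returning plain file paths.

```python
import os
import re


class LazyEnvironment:
    """Hypothetical stand-in for a pkg_resources.Environment subclass.

    scan() only records paths; the filesystem is consulted the first
    time a given project name is looked up.
    """

    def __init__(self, search_path=()):
        self._paths = []
        self._cache = {}  # normalised project name -> list of egg paths
        self.scan(search_path)

    def scan(self, search_path=()):
        # Remember the paths but do not look at the filesystem yet.
        for path in search_path:
            if path not in self._paths:
                self._paths.append(path)
        # Newly added paths invalidate earlier lookups.
        self._cache.clear()

    @staticmethod
    def _key(name):
        # Assumption: treat runs of '-', '_' and '.' as equivalent so
        # that "zc.buildout" and "zc_buildout" name the same project.
        return re.sub(r"[-_.]+", "_", name).lower()

    def __getitem__(self, project_name):
        key = self._key(project_name)
        if key not in self._cache:
            found = []
            for path in self._paths:
                if not os.path.isdir(path):
                    continue
                for entry in sorted(os.listdir(path)):
                    base, ext = os.path.splitext(entry)
                    if ext != ".egg":
                        continue  # simplification: only look at eggs
                    # Egg file names start with "<project>-<version>".
                    if self._key(base.split("-", 1)[0]) == key:
                        found.append(os.path.join(path, entry))
            self._cache[key] = found
        return self._cache[key]
```

With something like this, LazyEnvironment(["eggs"])["zc.buildout"]
would only build entries for that one project. It still lists each
directory once per project, so the gain would come from not creating
entries for every dist on the path, which is an assumption I have not
profiled.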