On Oct 22, 2009, at 11:08 PM, Kevin Teague wrote:


On Oct 22, 2009, at 10:43 AM, Tres Seaver wrote:


Martin Aspeli wrote:
Hi,

Is there a way (apart from putting buildout in a virtualenv with --no-site-packages) to tell buildout *not* to put site-packages as the first line in the
mangled sys.path when it generates scripts?

We have people doing horrid things to their global python, and we need the
buildout to be safe and isolated in these environments.

Using a --no-site-packages virtualenv to drive the buildout is a pretty lightweight solution, and easier than the old standby of compiling your
own Python to get isolation from the global one, which is still highly
recommended: I build my own Python, and then use a separate virtualenv
for each project.


The idea behind Gary's branch

To be clear, *an* idea. You can also just make a "don't give me what is in site-packages" gesture. (When you do that, in the current branch, the generated scripts still have the complexities you describe below, though.)

(http://svn.zope.org/zc.buildout/branches/gary-4-include-site-packages) is that, unlike the --no-site-packages option of virtualenv, which is an all-or-nothing proposition, you would be able to include site-package locations in Buildout's script generation. Care would be taken, when distributions are selected from a site-package location, that the site-package locations included on sys.path don't overshadow any other paths pointing explicitly to already picked versions of distributions. E.g., if I was using Apple's System Python on Leopard (10.5), then site-packages includes zope.interface 3.3.0 and bdist_mpkg 0.4.3. If I wanted to pick 'zope.interface == 3.3.0' and 'bdist_mpkg == 0.4.4', then currently Buildout could generate a path modification that looks like:

sys.path[0:0] = [
  '/System/Library/Frameworks/Python.framework/Versions/2.5/Extras/lib/python',
  '/Users/kteague/buildouts/shared/eggs/bdist_mpkg-0.4.4-py2.5.egg',
  ]

Where that System path contains bdist_mpkg 0.4.3. Whether the site-package location is put before or after version-specific paths currently depends on the ordering in the install_requires field (so you get the correct versions importable if the distributions picked from site-packages are listed after the versions picked from outside site-packages!) - obviously this is just a side-effect of the current path manipulation implementation.

Not exactly. I was going to go for that, but it was too hard/insane. (Do I need to update some docs on the branch?)

If you use this feature, then eggs from site-packages can be inserted cleanly along with other eggs. They can be chosen individually, without masking other eggs. Site-packages-like directories themselves--the directories that are not eggs, but collections of standard directory packages--always go at the end of the sys.path. Otherwise their contents might mask the eggs you chose.
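As a rough illustration of that ordering rule (a sketch, not the branch's actual code), chosen eggs can be kept ahead of site-packages-like directories like this. The helper name `order_paths` and the test on the `.egg` suffix are assumptions made for the example.

```python
def order_paths(paths):
    """Return paths with individual eggs first and plain directories last.

    Hypothetical helper: any entry ending in '.egg' is treated as an egg,
    everything else as a site-packages-like directory.
    """
    def is_egg(path):
        return path.rstrip('/').endswith('.egg')

    eggs = [p for p in paths if is_egg(p)]
    dirs = [p for p in paths if not is_egg(p)]
    # Directories go last so their contents cannot mask a chosen egg.
    return eggs + dirs


paths = order_paths([
    '/System/Library/Frameworks/Python.framework/Versions/2.5/Extras/lib/python',
    '/Users/kteague/buildouts/shared/eggs/bdist_mpkg-0.4.4-py2.5.egg',
])
```

With this ordering, the bdist_mpkg 0.4.4 egg is found before the 0.4.3 copy that lives in the System directory.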

What we actually ended up using ourselves (Launchpad) is "don't use any eggs from site-packages, but let site-packages through at the end so we can get some of the non-egg things from it that our system is providing, like Postgres-Python bindings."

One would assume that making this change is fairly easy. Just do a diff between normal sys.path and the site-package free sys.path when Python is launched with the -S flag. Which Gary's code does, but the script generation in Gary's branch right now also accounts for the fact that *.pth files have been processed, and that you are allowed to have import statements executed when *.pth files are processed, so he is generating scripts which also clean sys.modules, and then re-add site-packages locations with site.addsitedir(location) so that .pth files are properly re-processed. Which is pretty fancy, and probably "Does the Right Thing (TM)", but also greatly clutters up the generated scripts.
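The "diff" idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the branch: the function names `get_sys_path` and `site_only_paths` are made up for the example, and it simply compares the sys.path of a fresh interpreter started with and without -S.

```python
import subprocess
import sys


def get_sys_path(*flags):
    """Return sys.path as seen by a fresh interpreter started with flags."""
    cmd = [sys.executable] + list(flags) + [
        '-c', 'import sys; print("\\n".join(sys.path))']
    out = subprocess.check_output(cmd, universal_newlines=True)
    # Drop the empty entry that -c puts at the front of sys.path.
    return [line for line in out.splitlines() if line]


def site_only_paths():
    """Entries that appear on sys.path only when site.py is processed."""
    clean = set(get_sys_path('-S'))
    return [p for p in get_sys_path() if p not in clean]
```

Calling `site_only_paths()` gives the locations that site.py (and any .pth files it processed) added, which are the ones a generated script would need to place with care.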

Mostly right, and granted that the scripts are bigger and more annoying than they are in trunk.

FWIW, the "fancy" bits are not primarily because .pth files might import. It's more because the setuptools approach to creating namespace packages in site-packages--that is, the approach that OS distributions typically use--creates fake modules for the namespace packages. These mask any sys.path eggs in the same namespace packages, at least as of c9. We have to clean the fake modules out, set up the sys.path, import pkg_resources because that magically does the right thing for any eggs on the sys.path, and *then* process .pth files.
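The sequence described above might be sketched like this. It is a simplified illustration, not what the branch generates: the names `drop_modules` and `rebuild_path` are hypothetical, and the caller is assumed to already know which namespace packages have fake modules.

```python
import site
import sys


def drop_modules(names, modules=None):
    """Remove the named modules, and their submodules, from a module table."""
    modules = sys.modules if modules is None else modules
    for key in list(modules):
        if any(key == n or key.startswith(n + '.') for n in names):
            del modules[key]


def rebuild_path(egg_paths, site_dirs, namespace_packages):
    # 1. Clear the fake namespace modules so they cannot mask the eggs.
    drop_modules(namespace_packages)
    # 2. Put the chosen eggs at the front of sys.path.
    sys.path[0:0] = list(egg_paths)
    # 3. Importing pkg_resources sets up namespace packages for eggs on sys.path.
    try:
        import pkg_resources  # noqa: F401
    except ImportError:
        pass  # setuptools is not available; nothing to fix up
    # 4. addsitedir appends each directory (if not already on sys.path)
    #    and processes its .pth files.
    for directory in site_dirs:
        site.addsitedir(directory)
```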

(I hope that PEP 382 is accepted and helps.)

I quite like having script generation generate scripts which are still reasonably compact (I often open generated scripts to see what Buildout is doing, or sometimes edit them to hand-pick a different egg if I want to quickly try out a different working set)

Granted.

and I also wonder how much overhead this additional processing adds (I guess this depends upon how much you have in site-packages).

Any overhead is lost in the cost of importing pkg_resources.

Launchpad has a whole bunch of dependencies (~170 eggs last I checked). It's trivial to generate both a ``PYTHONPATH= [...dependencies...] python`` and a faux Python interpreter generated by buildout that does the tricks that you describe. To make the PYTHONPATH approach work with the namespace package problem I described above, you have to hack site.py to import pkg_resources before it processes the .pth files.

I compared the two approaches with Launchpad's ~170 dependencies using ``time ${INTERPRETER_CHOICE} -c ''``. They were equivalent in my tests. (FWIW, they were both about 20 times slower than ``time python -c ''``).
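For anyone who wants to repeat that kind of measurement without the shell's ``time``, a minimal sketch follows. The function name `startup_seconds` is an assumption for the example, and the command list would need to be replaced with whichever interpreter or generated script is being compared.

```python
import subprocess
import sys
import time


def startup_seconds(cmd, runs=5):
    """Return the best wall-clock time, in seconds, over several runs of cmd."""
    best = None
    for _ in range(runs):
        start = time.time()
        subprocess.check_call(cmd)
        elapsed = time.time() - start
        best = elapsed if best is None else min(best, elapsed)
    return best


# Baseline: a plain interpreter that runs an empty program.
baseline = startup_seconds([sys.executable, '-c', ''])
```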

So perhaps if there was some option to still generate scripts using the existing style of script generation - maybe a "i-keep-my-site-packages-clean=true" option ... i dunno, perhaps the other way 'work-around-site-package-madness-in-script-generation=true' ... or just merge Buildout and VirtualEnv into one monolithic project so that you don't need to install two tools just to be able to use Buildout with a dirty Python! (rawr!)

I can understand the desire to make it possible to have a simpler script if you want to promise that you are going to have a clean site-packages. I'm not super-excited to add this feature and the related tests, but if that made it possible for my work to not be consigned to a branch forever, I suppose I'd sign up. Jim will be the arbiter there.

And, as usual and of course, there are other approaches possible than the one I chose.

Anyways, for those distributions which are tough to install, I think some people will find this branch quite handy in that they can apt-get the tough-to-install distributions, and then safely include those distributions in working sets composed by Buildout.

Seems to be working for us.

Thanks for looking at the branch, and for writing about it.

Gary
_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
http://mail.python.org/mailman/listinfo/distutils-sig
