On 21 August 2016 at 04:00, Jim Fulton <j...@jimfulton.info> wrote:

> These really weren't goals of setuptools, which tried to fit into site-package-based installs and ironically resorted to unsavory techniques to wedge eggs in using elaborate .pth files.
Right, and pip took that approach further by making the site-packages-friendly approach the *default*, rather than a selectable option. This has had the effect of making pip not only useful for component management in its own right, but also viable as a tool for assembling *downstream* packages from Python source projects.

> With buildout, I chose to use eggs differently. I simply generate scripts with dependencies explicitly listed in the Python path. This is very easy to reason about. It makes it easy to see what's installed by looking at the generated path. This is similar to the way Java uses class paths to assemble jar files, which is fitting given that the design of eggs took at least some inspiration from jar files.
>
> I'm not a fan of Java, but I think class paths + jar files was something it got right. Each program has its own distinct environment. If you need to add/remove/update a dependency, just change the path. Want to share installed components between programs? It's easy because only one installed component needs to be stored and different programs point to it via their paths.

Agreed, but similar to both conda and JARs themselves, this improved isolation made buildout *less* useful to folks working on distro packages that actually *wanted* to be installing their Python components into the system Python installation.

> Pip's approach, OTOH, leaves me skeptical. When installing a package, it has to unpack it and install its files into a shared directory structure where conflicts are likely. How does it handle these conflicts? IDK. I should, but I don't.

It doesn't, since we don't have full file manifests in our metadata - if you inadvertently install both python-openid and python3-openid into the same virtualenv, they'll trample over each other's files.

> I have the impression that uninstalling things can be problematic, but maybe that's been fixed.
Uninstallation is fine, as we *do* have a full file manifest after a component has been installed.

> At best, this is a lot of complexity to maintain; at worst, uninstalls/reinstalls leave things behind that make buildout's goal of repeatability impossible to achieve.
>
> For isolation, pip relies on virtualenv. This has always struck me as an extremely heavy-handed approach. I'm quite surprised that the Python community settled on it. But whatever.

The pay-off for the pip model comes in the fact that using venv is *optional* in a way that isn't generally true for other more specifically app-focused systems like conda and buildout. That doesn't make conda and buildout wrong - it means they cut out a particular rare-in-number-but-large-in-influence use case (Linux-based operating system development) in order to better focus on a more targeted set of use cases (data analysis for conda, network service development for buildout).

> A few packages (e.g. NumPy) really depend on the shape of the environment they're installed into, so they can't be installed with buildout, but can be installed with pip+virtualenv.

The distinction also goes the other way - pip can be used to add capabilities to an existing Python installation without fundamentally changing the architecture of that installation, including how it decides to handle (or not handle, as the case may be) separating different applications from each other.
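As an aside for readers unfamiliar with the manifest mentioned above: each pip-installed distribution records its files in a RECORD file inside its .dist-info directory (PEP 376 style - CSV rows of path, hash, and size), and that list is what makes clean uninstalls possible. A minimal sketch of parsing one, using synthetic file names rather than a real installation:

```python
import csv
import io

# A synthetic RECORD in the PEP 376 CSV format: each row lists an
# installed file's path, its hash, and its size in bytes. The package
# and file names here are made up for illustration.
RECORD = """\
openid/__init__.py,sha256=aGVsbG8,1042
openid/consumer.py,sha256=d29ybGQ,22741
python_openid-2.2.5.dist-info/RECORD,,
"""

def installed_files(record_text):
    """Return the list of file paths recorded for a distribution."""
    return [row[0] for row in csv.reader(io.StringIO(record_text)) if row]

files = installed_files(RECORD)
print(files)
# An uninstaller can walk this list and delete each recorded path,
# which is roughly what "pip uninstall" does.
```

Note there is no cross-distribution index of these manifests, which is why two projects shipping the same module path can still overwrite each other at install time.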
The key thing that pip's downstream platforms tend to bring to the table is different answers to that application isolation problem, while pip itself just handles dependency management and build system invocation:

- virtualenv mainly just manipulates sys.path to amend where site-packages is found
- Linux distros offer their system package management (including full preinstall file manifests and associated conflict detection) as well as chroots and Linux containers (most famously, Docker)
- *nix systems have also long offered the "modules" environment management utility (especially popular in HPC)
- buildout has tailored per-application sys.path definitions
- conda has its own environment management tooling

> The buildout developers have discussed options for the future. We know there's a reckoning coming, but so far, thankfully, we've been able to put it off, but we don't want to be a burden on the rest of the Python community. (Seriously, thank you for not breaking us. :) )

I don't think buildout's a burden. While there's no logical reason for environment isolation to be tightly coupled to dependency management and build system invocation, when designing buildout you didn't have a choice - setuptools was realistically the only game in town for those two pieces, so it made sense to structure buildout around that, and then diverge only on the isolation management side of things.

> We've debated just invoking virtualenv and pip to assemble Python applications. A model we've been discussing is to let buildout recipes do this. No change is needed to buildout. There's at least one and probably multiple recipes that do this, although I haven't used them myself. In this model, a buildout could install different virtualenvs for different programs, allowing dependencies to be kept distinct. I still worry about the integrity of these virtualenvs over time as dependencies are added, removed, and updated.
As a competing approach to isolation management, I doubt it would make much sense for buildout to adopt virtualenv - it seems more logical to me to keep your current isolation model (which has a lot to recommend it), and instead look just to swapping in pip to replace setuptools and easy_install for the dependency management and build system invocation pieces.

> If I could have my way, the path of distinct package directories approach would still be an option for buildout, as I think it's superior.

I don't see any reason for buildout to drop the per-application sys.path customisation approach in favour of venvs - there's nothing wrong with it, and it avoids several of the problems that can arise with a proliferation of venvs.

> I'm hopeful that it will be possible to use wheels this way, assuming that eggs are retired.

Wheels can already be used this way - the "officially not supported" aspect is using them as sys.path entries without unpacking them first. However, eggs also won't be retired until there's a comparable documented format that officially supports usage with zipimport (as opposed to wheel's status, where it works if you know how to make it work, but there aren't any formal guarantees that what works today in that respect will continue working tomorrow).

> I would also prefer that there be one library, or set of complementary libraries, to find, download and install packages. I normally like competition, but this is such a boring and difficult domain that I don't really see there being interest in developing multiple solutions. Honestly, I'd be nervous if, in the long run, buildout and pip installed things differently, especially given security considerations.
For several of the core pieces shared between pip and Warehouse, Donald has already broken those out into https://packaging.pypa.io

> In the long run, I suspect it would be best for some buildout developers to offer pip PRs that abstract functionality into libraries that buildout could call (and that pip called internally), although it sounds like this may already be happening without our help.

Only for the pip/Warehouse common components, since Donald is driving the extraction of common requirements for those two projects.

However, there's another important activity along similar lines that doesn't have anyone (that I'm aware of) actively pursuing it: pulling more of the pieces of pip's PyPI client and local installation management behaviour out into a more readily re-usable form. While Vinay Sajip's distlib (https://pypi.python.org/pypi/distlib) already covers a lot of that, what's currently missing is folks looking at the common capabilities of pip and distlib to ensure that they're actually behaving the same way, with robust test suites to ensure they're also following the relevant standards (or else that we update the relevant standards to match what people are actually doing).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig