Buildout was designed to automate project development and deployment tasks.
This encompasses assembly of software packages (Python, JavaScript, etc.)
and generation of support artifacts such as configuration files and scripts.

I never set out to make buildout a Python package installer.

I adopted setuptools because it enabled most of what I needed in terms of
fetching and assembling Python packages.  Adopting eggs was a by-product of
adopting setuptools.

Some basic goals of buildout:


   - Isolation from system-provided Python installations.  This was a
   result of problems experienced when deploying applications for customers
   whose Python environments were different from ours (Zope Corporation's).

   (Thankfully, Docker mitigates these issues quite a bit.)


   - Repeatability -- Given the same buildout configuration file, running
   buildouts in separate workspaces should produce the same result for a given
   platform, regardless of their history.

   One consequence of this is that it's important to be able to uninstall
   things cleanly, both artifacts generated by buildout and packages installed
   by it.

   - Tight control over program dependencies.  This means that different
   programs (scripts) in a buildout can depend on different versions and,
   maybe more importantly, dependencies don't leak from one program to
   another.

   Expanding on this last point, it's common for libraries to change their
   behavior depending on whether another library is importable.  With a shared
   set of packages, installing a new program could cause introduction of
   packages that change the behavior of existing programs.  This would be bad.
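
To make the repeatability and pinning goals concrete, here's a hedged
sketch of a buildout configuration.  The part, package names, and version
pins are illustrative, not taken from a real project:

```ini
[buildout]
parts = app

[app]
recipe = zc.recipe.egg
eggs = myapp

# Pinning every version is what lets two fresh workspaces produce the
# same result regardless of their history; buildout 2 consults the
# [versions] section by default.
[versions]
myapp = 1.0
zope.interface = 5.4.0
```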


These really weren't goals of setuptools, which tried to fit into
site-package-based installs and ironically resorted to unsavory techniques
to wedge eggs in using elaborate .pth files.

With buildout, I chose to use eggs differently.  I simply generate scripts
with dependencies explicitly listed in the Python path.  This is very easy
to reason about.  It makes it easy to see what's installed by looking at
the generated path.  This is similar to the way Java uses class paths to
assemble programs from jar files, which is fitting given that the design of
eggs took at least some inspiration from jar files.
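
As an illustration, here's a minimal sketch of the kind of path
manipulation a buildout-generated script performs.  The egg paths are made
up; a real generated script embeds the absolute locations of the eggs
buildout actually installed:

```python
import sys

# Hypothetical egg locations, for illustration only.  A real buildout
# script lists the exact eggs this one program depends on.
egg_paths = [
    '/opt/buildout/eggs/zope.interface-5.4.0-py3.9.egg',
    '/opt/buildout/eggs/myapp-1.0-py3.9.egg',
]

# Prepend them so these exact versions win over anything else on the
# path.  What the program can import is visible just by reading this
# list -- no need to inspect a shared site-packages directory.
sys.path[0:0] = egg_paths
```

Adding, removing, or updating a dependency is then just a matter of
regenerating this list.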

I'm not a fan of Java, but I think class paths + jar files was something it
got right.  Each program has its own distinct environment.  If you need to
add/remove/update a dependency, just change the path.  Want to share
installed components between programs?  It's easy because only one
installed component needs to be stored and different programs point to it
via their paths.

Pip's approach OTOH, leaves me skeptical.  When installing a package, it
has to unpack it and install its files into a shared directory structure
where conflicts are likely.  How does it handle these conflicts? IDK. I
should, but I don't.  I have the impression that uninstalling things can be
problematic, but maybe that's been fixed. At best, this is a lot of
complexity to maintain; at worst, uninstalls/reinstalls leave things behind
that make buildout's goal of repeatability impossible to achieve.

For isolation, pip relies on virtualenv.  This has always struck me as an
extremely heavy-handed approach. I'm quite surprised that the Python
community settled on it.  But whatever. A few packages (e.g. NumPy) really
depend on the shape of the environment they're installed into, so they
can't be installed with buildout, but can be with pip+virtualenv.

The buildout developers have discussed options for the future. We know
there's a reckoning coming, and so far, thankfully, we've been able to put
it off, but we don't want to be a burden on the rest of the Python
community. (Seriously, thank you for not breaking us. :) )

We've debated just invoking virtualenv and pip to assemble Python
applications.  A model we've been discussing is to let buildout recipes do
this.  No change is needed to buildout.  There's at least one and probably
multiple recipes that do this, although I haven't used them myself. In this
model, a buildout could install different virtualenvs for different
programs, allowing dependencies to be kept distinct.  I still worry about
the integrity of these virtualenvs over time as dependencies are added,
removed, and updated.

If I could have my way, the path of distinct package directories approach
would still be an option for buildout, as I think it's superior.  I'm
hopeful that it will be possible to use wheels this way, assuming that eggs
are retired.

I would also prefer that there be one library, or set of complementary
libraries, to find, download and install packages.  I normally like
competition, but this is such a boring and difficult domain that I don't
really see there being interest in developing multiple solutions. Honestly,
I'd be nervous if, in the long run, buildout and pip installed things
differently, especially given security considerations.

In the long run, I suspect it would be best for some buildout developers to
offer pip PRs that abstract functionality into libraries that buildout
could call (and that pip called internally), although it sounds like this
may already be happening without our help.

Jim

-- 
Jim Fulton
http://jimfulton.info
_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig