I'd like to suggest a change to the Debian Policy around Python packages that
will help the world of Python packaging continue to move forward.

First, a little bit of background:

At the Python level there are three metadata formats for Python packaging:

* The original, setuptools style .egg-info directories.
* The distutils style .egg-info *file* added to distutils at some point.
* The new and improved, wheel based .dist-info directories.

The presence of any of these will signal to Python tools that a particular
distribution has been installed. However, there are two fairly major
differences between the distutils style and the other two:

1. The distutils style has no provisions to record what files on the system
   belong to the installed distribution, making it appear to Python tooling
   that there *are* no files other than the metadata file itself.

2. The distutils style has no provisions to include additional metadata files
   in the metadata, making it impossible to extend the python level metadata
   with additional files.
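
To make the distinction concrete, here is a rough sketch of how a tool could
tell the three styles apart when looking at a site-packages style directory.
This is not pip's actual code. The function names are made up for
illustration, and the file list names (RECORD for .dist-info,
installed-files.txt for a setuptools .egg-info directory) are the ones I'd
expect to find, so treat them as assumptions.

```python
import os


def classify_metadata(path):
    """Return (style, file_list_name) for a metadata entry, or None.

    file_list_name is the file inside the metadata directory that is
    assumed to record installed files; it is None for the distutils
    style, which is a single file and cannot contain anything else.
    """
    name = os.path.basename(path)
    if name.endswith(".dist-info") and os.path.isdir(path):
        return ("dist-info directory", "RECORD")
    if name.endswith(".egg-info") and os.path.isdir(path):
        return ("setuptools egg-info directory", "installed-files.txt")
    if name.endswith(".egg-info") and os.path.isfile(path):
        return ("distutils egg-info file", None)
    return None


def scan(site_dir):
    """Map each metadata entry found in site_dir to its classification."""
    results = {}
    for entry in sorted(os.listdir(site_dir)):
        kind = classify_metadata(os.path.join(site_dir, entry))
        if kind is not None:
            results[entry] = kind
    return results
```

Anything that comes back with None as the file list name is a case where
tooling has no way to learn which files belong to the distribution.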

I have a series of improvements that I'd like to make to the packaging
toolchain that build on one another, but which are not going to function
correctly with the distutils style metadata. I'm hoping that I can convince
y'all to make it policy to default to generating one of the other two kinds
(with varying methods, more on that later).

Concretely, the thing that this is blocking right now is the following: in the
newly released pip 8.0 I tried to make pip refuse to uninstall a project that
is installed with distutils style metadata. This is because we do not have any
way to associate the actual .py (and other) files on disk with the installed
metadata, so all we have ever done is simply remove the metadata file, making
it appear as if the item is uninstalled but leaving behind all of the actual
files. However, I'm going to be reverting this in a pip 8.0.1 release because
it caused a decent amount of breakage amongst pip's users, almost all of them
people who are attempting to upgrade OS provided packages using pip.

Now, I know that upgrading OS provided packages using pip is less than
optimal, and I would greatly prefer that people did not do it. However, if we
don't enable people to do it, they'll just continue to use an old version of
pip and file bugs. It's a non-starter for pip to make it impossible to do.

In addition to the uninstall bit, it also means that things like pip show -f
return junk information for packages installed in this way.

Beyond just (eventually) allowing pip to refuse to uninstall distutils based
installs, this will start to allow some other future changes that I think
will be more interesting to Debian. Uninstalling distutils based installs
comes hand in hand with pip stomping all over already existing files willy
nilly, because the way upgrading a project like that works is that pip removes
the metadata file that says X is installed, then just overwrites any files
that happen to be in its way when it installs the newer version. If we can
remove the need for pip to gleefully overwrite files to support these types
of installed packages, then we can make it so pip will hard fail if it
attempts to overwrite an already existing file on disk.
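
As a sketch of what that hard failure could look like (again, not pip's
implementation; FileConflictError, check_no_overwrite and the owned_files
parameter are hypothetical names), the installer would compare the files it is
about to write against what is already on disk, ignoring files that the
previous version's metadata says it owns:

```python
import os


class FileConflictError(Exception):
    """Raised when an install would overwrite a file it does not own."""


def check_no_overwrite(target_dir, new_files, owned_files=()):
    """Fail if any file in new_files already exists and is not owned.

    new_files and owned_files are paths relative to target_dir.
    owned_files stands for the files recorded for the currently
    installed version of the same project, which a proper uninstall
    would remove before the new version is written.
    """
    owned = set(owned_files)
    conflicts = [
        rel for rel in new_files
        if rel not in owned and os.path.lexists(os.path.join(target_dir, rel))
    ]
    if conflicts:
        raise FileConflictError(
            "refusing to overwrite: " + ", ".join(sorted(conflicts))
        )
```

With distutils style metadata owned_files is always empty, so every upgrade
would trip this check. That is why the overwrite currently has to be allowed.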

An additional benefit here is that by switching to the directory based
options, we can add additional metadata files to the installed projects, much
like the INSTALLER file from PEP 376 (IIRC). This file will likely be the path
to having pip refuse to touch OS owned files altogether without some sort of
--force flag to override the safety switch.
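
A minimal sketch of that safety switch, assuming the metadata directory
contains an INSTALLER file holding the name of the tool that installed the
project. The helper names, the "pip" default and the choice to allow
modification when no INSTALLER file exists are all assumptions made for
illustration, not decided behaviour:

```python
import os


def read_installer(metadata_dir):
    """Return the tool name recorded in INSTALLER, or None if absent."""
    path = os.path.join(metadata_dir, "INSTALLER")
    try:
        with open(path, encoding="utf-8") as fh:
            return fh.read().strip() or None
    except (FileNotFoundError, NotADirectoryError):
        # No INSTALLER file, or the metadata is a single file
        # (distutils style) and cannot contain one.
        return None


def may_modify(metadata_dir, tool="pip", force=False):
    """Decide whether `tool` should touch this installed project.

    Projects installed by another tool (for example the OS package
    manager) are left alone unless force is set.
    """
    installer = read_installer(metadata_dir)
    return force or installer is None or installer == tool
```

Note that this only works with the directory based formats, since the
distutils style has nowhere to put the INSTALLER file.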

As far as compatibility goes, pip has always forced everything to be installed
using setuptools and as far as I am aware, there's no real fallout from doing
so. I think in 2016 it's pretty reasonable to assume that a Python project is
capable of being installed using setuptools instead of distutils.

So, without getting into the actual *method* of doing this (of which there are
several different options with different trade-offs), does this sound like
something that Debian would be at all interested in?

Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
