Now that the basic wheels/pip/PyPI infrastructure is mostly
functional, there's been a lot of interest in improving higher-level
project workflow. We have a lot of powerful tools for this –
virtualenv, pyenv, conda, tox, pipenv, poetry, ... – and more in
development, like PEP 582 [1], which adds support for project-local
package directories (`__pypackages__/`) directly to the interpreter.

But to me it feels like right now, Python workflow tools are like the
blind men and the elephant [2]. Each group sees one part of the
problem, and so we end up with one set of people building legs,
another a trunk, a third some ears... and there's no overall plan for
how they can fit together.

For example, PEP 582 is trying to solve the problem that virtualenv is
really hard to use for beginners who are just starting out [3]. This
is a serious problem! But I don't want a solution that *only* works
for beginners, so that once they get a little more sophisticated they
have to throw it out and learn something new from scratch.

So I think now might be a good time for a bit of top-down design. **I
want a picture of the elephant.** If we had that, maybe we could see how
all these different ideas could be put together into a coherent whole.
So at the Python core sprint a few weeks ago, I dragged some
interested parties [4] into a room with a whiteboard [5], and we made
a start at it. And now I'm writing it up to share with you all.

This is very much a draft, intended as a seed for discussion, not a conclusion.

[1] https://www.python.org/dev/peps/pep-0582/
[2] https://en.wikipedia.org/wiki/Blind_men_and_an_elephant
[3] https://www.python.org/dev/peps/pep-0582/#motivation
[4] I won't try to list names, because I know I'll forget someone, and
I don't know if everyone would agree with everything I wrote there.
But thank you all!
[5] https://photos.app.goo.gl/4HfY8P3ESPNi9oLMA, including special
guest appearance by Kushal's elbow


# The idealized lifecycle of a Python project

## 1. Beginner

Everyone starts out as a rank beginner. This may be the first time
they have programmed at all. At this stage, users want to:

- install *one* thing to get started (e.g. python itself)
- write and run simple scripts (standalone .py files)
- run a REPL
- install and use PyPI packages like requests or numpy
- install and use tools like jupyter
- have their IDE find these packages/tools as well

Over time, they'll probably end up with multiple scripts, and maybe
want to organize them into subdirectories. The above should all work
from subdirectories.

## 2. Sharing with others

Now we have a neat little script. Or maybe we've made a pretty jupyter
notebook that computes some crucial business analytics. We want to
share it with our friends or coworkers. We still need the features
above; and now we also care about:

- version control
- some way for our friend to reconstruct, on their computer:
  - the same PyPI packages that we were using
  - the same tools that we were using
  - the ways we invoked those tools

This last point is important: as projects grow in complexity, and are
used by a wider audience, they often end up with fairly complex tool
specifications that have to be shared among a team. For example:

- to run tests: in an environment that has pytest, pytest-cov, and
pytest-trio installed, and with our project working directory on
PYTHONPATH, run `pytest -Werror --cov ...`
- to format code: in an environment using python 3.6 or later, that
has black installed, run `black -l 79 *.py my-util-directory/*.py`

This kind of tool specification also puts us in a good position to set
up CI when we reach that point.
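
To make that concrete, here's one way those two specifications might
be written down today as a tox.ini (tox comes up again below;
`{posargs}` is tox's placeholder for extra command-line arguments, and
the project layout is made up):

```ini
[tox]
envlist = test
skipsdist = true

[testenv:test]
deps =
    pytest
    pytest-cov
    pytest-trio
setenv =
    PYTHONPATH = {toxinidir}
commands = pytest -Werror --cov {posargs}

[testenv:reformat]
# tox can't express "3.6 or later" directly, so pin one version here.
basepython = python3.6
deps = black
commands = black -l 79 {posargs}
```

Then `tox -e test` runs the tests, and `tox -e reformat -- *.py
my-util-directory/*.py` reformats.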

At this point our project can grow in a few different directions.


## 3a. Deployable webapp

This adds the requirement to "deploy". I think this is mostly covered
by the set-up-an-environment-to-run-a-command functionality already
described? I'm not super familiar with this, but it's pipenv's core
target, and pipenv doesn't have much more than that, so I assume
that's about right...

## 3b. Reusable library

For this we also need to:

- Build sdists and wheels
  - Which means: pyproject.toml, and some way to invoke it
- Install our library into our environments
  - Including dependency locking (best practice is not to pin
dependencies in wheel metadata, but to pin all dependencies in CI; so
there needs to be some way to track those separately, yet integrated
enough that adding or changing a dependency isn't a huge ceremony; see
the sketch below)
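
A rough sketch of what that split looks like in practice today
(package names and version numbers are just illustrative):

```
# Wheel metadata (e.g. install_requires): loose constraints only
requests >= 2.0
attrs

# Separate lock file, used by CI: everything pinned exactly,
# including transitive dependencies
requests == 2.19.1
attrs == 18.2.0
idna == 2.7
```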

## 3c. Reusable standalone app

I think this is pretty much like the "Reusable library", except that
it'd be nice to have better tools to build/distribute standalone
applications. But if we had them, we could invoke them the same way as
we invoke other build systems?


# How do existing tools/proposals fit into this picture?

pyenv, virtualenv, and conda all solve parts of the "create an
environment" problem, but consider the other aspects out-of-scope.

tox solves the problem of keeping a shared record of how to run a
bunch of different tools in the appropriate environments, but doesn't
handle pinning or procuring appropriate python versions, and requires
a separate bootstrapping step to install tox.

`__pypackages__` (if implemented) makes it very easy for beginners to
use PyPI packages in their own scripts and from the REPL; in
particular, it would be part of python, so it meets the "install *one*
thing" criterion. But, it doesn't provide any way to run tools.
(There's no way to put `__pypackages__/bin` on PATH.) It doesn't allow
scripts to be organized into subdirectories. (For security reasons, we
can't have the python interpreter going off walking the filesystem
looking for `__pypackages__/`, so the PEP specifies that
`__pypackages__/` has to be in the same directory as the script that
uses it.) There's no way to share your `__pypackages__` environment
with a friend. So... it seems like something that people would
outgrow very quickly.
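
For reference, the layout PEP 582 specifies looks roughly like this
(the version-specific `3.7/lib/` structure is from the PEP; the file
names are made up):

```
myproject/
    myscript.py       # `python myscript.py` sees __pypackages__/
    __pypackages__/
        3.7/
            lib/
                requests/
                ...
    subdir/
        helper.py     # does NOT see ../__pypackages__/
```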

pipenv and poetry are interesting. Their basic strategy is to say,
there is a top-level command that acts as your entry point to
performing workflow actions on a python project (`pipenv` or
`poetry`, respectively). And this strategy at least in principle can
solve the problems that `__pypackages__/` runs into. In particular, it
doesn't rely on `$PATH`, so it can run tools; and because it's a
dedicated project management tool, it can go looking for the project
marker file.
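
Concretely, today's versions of that strategy look something like
this:

```
$ pipenv install requests      # records it in Pipfile, pins it in Pipfile.lock
$ pipenv run python script.py  # runs in the project's environment

$ poetry add requests          # records it in pyproject.toml, plus the lock file
$ poetry run pytest            # no need for __pypackages__/bin on PATH
```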


# A fantastic elephant

So if our idealized user had an idealized tool, what would that look like?

They'll be interacting with Python through a dedicated tool, similar
to pipenv or poetry. In my little fantasy here I'll call it `pyp`,
because (a) I want to be neutral, (b) 6 characters is too long.

To get this tool, either they install Python (via python.org
download, apt, homebrew, whatever) and the tool comes included, or
they install the tool directly and it knows how to install Python
interpreters when needed.

Once they have the tool, they start by making a new directory for
their project (this way they're ready to switch to version control
later).

Then they somehow mark this directory as being a "python project
root". I guess the UI would be something like `pyp new <name>` and it
just does it for you, but we have to figure out what this creates on
disk. We need some sort of marker file. Files that currently serve
this kind of role include tox.ini, Pipfile, pyproject.toml,
`__pypackages__`, ... But only one of these is a standard thing we're
already committed to sticking with, so, pyproject.toml it is. Let's
make it the marker for any python project, not just redistributable
libraries. (And if we do grow up into a redistributable library, then
we're already prepared.)

In the initial default configuration, there's a single default
environment. You can install things with `pyp install ...` or `pyp
uninstall ...`, and it tracks the requested packages in some
standardized way in pyproject.toml, and also pins specific versions
somewhere (could be pyproject.toml again I guess, or poetry's
pyproject.lock would work too). This way when we decide to share our
project later, our friends can recreate our environment on their
system.

However, there's also the capability to configure multiple custom
execution environments, including python version and installed
packages. And the capability to configure new aliases like `pyp test`
or `pyp reformat`, which run some specified command in a specified
environment.
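
To make the fantasy a bit more concrete, here's a purely hypothetical
sketch of what pyp might record in pyproject.toml. Every table and key
name below is invented for illustration; nothing here is standardized:

```toml
# Hypothetical pyp configuration, living in the reserved [tool.*]
# namespace that PEP 518 gives us.

[tool.pyp.environments.default]
python = "3.7"
requires = ["requests", "numpy"]   # what `pyp install` appends to

[tool.pyp.environments.test]
python = "3.7"
requires = ["pytest", "pytest-cov", "pytest-trio"]

[tool.pyp.aliases]
test = { env = "test", run = "pytest -Werror --cov" }
reformat = { env = "default", run = "black -l 79 *.py" }
```

With something like this, `pyp test` would mean: make sure the `test`
environment exists and matches its pins, then run `pytest -Werror
--cov` inside it.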

Since the install/locking metadata is all standardized, you can even
switch between competing tools, and integrate with third-party tools
like pyup.io.

For redistributable libraries, we also need some way to get the wheel
metadata and the workflow metadata to play nicely together. Maybe this
means that we need a standardized install-requires field in
pyproject.toml, so that build backends and workflow tools have a
shared source of truth?
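
For example (again purely hypothetical; the `[package]` table and
field name are invented):

```toml
# A standardized field that both the build backend (to generate wheel
# metadata) and the workflow tool (to know what to install and lock)
# could treat as the shared source of truth:
[package]
name = "myproject"
install-requires = ["requests >= 2.0", "attrs"]
```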


# What's wrong with pipenv?

Since pipenv is the tool that those of us in the room were most
familiar with, and the one that comes closest to matching this vision,
we brainstormed a list of complaints about it. Some of these are more
reasonable than others.

- Not ambitious enough. This is a fuzzy sort of thing, but perception
matters, and it's right there in the name: it's a tool to use pip, to
manage an environment. If we're reconceiving this as the grand unified
entryway to all of Python, then the name starts to feel pretty weird.
The whole thing where it's only intended to work for webapp-style
projects would have to change.

- Uses Pipfile as a project marker instead of pyproject.toml.

- Not shipped with Python. (Obviously not pipenv's fault, but nonetheless.)

- Environments should be stored in project directory, not off in $HOME
somewhere. (Not sure what this is about, but some of the folks present
were quite insistent.)

- Environments should be relocatable.

- Hardcoded to only support "default" and "dev" environments, which is
insufficient.

- No mechanism for sharing prespecified commands like "run tests" or "reformat".

- Can't install Python. (There's... really no reason we *couldn't*
distribute pre-built Python interpreters on PyPI. Between the
python.org installers and the manylinux image, we're already building
redistributable run-anywhere binaries for the most popular platforms
on every Python release; we just aren't zipping them up and putting
them on PyPI.)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org