Re: [Distutils] A possible refactor/streamlining of PEP 517

Donald Stufft Sat, 15 Jul 2017 11:33:19 -0700

> On Jul 15, 2017, at 6:54 AM, Paul Moore <p.f.mo...@gmail.com> wrote:
> 
> One particularly frustrating aspect of this discussion is that the
> worst offender for "wheel and sdist are inconsistent" is the way that
> setuptools requires developers to specify build and sdist contents
> separately (setup.py vs MANIFEST.in). That duplication is an obvious
> source of potential inconsistencies, and precisely why we get most of
> the reports we see. Ideally, new backends would not design in such
> inconsistency[1], which means it's easy to see such inconsistencies as
> "that should never happen" or "I don't understand the problem". But we
> will have to deal with the possibility of such backends, and the
> setuptools model isn't *that* unusual (setuptools didn't invent the
> file MANIFEST.in, it just reused the name for its own purpose).
> 
> [1] I don't know enough about flit to be sure, but if the developer
> forgets to check in a new source file, would it be possible for that
> source file be in the wheel but not in the sdist?



I think all of the build tools that we’ve looked at so far have this problem to
some degree. Flit appears to be the least likely of the bunch to be affected by
it, because it tries really hard to yell at you when you have files that aren’t
in source control, but as Thomas has indicated, that can obviously fail when
the VCS is not available for some reason. We’re all well aware of how
distutils/setuptools has issues in this arena, and enscons has the same issue,
since you have two separate lists that get built: the list of files to add to
the sdist, and the list of files that get installed.

Which is really the fundamental error case here. Whenever you have two 
different lists of files, one for the sdist and one for the install, you risk 
having areas where those two lists diverge, which can give inconsistent results
based on exactly how those two lists differ.
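
As a rough illustration of that failure mode (a sketch only: the function name
and both file lists are made up for the example, and no particular backend is
claimed to work this way), comparing the two lists as sets makes the
divergence visible:

```python
def diverging_files(sdist_files, installed_files):
    """Return files that get installed but were never added to the sdist."""
    # A file that is installed but missing from the sdist means a wheel built
    # from the source tree differs from one built from the unpacked sdist.
    return sorted(set(installed_files) - set(sdist_files))


# Hypothetical lists, as a backend with two separate lists might compute them.
sdist_files = ["setup.py", "MANIFEST.in", "pkg/__init__.py", "pkg/core.py"]
installed_files = ["pkg/__init__.py", "pkg/core.py", "pkg/new_module.py"]

print(diverging_files(sdist_files, installed_files))
```

Files such as setup.py that appear only in the sdist are expected; the
troublesome direction is the one reported here, where something is installed
that the sdist never contained.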

One thing I’d maybe push back on is the idea that a hook can’t fail; I think
that is obviously not attainable. All of these hooks can fail for any number
of reasons. The real question is whether a failure is fatal to the entire build
process or not.

If the wheel building hook fails, that is obviously a fatal error and a front 
end has to halt execution at that point because there’s nothing left for us to 
do (this is actually a distinct change from today, because today if wheel 
building fails we fall back to trying to do a direct install).

The place that we seem to be getting held up on is trying to make it so that
a failure to build a sdist is non-fatal and execution can continue in the case
that the sdist build failed (or would have failed, depending on the order of
operations). The primary driver for sdist errors that wouldn’t necessarily also
translate to a wheel failure seems to be the lack of some external tool that
can’t be installed via pip as a build requirement. Thinking through all of the
tooling that currently exists, as well as any ideas I can think of for other
tooling, the main tools that fit into that category are VCS tools (which I
think is why they regularly get used as the example of a case where the sdist
build can fail).

I wonder if maybe it would be more useful to simply recommend that, instead of
shelling out to random VCS binaries, these projects depend on (or bundle)
libraries that interact with a repository directly. For instance, if your
project supports git, then you can use dulwich or pygit2, and then “building
inside of a docker container without `git` installed” still works.
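
To make that concrete, here is a minimal sketch of listing tracked files with
dulwich instead of running the `git` binary. The function name and the
BuildRequirementError class are hypothetical, and it assumes dulwich has been
declared as a build requirement; if it is missing, or the directory is not a
git repository, the sketch raises a fatal error rather than guessing:

```python
import os


class BuildRequirementError(Exception):
    """Hypothetical fatal error: a constraint on the build system is unmet."""


def tracked_files(repo_root):
    """Return the paths in the git index of repo_root, without running `git`."""
    try:
        # Imported lazily so a missing library becomes a clear build error.
        from dulwich.errors import NotGitRepository
        from dulwich.repo import Repo
    except ImportError as exc:
        raise BuildRequirementError("dulwich is not installed") from exc

    try:
        repo = Repo(repo_root)
    except NotGitRepository as exc:
        raise BuildRequirementError(
            f"{repo_root} is not a git repository"
        ) from exc

    try:
        # Iterating the index yields the tracked paths as bytes.
        return sorted(os.fsdecode(path) for path in repo.open_index())
    finally:
        repo.close()
```

A backend could compare this list against the files it is about to package and
complain about anything untracked, with no `git` executable on the machine.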

This is obviously not a 100% solution, since I’m sure there are going to be
some tools people want to use that simply can’t be installed as a Python
package. However, I don’t personally feel that a fatal error because you
haven’t satisfied some constraint the package has on the build system is
unreasonable. That might trigger feature requests to tools to relax
their constraints, but assuming that those constraints exist for good reason, 
then it seems easy enough to close those issues with a link to some FAQ about 
why they exist.

All of that being said, I don’t personally have a problem with the interface as
it currently exists at https://www.python.org/dev/peps/pep-0517/ (assuming
that’s the most up-to-date draft?). The inclusion of a build directory is fine
with me, though the fact that Nathaniel is concerned gives me some pause, given
he has far more experience with random build tools than I do. I have some
*other* comments
about other parts of the PEP, but I’m going to hold off on addressing them 
until we get the interface, which is the meat of the spec, nailed down and 
decided.

One thing I’ve thought about while reading this spec is that one of the
important things to do is to somewhat divorce our thinking from what
specifically pip or tox or whatever will or won’t do with it as the *only*
path, and instead make sure it’s flexible enough to implement all of the paths
that we’re still going to support. While I had been a proponent of making
VCS -> sdist -> wheel -> install the only path, it appears I am in a
minority about that (since a lot of the effort has been in trying to decide how 
best to support *not* going through sdist). If we’re going to support other 
ways, then I think being flexible is the right way to do it (as different tools 
will likely impose different constraints on how they process build directories).

One benefit of that is we can evolve the actual tooling faster than we can
evolve specs (or at least, that seems to be the case!). Any spec we create
we’re stuck with for a decade+ once it’s been implemented, but tooling itself
lives for far fewer years. That means that tooling can initially start out
being fairly strict or hardline, and then wait and see how the ecosystem reacts 
to that. We’re all making guesses about how likely one failure mode or another
is with a new crop of tools designed in this decade, and I don’t think we can
really say for sure which cases are going to be more or less common. This is
all a long-winded way of saying that on the implementation side, it may make
sense for pip to be strict VCS -> sdist -> wheel -> install at first, and see
what issues that causes for people. If barely anyone has any problems, well,
great, we’re done. If a number of folks seem to be running into issues that
could/would be solved using whatever mechanism exists for going
VCS -> wheel -> install, then we can start adding an option (eventually
migrating it to on by default, and then removing the option) to support doing
that [2]. As long as the backend API is there, we can make decisions more
“on the fly”.
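
A frontend that starts out strict and later grows an opt-in could be sketched
roughly like this. This is not how pip is implemented: obtain_wheel,
FatalBuildError, the allow_direct_wheel option and the stub backend are all
hypothetical, and only the hook names build_sdist and build_wheel come from the
PEP:

```python
class FatalBuildError(Exception):
    """Raised when the frontend has nothing left to try."""


def obtain_wheel(backend, out_dir, allow_direct_wheel=False):
    """Return (wheel_name, sdist_name); sdist_name is None if it was skipped."""
    sdist_name = None
    try:
        sdist_name = backend.build_sdist(out_dir)
    except Exception as exc:
        if not allow_direct_wheel:
            # Strict mode: VCS -> sdist -> wheel -> install is the only path.
            raise FatalBuildError(f"sdist build failed: {exc}") from exc
        # Relaxed mode: fall through to VCS -> wheel -> install.

    # A real frontend would unpack the sdist and build the wheel from the
    # unpacked tree; that step is omitted to keep the sketch short.
    try:
        wheel_name = backend.build_wheel(out_dir)
    except Exception as exc:
        # A wheel failure is always fatal, whichever path was taken.
        raise FatalBuildError(f"wheel build failed: {exc}") from exc
    return wheel_name, sdist_name


class NoGitBackend:
    """Stub backend whose sdist hook needs a VCS tool that is missing."""

    def build_sdist(self, sdist_directory, config_settings=None):
        raise RuntimeError("git executable not found")

    def build_wheel(self, wheel_directory, config_settings=None,
                    metadata_directory=None):
        return "example-1.0-py3-none-any.whl"


print(obtain_wheel(NoGitBackend(), "dist", allow_direct_wheel=True))
```

The point of the flag is that flipping its default later is a tooling decision,
not a spec change, as long as the backend exposes both hooks.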

All of that is a long-winded way of saying I don’t particularly care whether
the VCS -> wheel -> install path is specified as *always* doing in-place builds
or whether we add a build directory to choose between out-of-place and in-place
builds. Having a robust mechanism in place for doing that means we can adjust
how things *typically* work without going back to the PEP process and throwing
everything away.

Hopefully that all makes sense and is a useful sort of dumping of thoughts.

[1] One note: I noticed there are still instances of prepare_wheel_metadata in
the text.

[2] For example, we’ve recently done this with --upgrade in order to better
support projects like NumPy. The way pip works isn’t set in stone, and as we
get more experience with new things we can adjust it.

—
Donald Stufft

_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig
