On Tue, 21 Aug 2018 at 12:04, Tzu-ping Chung <uranu...@gmail.com> wrote:
>
> Hi,
>
> Dan and I have been doing most of the maintenance work for Pipenv recently, 
> and as Dan mentioned,
> we have been working on some related projects that poke into pip internals 
> significantly, so I feel I
> should voice some opinions. I have significantly less experience messing with 
> pip than Dan, and might
> be able to offer a slightly different perspective.

Thanks, this is really useful.

> Pipenv mainly interacts with pip for two things: install/uninstall/upgrade 
> packages, and to gain information
> about a package (what versions are available, what dependencies a 
> particular version has, etc.).
> For the former case, we are currently using it with subprocesses, and it is 
> likely the intended way of
> interaction. I have to say, however, that the experience is not flawless. pip 
> has a significant startup time,
> and does not offer chances for interaction once it is running, so 
> we really don’t have a good
> way to, for example, provide an installation progress bar for the user, unless 
> we parse pip’s stdout directly.
> These are not essential to Pipenv’s functionality, however, so they are more 
> annoyances
> than glaring problems.

Yes, that's a good point. A programmatic API to do installs would
presumably give much better means of progress reporting, etc.
Unfortunately, it's nowhere near as simple as we'd like, because pip
messes with global state all over the place, so if you just called the
old "pip.main" (before we moved it) you got things like your logging
config, your IO streams and such messed up. It also broke if used in
the presence of threads. Using a subprocess isn't just to protect our
internal APIs, it's also to protect the caller's global state. So
there's more work there than it would seem, and it's likely to affect
the fundamental assumptions of a lot of pip's internal code, but I
agree it would help with a lot of use cases.
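
As a rough illustration of the subprocess approach discussed above, the
sketch below runs a command in a child process and hands each line of
its output to a callback. That is about as much progress information as
a caller can get today. The helper names (stream_command, pip_command)
are made up for this example; only "python -m pip" and the standard
subprocess module are real. pip's output is not a stable interface, so
anything that interprets those lines is relying on an assumption.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a command line that runs pip with the current interpreter."""
    return [sys.executable, "-m", "pip", *args]


def stream_command(cmd, on_line):
    """Run cmd in a child process and pass each output line to on_line.

    Returns the exit code. The caller's logging configuration and IO
    streams are untouched because the command runs in its own process.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so lines stay in order
        text=True,
    )
    with proc.stdout:
        for line in proc.stdout:
            on_line(line.rstrip("\n"))
    return proc.wait()


if __name__ == "__main__":
    # Example: print pip's version line by line as it arrives.
    exit_code = stream_command(pip_command("--version"), print)
    print("pip exited with", exit_code)
```

A caller could pass a callback that updates a spinner or a log view, but
the cost of starting a new interpreter for every call remains.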

The subprocess overhead is also something I can relate to. I'm heavily
running the pip test suite at the moment, and I'm royally sick of the
runtime from all the process spawning :-(

But as you say, it's something we can live with for now.

> The other thing Pipenv uses pip for—getting package information—is more 
> troubling (to me, personally).
> Pipenv has a slightly different need from pip regarding dependency 
> resolution. pip can (and does) freely
> drop dependencies that do not match the current environment, but Pipenv 
> needs to generate a lock file
> for an abstract platform that works for, say, both macOS and Windows. This 
> means pip’s resolver is not
> useful for us, and we need to implement our own. Our own resolver, however, 
> still needs to know about
> packages it gets, and we are left with two choices: a. try to re-implement the 
> same logic, or b. use pip internals
> to cobble something together.
>
> We tried to go for a. for a while, but as you’d easily imagine, our own 
> implementation was buggy, could not
> handle edge cases nearly as well, and drew a lot of complaints along the 
> lines of “I can do this in pip, why
> can’t I do the same in Pipenv”. One example is how package artifacts are 
> discovered. At my own first
> glance, I thought to myself this wouldn’t be that hard—we have a simple API, 
> and the naming conventions are
> there, so as long as we specify sources in Pipfile (we do), we should be able 
> to discover them no problem.
> I couldn’t have been more wrong. There are find_links, dependency_links, pip.conf 
> for the user, for the machine, all
> sorts of things, and for every quirk in pip we don’t replicate 100%, 
> issues are filed urging us to fix it.
> In the end we gave up and used pip’s internal PackageFinder instead.

This is exactly the reason a common library/API and clear spec would
be worth working on. In effect you're having to treat "how pip works"
as a de facto standard that people expect you to follow, and that's
not practical. I've hit this issue as well (luckily, only in ad hoc
code) where I want to "get files like pip does" but doing anything
beyond a basic minimum is a nightmare.
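
To make the resolver difference described in the quoted text concrete,
here is a minimal sketch of the two behaviours. It is not how pip or
Pipenv is implemented. The data model (a name plus an optional
(variable, value) marker pair), the function names and the dependency
list are all invented for illustration; only the marker variable
sys_platform comes from the dependency specification standard.

```python
# Hypothetical dependency list: (name, marker), where marker is None or a
# (variable, value) pair meaning "only needed when variable == value".
DEPENDENCIES = [
    ("requests", None),
    ("pywin32", ("sys_platform", "win32")),
    ("appnope", ("sys_platform", "darwin")),
]


def applies(marker, environment):
    """Return True if the marker is absent or matches the environment."""
    return marker is None or environment.get(marker[0]) == marker[1]


def resolve_for_environment(dependencies, environment):
    """Install-time view: drop anything that does not match right now."""
    return [name for name, marker in dependencies if applies(marker, environment)]


def resolve_for_lock(dependencies):
    """Lock-file view: keep every dependency and record its marker."""
    locked = {}
    for name, marker in dependencies:
        locked[name] = None if marker is None else "{} == '{}'".format(*marker)
    return locked


if __name__ == "__main__":
    print(resolve_for_environment(DEPENDENCIES, {"sys_platform": "linux"}))
    print(resolve_for_lock(DEPENDENCIES))
```

The first function loses information that the second has to keep, which
is why a resolver that only answers "what do I install here?" cannot be
reused as-is to produce a lock file for several platforms.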

One thought on the package finder - distlib implements a finder, and
while it doesn't include a lot of the things you mention, it does
represent a competing implementation, and there's likely some mileage
in trying to have the two implementations converge on a reasonable
split between "standard finder behaviour" and "application (pip)
specific details". (For example, I think it would be entirely
reasonable for pipenv to say "we're not going to respect an
environment variable PIP_FIND_LINKS", but conversely it's a reasonable
user request to say "we'd like a standard way to specify local package
directories that all tools will respect").
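
For a sense of what the "basic minimum" of a finder looks like, the
sketch below picks the wheels for one project out of a list of
filenames, using only the wheel filename convention (PEP 427) and
project name normalization (PEP 503). It is a deliberately small
example and not pip's or distlib's implementation: it ignores sdists,
tag compatibility, index pages, configuration files and environment
variables, which is where most of the real difficulty lies. The
function names are hypothetical.

```python
import re


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_wheel_filename(filename):
    """Return (normalized name, version) for a wheel filename, else None.

    PEP 427 wheel names have five dash-separated fields, or six when an
    optional build tag is present.
    """
    if not filename.endswith(".whl"):
        return None
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        return None
    return normalize(parts[0]), parts[1]


def select_wheels(filenames, project):
    """Return (version, filename) pairs for wheels belonging to project."""
    wanted = normalize(project)
    found = []
    for filename in filenames:
        parsed = parse_wheel_filename(filename)
        if parsed is not None and parsed[0] == wanted:
            found.append((parsed[1], filename))
    return found


if __name__ == "__main__":
    listing = [
        "foo_bar-1.0-py3-none-any.whl",
        "foo_bar-2.0-py3-none-any.whl",
        "other-1.0-py3-none-any.whl",
        "notes.txt",
    ]
    print(select_wheels(listing, "Foo.Bar"))
```

The filenames could come from listing a local package directory. Where
that directory is configured is exactly the application-specific part
that a shared library would leave to each tool.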

I've said it before but it bears repeating - I'd fully support someone
pulling chunks of pip's code out and making them into supported 3rd
party libraries that we could use as a client. I doubt any of the
other pip developers would object either. But doing it properly is far
from a simple undertaking, and I've yet to see any real sign of anyone
offering to actually do that work, rather than just talking about it.
With the exception of what you guys did on pipenv, and ultimately that
ended up with you giving up and calling pip's internals...

> This is a big problem going forward, and we are fully aware of that. The 
> strategy we are taking at the
> moment is to try to limit the surface area of pip internals usage. Dan 
> mentioned we have been building a
> resolver for Pipenv[1],

This really ought to be co-ordinated with Pradyun's work on the pip
resolver over at https://github.com/pradyunsg/zazo...

> and we took the chance to work toward centralising things interfacing with pip
> internals. Those are still internals, of course, but we now have a relatively 
> good idea what we actually need
> from pip, and I’d be extremely happy if some parts of pip can come out as 
> standalone with official blessing.
> The things I am particularly interested in (since they would be beneficial 
> for Pipenv) are:
>
> * VcsSupport
> * PackageFinder
> * WheelBuilder (and everything that comes with it like the wheel cache, 
> preparer, unpack_url, etc.)

All of those sound like reasonable things to consider. Although
"everything that comes with" the WheelBuilder is quite a lot, in
practice (you didn't explicitly mention the requirement object, and
that's a big one).

I don't think anyone has a problem with this sort of stuff in
principle, it's just getting the resources to do it. I wonder if any
sort of sponsorship would offer an option here? Although it's not just
the initial development, we'd need to make sure we had sustainable
support (or at least, it wasn't any worse than what we have now...)

Paul
--
Distutils-SIG mailing list -- distutils-sig@python.org
To unsubscribe send an email to distutils-sig-le...@python.org
https://mail.python.org/mm3/mailman3/lists/distutils-sig.python.org/
Message archived at 
https://mail.python.org/mm3/archives/list/distutils-sig@python.org/message/SZ2PMUD7G6IUOYACWNWQKOKUT7AAN77I/
