[Distutils] Re: pipenv and pip

Dan Ryan Tue, 21 Aug 2018 17:32:41 -0700

To be honest, I don’t feel like I have enough background in the larger 
ecosystem to speak broadly enough about what goes on the agenda for the core 
sprints or even what holes there are in library support.  Should pipenv’s usage 
be considered niche?  If you want to skip the listing below, a lot of user 
issues can be summarized as “why don’t things work the way they do in 
<node/rust/ruby/other languages>?”  There’s some validity to the questions.  
Whenever we get away from having to execute code to figure out dependencies, it 
will help a lot.

* Why can’t we pin our packages the way npm and yarn do by default?

  - I mean we *could*, but we (as in python) don’t enforce any kind of
    versioning, so pipenv doesn’t do this by default because it doesn’t imply
    any kind of actual guarantee (not that things necessarily work better for
    JS doing this anyway)

* Why do things resolve differently on <platform x> than they do on
  <platform y>?

* Why do things resolve differently on <python version x> than they do on
  <python version y>?

* Why do I get only the hashes for an sdist on my windows machine but when I
  go to my mac I get a hash checking failure because it downloads a wheel?

* Marker parsing, marker merging

* Basically anything related to having to execute setup.py/build wheels to
  figure out dependencies

* Parsing/normalizing/querying the index (this is how I got involved with
  pipenv in the first place)

* Shell specific configuration issues (locale/encoding issues,
  bashrc/fish config/etc)
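
As an illustration of the sdist-versus-wheel hash item in the list above, here is a minimal sketch using only the standard library. The artifact contents, file names and the `verify` helper are made up for the example; this is not pipenv's or pip's actual code, just the shape of the failure.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return a digest in the 'sha256:<hex>' form used in hash-pinned lock files."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify(artifact: bytes, allowed: set) -> bool:
    """Accept an artifact only if its digest is one of the locked digests."""
    return sha256_of(artifact) in allowed


# Stand-ins for two artifacts of the same release (contents are invented).
sdist_bytes = b"contents of example-1.0.tar.gz"
wheel_bytes = b"contents of example-1.0-cp37-cp37m-macosx_10_9_x86_64.whl"

# A lock generated on a machine that only ever saw the sdist.
locked_hashes = {sha256_of(sdist_bytes)}

# Another machine is offered the wheel for the same release instead.
sdist_ok = verify(sdist_bytes, locked_hashes)
wheel_ok = verify(wheel_bytes, locked_hashes)
print(sdist_ok, wheel_ok)

# Recording a digest for every artifact of the release is one way to
# make the check pass regardless of which file a platform downloads.
complete_hashes = {sha256_of(sdist_bytes), sha256_of(wheel_bytes)}
```

The check fails on the second machine because the lock never recorded a digest for the file that machine downloads, even though both files belong to the same release.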

> (For example, I think it would be entirely
reasonable for pipenv to say "we're not going to respect an
environment variable PIP_FIND_LINKS", but conversely it's a reasonable
user request to say "we'd like a standard way to specify local package
directories that all tools will respect").

Maybe so, but our users expect pipenv to provide all of the functionality pip 
does, plus resolution + virtualenv creation.  Since we just drop to pip on 
installation anyway, we might as well honor pip’s configuration for things like 
this.  Looking at it from a user experience perspective, I wouldn’t want to set 
the same configuration values in 2 different places just because we can’t share 
an internal API for parsing configurations – that’s why we are just using 
whatever you guys are doing, because we don’t want to force people to do this 
twice, and the information is already available.  But I guess this is the 
problem :)
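
To make the "same configuration in two places" point concrete, here is a rough sketch of how a second tool could look up pip's find-links setting without importing pip. It assumes pip's documented conventions (a `PIP_FIND_LINKS` environment variable taking precedence over a `find-links` key in the `[global]` section of an INI-style config file). The function name `find_links` and the sample paths are hypothetical, and locating the config file on disk is deliberately left out.

```python
import configparser
import os


def find_links(environ=os.environ, config_text=""):
    """Return find-links locations: environment variable first, then config text."""
    value = environ.get("PIP_FIND_LINKS")
    if value is None:
        # RawConfigParser avoids '%' interpolation surprises in URLs.
        parser = configparser.RawConfigParser()
        parser.read_string(config_text)
        value = parser.get("global", "find-links", fallback="")
    # Multiple locations are whitespace separated (spaces or newlines).
    return value.split()


# Hypothetical config contents, in the layout pip's config files use.
SAMPLE_CONFIG = """
[global]
find-links =
    /srv/wheels
    https://example.com/packages
"""

print(find_links({}, SAMPLE_CONFIG))
print(find_links({"PIP_FIND_LINKS": "/tmp/a /tmp/b"}, SAMPLE_CONFIG))
```

Even this small lookup has to duplicate rules that live inside pip, which is the maintenance burden described above.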

We haven’t really taken the time to survey this information, but it’s certainly 
possible that a standardized configuration would make sense.

That said, we’re still finding our feet, and a lot of our issues are just bugs 
on our end.  Our biggest pain point has been some combination of sanitizing and 
parsing inputs/markers -- pip does a LOT of internal parsing which is not in 
any other library (besides requirementslib, somewhat).

Dan Ryan

gh: @techalchemy <https://github.com/techalchemy>  // e: d...@danryan.co

From: Brett Cannon [mailto:br...@python.org] 
Sent: Tuesday, August 21, 2018 1:22 PM
To: Paul Moore
Cc: Distutils
Subject: [Distutils] Re: pipenv and pip

Since this ties into what's being discussed, I'll mention that on pypa-dev I 
created an outline of where I saw holes in library support and specs in order 
to be able to re-constitute pip just from libraries (mostly for the wheel 
case): https://groups.google.com/forum/#!topic/pypa-dev/91QdZ1vxLT8 .

I'll also mention I already have a design done for PEP 425 compatibility tags 
that I hope to work on at the Python core dev sprints next month so I can try 
to get it added to 'packaging'.

On Tue, 21 Aug 2018 at 05:11 Paul Moore <p.f.mo...@gmail.com> wrote:

On Tue, 21 Aug 2018 at 12:04, Tzu-ping Chung <uranu...@gmail.com> wrote:
>
> Hi,
>
> Dan and I have been doing most of the maintenance work for Pipenv recently,
> and as Dan mentioned, we have been working on some related projects that
> poke into pip internals significantly, so I feel I should voice some
> opinions. I have significantly less experience messing with pip than Dan,
> and might be able to offer a slightly different perspective.

Thanks, this is really useful.

> Pipenv mainly interacts with pip for two things: to install/uninstall/upgrade
> packages, and to gain information about a package (what versions are
> available, what dependencies a particular version has, etc.).
> For the former case, we are currently using it with subprocesses, and that
> is likely the intended way of interaction. I have to say, however, that the
> experience is not flawless. pip has a significant startup time, and does not
> offer chances for interaction once it has started running, so we really
> don’t have a good way to, for example, provide an installation progress bar
> for the user, unless we parse pip’s stdout directly.
> These are not essential to Pipenv’s functionality, however, so they are more
> of an annoyance than a glaring problem.

Yes, that's a good point. A programmatic API to do installs would
presumably give much better means of progress reporting, etc.
Unfortunately, it's nowhere near as simple as we'd like, because pip
messes with global state all over the place, so if you just called the
old "pip.main" (before we moved it) you got things like your logging
config, your IO streams and such messed up. It also broke if used in
the presence of threads. Using a subprocess isn't just to protect our
internal APIs, it's also to protect the caller's global state. So
there's more work there than it would seem, and it's likely to affect
the fundamental assumptions of a lot of pip's internal code, but I
agree it would help with a lot of use cases.
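
For reference, the subprocess approach discussed here can be sketched roughly as follows. The helper names `pip_command` and `stream_lines` are hypothetical; the demo runs a harmless stand-in child process rather than pip, so nothing is installed, and the printed lines are not real pip output.

```python
import subprocess
import sys


def pip_command(*args):
    """Build an argv that runs pip under the current interpreter."""
    return [sys.executable, "-m", "pip", *args]


def stream_lines(argv):
    """Run argv in a child process and yield its output line by line."""
    with subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    ) as proc:
        for line in proc.stdout:
            yield line.rstrip("\n")
    # Leaving the 'with' block waits for the child, so returncode is set here.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, argv)


# Stand-in child process so the sketch has no side effects.
demo = [sys.executable, "-c", "print('step 1'); print('step 2')"]
for line in stream_lines(demo):
    print("child says:", line)

# A real caller would iterate over something like
# stream_lines(pip_command("install", "some-package")) and parse each line,
# which is the stdout-scraping approach described in the quoted message.
```

The child process keeps its own logging configuration and IO streams, so the caller's global state is untouched, at the cost of interpreter startup time for every invocation.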

The subprocess overhead is also something I can relate to. I'm heavily
running the pip test suite at the moment, and I'm royally sick of the
runtime from all the process spawning :-(

But as you say, it's something we can live with for now.

> The other thing Pipenv uses pip for, getting package information, is more
> troubling (to me, personally).
> Pipenv has a slightly different need from pip regarding dependency
> resolution. pip can (and does) freely drop dependencies that do not match
> the current environment, but Pipenv needs to generate a lock file for an
> abstract platform that works for, say, both macOS and Windows. This means
> pip’s resolver is not useful for us, and we need to implement our own. Our
> own resolver, however, still needs to know about the packages it gets, and
> we are left with two choices: a. try to re-implement the same logic, or
> b. use pip internals to cobble something together.
>
> We tried to go for a. for a while, but as you’d easily imagine, our own
> implementation was buggy, could not handle edge cases nearly as well, and
> drew a lot of complaints along the lines of “I can do this in pip, why
> can’t I do the same in Pipenv”. One example is how package artifacts are
> discovered. At my own first glance, I thought to myself this wouldn’t be
> that hard: we have a simple API, and the naming conventions are there, so
> as long as we specify sources in Pipfile (we do), we should be able to
> discover them no problem.
> I couldn’t have been more wrong. There are find_links, dependency_links,
> pip.conf for the user, for the machine, all sorts of things, and for every
> quirk in pip we don’t replicate 100%, issues are filed urging us to fix it.
> In the end we gave up and used pip’s internal PackageFinder instead.

This is exactly the reason a common library/API and clear spec would
be worth working on. In effect you're having to treat "how pip works"
as a de facto standard that people expect you to follow, and that's
not practical. I've hit this issue as well (luckily, only in ad hoc
code) where I want to "get files like pip does" but doing anything
beyond a basic minimum is a nightmare.
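
The resolver difference described in the quoted message (pip dropping dependencies that do not match the current environment, versus a lock file that must stay valid across platforms) can be illustrated with a deliberately simplified sketch. The `(key, value)` conditions are a crude stand-in for PEP 508 environment markers, and the dependency list and function names are invented for the example.

```python
# Each dependency carries an optional (key, value) condition, a drastically
# simplified stand-in for a PEP 508 environment marker. The list is made up.
DEPENDENCIES = [
    ("requests", None),
    ("colorama", ("sys_platform", "win32")),
    ("appnope", ("sys_platform", "darwin")),
]


def for_environment(deps, env):
    """Installer style: keep only what applies to one concrete environment."""
    return [
        name
        for name, cond in deps
        if cond is None or env.get(cond[0]) == cond[1]
    ]


def for_lock_file(deps):
    """Lock style: keep every dependency and carry its condition along."""
    lines = []
    for name, cond in deps:
        if cond is None:
            lines.append(name)
        else:
            lines.append(f'{name}; {cond[0]} == "{cond[1]}"')
    return lines


print(for_environment(DEPENDENCIES, {"sys_platform": "win32"}))
print(for_lock_file(DEPENDENCIES))
```

The first function throws information away as soon as it knows the target environment; the second has to keep it, which is why a resolver built for one cannot simply be reused for the other.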

One thought on the package finder - distlib implements a finder, and
while it doesn't include a lot of the things you mention, it does
represent a competing implementation, and there's likely some mileage
in trying to have the two implementations converge on a reasonable
split between "standard finder behaviour" and "application (pip)
specific details". (For example, I think it would be entirely
reasonable for pipenv to say "we're not going to respect an
environment variable PIP_FIND_LINKS", but conversely it's a reasonable
user request to say "we'd like a standard way to specify local package
directories that all tools will respect").
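
As a rough idea of what the "standard finder behaviour" half of that split might cover, here is a small sketch: PEP 503 project-name normalization, the project page URL on a simple index, and matching wheel files in a flat directory listing. The function names and sample file list are hypothetical; sdists, links pages, config files and every other pip-specific source are intentionally ignored.

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name as PEP 503 specifies."""
    return re.sub(r"[-_.]+", "-", name).lower()


def project_url(index_url: str, name: str) -> str:
    """Return the project page URL on a PEP 503 simple index."""
    return index_url.rstrip("/") + "/" + normalize(name) + "/"


def matching_wheels(filenames, name):
    """Pick (version, filename) pairs for one project out of a file listing."""
    wanted = normalize(name)
    found = []
    for filename in filenames:
        if not filename.endswith(".whl"):
            continue
        # Wheel names have 5 or 6 dash-separated parts; skip anything else.
        parts = filename[: -len(".whl")].split("-")
        if len(parts) not in (5, 6):
            continue
        distribution, version = parts[0], parts[1]
        if normalize(distribution) == wanted:
            found.append((version, filename))
    return found


# Hypothetical contents of a local package directory.
FILES = [
    "Flask_SQLAlchemy-2.3.2-py2.py3-none-any.whl",
    "requests-2.19.1-py2.py3-none-any.whl",
    "notes.txt",
]

print(project_url("https://pypi.org/simple", "Flask_SQLAlchemy"))
print(matching_wheels(FILES, "flask-sqlalchemy"))
```

Everything beyond this (which directories and indexes to search, and in what order) is the application-specific part that the thread suggests standardizing separately.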

I've said it before but it bears repeating - I'd fully support someone
pulling chunks of pip's code out and making them into supported 3rd
party libraries that we could use as a client. I doubt any of the
other pip developers would object either. But doing it properly is far
from a simple undertaking, and I've yet to see any real sign of anyone
offering to actually do that work, rather than just talking about it.
With the exception of what you guys did on pipenv, and ultimately that
ended up with you giving up and calling pip's internals...

> This is a big problem going forward, and we are fully aware of that. The 
> strategy we are taking at the
> moment is to try to limit the surface area of pip internals usage. Dan 
> mentioned we have been building a
> resolver for Pipenv[1],

This really ought to be co-ordinated with Pradyun's work on the pip
resolver over at https://github.com/pradyunsg/zazo...

> and we took the chance to work toward centralising things interfacing with pip
> internals. Those are still internals, of course, but we now have a relatively 
> good idea what we actually need
> from pip, and I’d be extremely happy if some parts of pip can come out as 
> standalone with official blessing.
> The things I am particularly interested in (since they would be beneficial 
> for Pipenv) are:
>
> * VcsSupport
> * PackageFinder
> * WheelBuilder (and everything that comes with it like the wheel cache, 
> preparer, unpack_url, etc.)

All of those sound like reasonable things to consider. Although
"everything that comes with" the WheelBuilder is quite a lot, in
practice (you didn't explicitly mention the requirement object, and
that's a big one).

I don't think anyone has a problem with this sort of stuff in
principle, it's just getting the resources to do it. I wonder if any
sort of sponsorship would offer an option here? Although it's not just
the initial development, we'd need to make sure we had sustainable
support (or at least, it wasn't any worse than what we have now...)

--
Distutils-SIG mailing list -- distutils-sig@python.org
To unsubscribe send an email to distutils-sig-le...@python.org
https://mail.python.org/mm3/mailman3/lists/distutils-sig.python.org/
Message archived at 
https://mail.python.org/mm3/archives/list/distutils-sig@python.org/message/ATGMAUEQOXTARK45TQLEMD3LA26O2XRK/
