On Wednesday, May 23, 2018, Michael Sarahan <msara...@anaconda.com> wrote:
>
> On Wed, May 23, 2018 at 3:45 PM, Wes Turner <wes.tur...@gmail.com> wrote:
>
>> On Wednesday, May 23, 2018, Michael Sarahan <msara...@anaconda.com>
>> wrote:
>>
>>> Thanks for starting this discussion, Victor! This topic is something
>>> we're very interested in at Anaconda.
>>>
>>> I'd like to generalize the problem statement to the question of "how can
>>> we make pip behave well when it is sharing package management with
>>> something else? Similarly, how can we make the something else behave well
>>> with pip?"
>>>
>>> We all share the pain of trying to have two package managers effectively
>>> manage the same space. The alternate-folder-for-pip approach is a good
>>> idea, but ultimately has issues that you pointed out.
>>>
>>> After several great conversations with many people at PyCon last week, I
>>> came to the conclusion that conda and pip probably won't ever interoperate
>>> very well. In order for them to do so, conda must respect all of pip's
>>> constraints during its solving of dependencies, and likewise, pip must
>>> respect conda's constraints. We are investigating the first of those
>>> options and having some promising initial results, but the inverse is not
>>> something that seems feasible. It amounts to pip having a solver that is
>>> at least as good as the best supported package manager, and pip learning
>>> about *all* of any other package managers that it claims to be compatible
>>> with. That doesn't seem like a viable project.
>>>
>> What's the status on this? AFAIU, depsolving setup.py packages is blocked
>> because it's necessary to execute setup.py to get the conditional
>> requirements?
>>
>
> Yes, executing setup.py is a blocker. Wheels are improving this. The
> main complicating factor is conditional or optional dependencies, as you
> say, and how to express or execute the branch logic required. setup.py
> need not be executed all the time - only enough to gather the metadata to
> feed an index.

It's much more of a security concern than a time concern.

> A first rough approach that didn't handle optional or conditional
> requirements seems reasonable as a proof of concept.
>
>> https://www.pypa.io/en/latest/roadmap/#pip-dependency-resolution
>>
>>> Ultimately, I believe a better approach is for the PyPA to define a
>>> minimal set of functionality and interfaces to PyPI that any package
>>> manager claiming to manage python packages must implement. Pip can be a
>>> reference implementation of that specification. Any distributor (Red Hat,
>>> Canonical, Homebrew, Anaconda) could then have their own implementations
>>> that use their solvers, but also can install software from PyPI at user's
>>> request, or as a fall-through when a native package is unavailable.
>>>
>> Metadata compatibility and adapter registration could help solve for
>> this; though there's no money and not much demand. #PEP426JSONLD
>>
>
> With adapter registration, what would the adapters be? I'm just not sure
> what pip's role should be. If you define interfaces for solving and
> installing/uninstalling/upgrading the solved-for set of packages, maybe
> that's enough?
>

Plugins/adapters/interface implementations.
http://zopeinterface.readthedocs.io/en/latest/adapter.html

Perhaps ironically and conveniently, the well-regarded pluggy system for plugins does not depend on setuptools entry_points:
https://pluggy.readthedocs.io/en/latest/

Some way to avoid adding plugin/adapter registration overhead in site.py would be necessary.

> #PEP426JSONLD looks great, but do you see that as a glue between PyPI and
> pip, pip and other tools, or other tools and PyPI? I am impressed and
> encouraged by your research into the topic, and I'd like to know if there's
> a way that we can help with it if it would further our goals of having
> conda be able to either interop with pip happily or install directly from
> PyPI.
>

I piggybacked onto PEP426 because if we were going to substantially change metadata, we might as well make it linked data; ideally with a cross-language spec, which would unfortunately take years to get EVERY vendor to comply with.

Metadata interoperability wouldn't be strictly necessary to achieve what I think you're describing; but otherwise there'd be such a disjoint dependency graph that determining what we've installed here and where we need to be would be frustratingly complex.

>
>>> User interface could be unified by having "pip" on distributions be a
>>> wrapper around the native package manager, matching the exact minimal
>>> behavior of pip.
>>>
>> Would `sudo pip` then be the only way?
>>
>
> Heavens, no. I'm only concerned about the myriad blog posts out there that
> tell people to [sudo] pip install something. That is the user interface
> that exists and is most common. It is the universal command that might not
> be the right option, but by golly it's always an option. There is great
> value in having only one way to tell people to do things. I'm proposing
> that we make the underlying implementation of that user interface be
> vendor-specific. Vendors can and should keep their own interfaces to
> package management, too, because those are broader in scope than pip.
>

I'm not sure whether it'd be easier to debug interleaved calls to various package managers. Or to explain why `pip install` didn't work on my machine.

Though I have often wondered whether I need to do `conda skeleton pypi` (or fork the conda-forge template), and then manually merge version changes stably.

There would need to be a map between PyPI package names, conda package names, and (os, ver) package names; which I think I addressed in that smattering of notes on the PEP426JSONLD issue.

That catalog-to-catalog mapping data would need to be hosted somewhere. Warehouse can easily serve JSON if it's defined in the package.
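
To make the catalog-to-catalog idea a bit more concrete, here is a rough sketch of what such a map could look like as plain JSON-serializable data. Everything in it is an assumption for illustration: `NAME_MAP`, `lookup`, the `"distro/version"` key format, and the sample entries are hypothetical and not an existing PyPI, Warehouse, or conda interface. Only the name normalization follows a real rule (PEP 503).

```python
import json
import re

# Hypothetical catalog keyed by normalized PyPI name. The entries and the
# "distro/version" key format are illustrative assumptions, not real data
# served by any index.
NAME_MAP = {
    "pyyaml": {"conda": "pyyaml", "debian/9": "python3-yaml"},
    "python-dateutil": {
        "conda": "python-dateutil",
        "debian/9": "python3-dateutil",
    },
}


def normalize(name):
    """Normalize a project name following the PEP 503 rule."""
    return re.sub(r"[-_.]+", "-", name).lower()


def lookup(pypi_name, target):
    """Return the package name for `target`, or None if it is not mapped."""
    return NAME_MAP.get(normalize(pypi_name), {}).get(target)


if __name__ == "__main__":
    print(lookup("PyYAML", "debian/9"))
    # The whole catalog is plain data, so it could be served as JSON.
    print(json.dumps(NAME_MAP, indent=2, sort_keys=True))
```

Keeping the map as plain data rather than code is a deliberate choice in this sketch: it is what would let an index or any other host serve it without executing anything.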

Maybe a blockchain with per-project TUF signing keys, package checksums/signatures, and VCS GPG keyrings could also host package metadata and package name mappings someday.

>
>>> The same kind of approach may also be good for virtual environments, but
>>> it seems like there's less contention there. Conda is different enough
>>> from virtualenv that we get some friction, but I think and hope we can
>>> smooth that out over time.
>>>
>> `conda install pip` and `conda export -f environment.yml` seem to work
>> okay (for Python, R, nodejs, but not yet npm)?
>>
>> Solving dependencies in a container is still the correct way to avoid
>> cruft, IMHO.
>>
>
> I agree, but sadly I can't force that on users. We (Anaconda and other
> distribution managers) still have to support people who are happily
> shooting themselves in the foot with "sudo pip" even when sudo is in no way
> necessary for a normal conda installation. Until we (the Python community)
> solve the package problems we have, we'll lose mindshare to less useful
> (IMHO) tools that are easier to manage packages with.
>

A warning would be good. Is checking for `uid=0` sufficient?

IDK what sort of timeline would be needed to require a CLI flag to bypass an error message when running pip as root. Containers often run pip as root without specifying `USER nonroot` first in their Dockerfiles.

And then setting read-only or other-user permissions does require root and so is better handled by actual OS package managers like FPM, IMHO.

>
>>> Best,
>>> Michael
>>>
>>> On Wednesday, May 23, 2018, Victor Stinner <vstin...@redhat.com> wrote:
>>> > Hi,
>>> >
>>> > pip is currently not well integrated on Linux: it conflicts with the
>>> > system package manager like apt or rpm. When pip writes files into
>>> > /usr, it can replace files written by the system package manager and
>>> > so create different kinds of issues.
>>> > For example, if you check the system integrity, you will likely see
>>> > that some Python files have been modified.
>>> >
>>> > I would like to open a discussion to see how each Linux vendor handles
>>> > the issue, and see if a common solution can be designed.
>>> >
>>> > Debian uses /usr for apt-get install and /usr/local for distutils and
>>> > "sudo pip".
>>> >
>>> > Fedora decided to change pip to install files into /usr/local by
>>> > default, instead of /usr, so "sudo pip install" doesn't replace files
>>> > installed by dnf (Fedora package manager):
>>> > https://fedoraproject.org/wiki/Changes/Making_sudo_pip_safe
>>> >
>>> > It gives you 3 main places to install Python code: /usr (managed by
>>> > dnf), /usr/local (managed by sudo pip), $HOME/.local (managed by pip
>>> > --user).
>>> >
>>> > Would it make sense to make the Fedora/Debian change upstream? At
>>> > least, give an opt-in option for Linux vendors to use /usr/local?
>>> >
>>> > I propose to make the change upstream because there are still issues,
>>> > and I don't want to be alone to have to fix them :-) It should be
>>> > easier if we agree on a filesystem layout and an implementation, so
>>> > we can collaborate on issues!
>>> >
>>> >
>>> > Issues with the current Fedora implementation:
>>> >
>>> > (1) When Python is embedded in an application, there is an issue with
>>> > the current heuristic to decide if /usr/local should be added to
>>> > sys.path:
>>> >
>>> > https://bugzilla.redhat.com/show_bug.cgi?id=1532287
>>> >
>>> > (2) On Fedora, "sudo pip install -U" currently removes old code from
>>> > /usr and installs the new one in /usr/local. We should leave /usr
>>> > unchanged, since only dnf should touch /usr.
>>> >
>>> > https://bugzilla.redhat.com/show_bug.cgi?id=1550368#c24
>>> >
>>> > The implementation is made of a single patch on the Python site module:
>>> >
>>> > https://src.fedoraproject.org/rpms/python3/blob/master/f/00251-change-user-install-location.patch
>>> >
>>> > --
>>> >
>>> > There are two issues related to the "sudo pip" change, but they
>>> > already exist when pip is installed in $HOME/.local:
>>> >
>>> > (3) Priority issue between PATH and PYTHONPATH directories.
>>> >
>>> > When the user runs "pip", the pip binary may come from /usr,
>>> > /usr/local or $HOME/.local/bin, but the Python pip module ("import
>>> > pip") may come from a different path. Which binary and which module
>>> > should be used?
>>> >
>>> > Obviously, users can replace these two environment variables...
>>> >
>>> > (4) Related to (3). Running "pip" may run pip binary of one pip
>>> > version, but pick the "pip" Python module of another pip version.
>>> >
>>> > For example, pip9 binary from /usr/bin/pip, but pip10 module from
>>> > /usr/local.
>>> >
>>> >
>>> > Fedora works around issue (4) with a downstream patch on pip:
>>> >
>>> > https://src.fedoraproject.org/rpms/python-pip/blob/master/f/pip9-allow-pip10-import.patch
>>> >
>>> > --
>>> >
>>> > I don't know well how Linux distributions handle the issue with "sudo
>>> > pip". So don't hesitate to correct me if I'm wrong :-) My goal is
>>> > just to start a discussion about a common "upstream" solution.
>>> >
>>> > Victor
>>> > --
>>> > Distutils-SIG mailing list
>>> > distutils-sig@python.org
>>> > https://mail.python.org/mm3/mailman3/lists/distutils-sig.python.org/
>>> > Message archived at https://mail.python.org/mm3/archives/list/distutils-sig@python.org/message/OLGLHTSHLEPLHUTTVNU6L5QFTMNFIB6Z/
>>> >
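
For anyone trying to tell whether they are hitting issues (3) and (4) quoted above, a small diagnostic along these lines might help. It is only a sketch: `pip_locations` is a hypothetical helper name, and it reports the `pip` module visible to the interpreter running the snippet, which may not be the interpreter named in the pip script's shebang line.

```python
import importlib.util
import shutil
import sys


def pip_locations():
    """Return (script_path, module_path) for the pip visible to this process.

    Either value is None when the script or the module cannot be found.
    """
    # First "pip" executable found on PATH, as a shell would resolve it.
    script = shutil.which("pip")
    # Where "import pip" would load from for this interpreter's sys.path.
    spec = importlib.util.find_spec("pip")
    module = spec.origin if spec is not None else None
    return script, module


if __name__ == "__main__":
    script, module = pip_locations()
    print("python executable:     ", sys.executable)
    print("pip script on PATH:    ", script)
    print("pip module on sys.path:", module)
```

If the script lives under one prefix (say /usr/bin) and the module under another (say /usr/local), that is the mixed situation described in (4). Running `python -m pip` instead of `pip` sidesteps the PATH lookup, though not the sys.path ordering.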
--
Distutils-SIG mailing list
distutils-sig@python.org
https://mail.python.org/mm3/mailman3/lists/distutils-sig.python.org/
Message archived at https://mail.python.org/mm3/archives/list/distutils-sig@python.org/message/EVP6ST5CHRJDY3AIEHUNFR3VY5UT2NSN/