On Wednesday, May 23, 2018, Michael Sarahan <msara...@anaconda.com> wrote:

>
>
> On Wed, May 23, 2018 at 3:45 PM, Wes Turner <wes.tur...@gmail.com> wrote:
>
>>
>>
>> On Wednesday, May 23, 2018, Michael Sarahan <msara...@anaconda.com>
>> wrote:
>>
>>> Thanks for starting this discussion, Victor!  This topic is something
>>> we're very interested in at Anaconda.
>>>
>>> I'd like to generalize the problem statement to the question of "how can
>>> we make pip behave well when it is sharing package management with
>>> something else?  Similarly, how can we make the something else behave well
>>> with pip?"
>>>
>>> We all share the pain of trying to have two package managers effectively
>>> manage the same space.  The alternate-folder-for-pip approach is a good
>>> idea, but ultimately has issues that you pointed out.
>>>
>>> After several great conversations with many people at PyCon last week, I
>>> came to the conclusion that conda and pip probably won't ever interoperate
>>> very well.  In order for them to do so, conda must respect all of pip's
>>> constraints during its solving of dependencies, and likewise, pip must
>>> respect conda's constraints.  We are investigating the first of those
>>> options and having some promising initial results, but the inverse is not
>>> something that seems feasible.  It amounts to pip having a solver that is
>>> at least as good as the best supported package manager, and pip learning
>>> about *all* of any other package managers that it claims to be compatible
>>> with.  That doesn't seem like a viable project.
>>>
>> What's the status on this? AFAIU, depsolving setup.py packages is blocked
>> because it's necessary to execute setup.py to get the conditional
>> requirements?
>>
>
> Yes, executing setup.py is a blocker.  Wheels are improving this.  The
> main complicating factor is conditional or optional dependencies, as you
> say, and how to express or execute the branch logic required.  setup.py
> need not be executed all the time - only enough to gather the metadata to
> feed an index.  It's much more of a security concern than a time concern.
> A first rough approach that didn't handle optional or conditional
> requirements seems reasonable as a proof of concept.
>
>
>>
>> https://www.pypa.io/en/latest/roadmap/#pip-dependency-resolution
>>
>>>
>>> Ultimately, I believe a better approach is for the PyPA to define a
>>> minimal set of functionality and interfaces to PyPI that any package
>>> manager claiming to manage python packages must implement.  Pip can be a
>>> reference implementation of that specification.  Any distributor (Red Hat,
>>> Canonical, Homebrew, Anaconda) could then have their own implementations
>>> that use their solvers, but also can install software from PyPI at user's
>>> request, or as a fall-through when a native package is unavailable.
>>>
>>  Metadata compatibility and adapter registration could help solve for
>> this; though there's no money and not much demand. #PEP426JSONLD
>>
>
> With adapter registration, what would the adapters be?  I'm just not sure
> what pip's role should be.  If you define interfaces for solving and
> installing/uninstalling/upgrading the solved-for set of packages, maybe
> that's enough?
>

Plugins/adapters/interface implementations.
http://zopeinterface.readthedocs.io/en/latest/adapter.html

Perhaps ironically and conveniently, the well-regarded pluggy system for
plugins does not depend on setuptools entry_points:
https://pluggy.readthedocs.io/en/latest/

Some way to avoid adding plugin/adapter registration overhead in site.py
would be necessary.
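
As a rough sketch of what such an interface plus lazy registration could look
like (every name below is hypothetical; this is not an existing pip, conda, or
pluggy API): backends register a factory, and nothing is constructed until a
backend is actually requested, so no work would be needed in site.py.

```python
from typing import Callable, Dict, Iterable, List, Protocol


class PackageBackend(Protocol):
    """Hypothetical interface a vendor's package manager would implement."""

    def solve(self, requirements: Iterable[str]) -> List[str]: ...

    def install(self, pinned: Iterable[str]) -> None: ...


# Backend name -> zero-argument factory. Storing factories rather than
# instances keeps registration cheap: a backend is built on first use only.
_FACTORIES: Dict[str, Callable[[], PackageBackend]] = {}
_INSTANCES: Dict[str, PackageBackend] = {}


def register_backend(name: str, factory: Callable[[], PackageBackend]) -> None:
    _FACTORIES[name] = factory


def get_backend(name: str) -> PackageBackend:
    # Construct the backend lazily and cache it for later calls.
    if name not in _INSTANCES:
        _INSTANCES[name] = _FACTORIES[name]()
    return _INSTANCES[name]


class EchoBackend:
    """Toy backend: 'solves' by de-duplicating and sorting, records installs."""

    def __init__(self) -> None:
        self.installed: List[str] = []

    def solve(self, requirements: Iterable[str]) -> List[str]:
        return sorted(set(requirements))

    def install(self, pinned: Iterable[str]) -> None:
        self.installed.extend(pinned)


register_backend("echo", EchoBackend)
```

A real backend would delegate solve() and install() to the native package
manager; the toy one only shows the shape of the contract.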


>  #PEP426JSONLD looks great, but do you see that as a glue between PyPI and
> pip, pip and other tools, or other tools and PyPI?  I am impressed and
> encouraged by your research into the topic, and I'd like to know if there's
> a way that we can help with it if it would further our goals of having
> conda be able to either interop with pip happily or install directly from
> PyPI.
>

I piggybacked onto PEP426 because if we were going to substantially change
metadata, we might as well make it linked data; ideally with a
cross-language spec, though it would unfortunately take years to get
compliance from EVERY vendor.

Metadata interoperability wouldn't be strictly necessary to achieve what I
think you're describing; but otherwise there'd be such a disjoint
dependency graph that determining what we've installed here and where we
need to be would be frustratingly complex.


>
>
>>
>>> User interface could be unified by having "pip" on distributions be a
>>> wrapper around the native package manager, matching the exact minimal
>>> behavior of pip.
>>>
>> Would `sudo pip` then be the only way?
>>
>
> Heavens, no. I'm only concerned about the myriad blog posts out there that
> tell people to [sudo] pip install something.  That is the user interface
> that exists and is most common.  It is the universal command that might not
> be the right option, but by golly it's always an option.  There is great
> value in having only one way to tell people to do things.  I'm proposing
> that we make the underlying implementation of that user interface be
> vendor-specific.  Vendors can and should keep their own interfaces to
> package management, too, because those are broader in scope than pip.
>

I'm not sure whether it'd be easier to debug interleaved calls to various
package managers. Or to explain why `pip install` didn't work on my machine.

Though I have often wondered whether I need to run `conda skeleton pypi` (or
fork the conda-forge template), and then manually merge each version change.

There would need to be a map between PyPI package names, conda package
names, and (os, ver) package names, which I think I addressed in that
smattering of notes on the PEP426JSONLD issue.
That catalog-to-catalog mapping data would need to be hosted somewhere.
Warehouse can easily serve JSON if it's defined in the package.
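
A minimal sketch of what that mapping data might look like as JSON, assuming
a made-up project and made-up catalog keys (none of these names come from a
real index). The only non-hypothetical part is the PyPI name normalization,
which follows PEP 503.

```python
import json
import re

# Hypothetical catalog-to-catalog mapping, keyed by normalized PyPI name.
# The project and catalog identifiers are illustrative placeholders.
NAME_MAP_JSON = """
{
  "example-project": {
    "conda": "example-project",
    "debian:9": "python3-example-project",
    "fedora:28": "python3-example-project"
  }
}
"""


def normalize(pypi_name):
    """Normalize a PyPI project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", pypi_name).lower()


def native_name(pypi_name, catalog, name_map):
    """Return the catalog's name for a PyPI project, or None if unmapped."""
    return name_map.get(normalize(pypi_name), {}).get(catalog)


name_map = json.loads(NAME_MAP_JSON)
```

With that, native_name("Example_Project", "debian:9", name_map) would look up
the Debian name, and an unmapped project falls through to None so the caller
can decide whether to install from PyPI instead.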

Maybe a blockchain with per-project TUF signing keys, package
checksums/signatures, and VCS GPG keyrings could also host package metadata
and package name mappings someday.


>
>>
>>> The same kind of approach may also be good for virtual environments, but
>>> it seems like there's less contention there.  Conda is different enough
>>> from virtualenv that we get some friction, but I think and hope we can
>>> smooth that out over time.
>>>
>> `conda install pip` and `conda env export -f environment.yml` seem to work
>> okay (for Python, R, and nodejs, but not yet npm)?
>>
>> Solving dependencies in a container is still the correct way to avoid
>> cruft, IMHO.
>>
>
> I agree, but sadly I can't force that on users.  We (Anaconda and other
> distribution managers) still have to support people who are happily
> shooting themselves in the foot with "sudo pip" even when sudo is in no way
> necessary for a normal conda installation.  Until we (the Python community)
> solve the package problems we have, we'll lose mindshare to less useful
> (IMHO) tools that are easier to manage packages with.
>

A warning would be good. Is checking for `uid=0` sufficient?

IDK what sort of timeline would be needed to require a CLI flag to bypass
an error message when running pip as root.
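
Something like the following is what I have in mind, as a sketch only. The
`--allow-root` flag is hypothetical (it is not a pip option), and checking the
effective uid only tells you the process is root, not whether the target
directory belongs to the system package manager.

```python
import os


def running_as_root():
    # os.geteuid is POSIX-only; treat platforms without it as non-root.
    geteuid = getattr(os, "geteuid", None)
    return geteuid is not None and geteuid() == 0


def check_root(argv, is_root=None):
    """Return an error message if running as root without the opt-in flag.

    `is_root` can be passed explicitly to exercise both branches without
    actually being root; by default the effective uid is checked.
    """
    if is_root is None:
        is_root = running_as_root()
    # "--allow-root" is a hypothetical opt-in flag for this sketch.
    if is_root and "--allow-root" not in argv:
        return "refusing to run as root; pass --allow-root to override"
    return None
```

During a deprecation period the same check could print the message as a
warning and carry on, and only later turn it into a hard error.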

Containers often run pip as root because their Dockerfiles never specify
`USER nonroot` first. Setting read-only or other-user permissions afterwards
does require root, and so is better handled by actual OS packages (built
with a tool like FPM), IMHO.


>
>
>>
>>>
>>> Best,
>>>
>>> Michael
>>>
>>> On Wednesday, May 23, 2018, Victor Stinner <vstin...@redhat.com> wrote:
>>> > Hi,
>>> >
>>> > pip is currently not well integrated on Linux: it conflicts with the
>>> > system package manager like apt or rpm. When pip writes files into
>>> > /usr, it can replace files written by the system package manager and
>>> > so create different kinds of issues. For example, if you check the
>>> > system integrity, you will likely see that some Python files have been
>>> > modified.
>>> >
>>> > I would like to open a discussion to see how each Linux vendor handles
>>> > the issue, and see if a common solution can be designed.
>>> >
>>> > Debian uses /usr for apt-get install and /usr/local for distutils and
>>> > "sudo pip".
>>> >
>>> > Fedora decided to change pip to install files into /usr/local by
>>> > default, instead of /usr, so "sudo pip install" doesn't replace files
>>> > installed by dnf (Fedora package manager):
>>> > https://fedoraproject.org/wiki/Changes/Making_sudo_pip_safe
>>> >
>>> > It gives you 3 main places to install Python code: /usr (managed by
>>> > dnf), /usr/local (managed by sudo pip), $HOME/.local (managed by pip
>>> > --user).
>>> >
>>> > Would it make sense to make the Fedora/Debian change upstream? At
>>> > least, give an opt-in option for Linux vendors to use /usr/local?
>>> >
>>> > I propose to make the change upstream because there are still issues,
>>> > and I don't want to be alone in having to fix them :-) It should be
>>> > easier if we agree on a filesystem layout and an implementation, so
>>> > we can collaborate on issues!
>>> >
>>> >
>>> > Issues with the current Fedora implementation:
>>> >
>>> > (1) When Python is embedded in an application, there is an issue with
>>> > the current heuristic to decide if /usr/local should be added to
>>> > sys.path:
>>> >
>>> > https://bugzilla.redhat.com/show_bug.cgi?id=1532287
>>> >
>>> > (2) On Fedora, "sudo pip install -U" currently removes old code from
>>> > /usr and installs the new one in /usr/local. We should leave /usr
>>> > unchanged, since only dnf should touch /usr.
>>> >
>>> > https://bugzilla.redhat.com/show_bug.cgi?id=1550368#c24
>>> >
>>> > The implementation is made of a single patch on the Python site module:
>>> >
>>> > https://src.fedoraproject.org/rpms/python3/blob/master/f/00251-change-user-install-location.patch
>>> >
>>> > --
>>> >
>>> > There are two issues related to the "sudo pip" change, but they
>>> > already exist when pip is installed in $HOME/.local:
>>> >
>>> > (3) Priority issue between PATH and PYTHONPATH directories.
>>> >
>>> > When the user runs "pip", the pip binary may come from /usr,
>>> > /usr/local or $HOME/.local/bin, but the Python pip module ("import
>>> > pip") may come from a different path. Which binary and which module
>>> > should be used?
>>> >
>>> > Obviously, users can replace these two environment variables...
>>> >
>>> > (4) Related to (3). Running "pip" may run pip binary of one pip
>>> > version, but pick the "pip" Python module of another pip version.
>>> >
>>> > For example, pip9 binary from /usr/bin/pip, but pip10 module from
>>> > /usr/local.
>>> >
>>> >
>>> > Fedora works around issue (4) with a downstream patch on pip:
>>> >
>>> > https://src.fedoraproject.org/rpms/python-pip/blob/master/f/pip9-allow-pip10-import.patch
>>> >
>>> > --
>>> >
>>> > I don't know well how Linux distributions handle the issue with "sudo
>>> > pip". So don't hesitate to correct me if I'm wrong :-) My goal is
>>> > just to start a discussion about a common "upstream" solution.
>>> >
>>> > Victor
>>> > --
>>> > Distutils-SIG mailing list
>>> > distutils-sig@python.org
>>> > https://mail.python.org/mm3/mailman3/lists/distutils-sig.python.org/
>>> > Message archived at https://mail.python.org/mm3/archives/list/distutils-sig@python.org/message/OLGLHTSHLEPLHUTTVNU6L5QFTMNFIB6Z/
>>> >
>>>
>>
>
--
Distutils-SIG mailing list
distutils-sig@python.org
https://mail.python.org/mm3/mailman3/lists/distutils-sig.python.org/
Message archived at 
https://mail.python.org/mm3/archives/list/distutils-sig@python.org/message/EVP6ST5CHRJDY3AIEHUNFR3VY5UT2NSN/
