On 8 October 2015 at 11:18, Oscar Benjamin <oscar.j.benja...@gmail.com> wrote:
> On 7 October 2015 at 22:41, Paul Moore <p.f.mo...@gmail.com> wrote:
>> On 7 October 2015 at 22:28, Nathaniel Smith <n...@pobox.com> wrote:
>>> Maybe I have misunderstood: does it actually help pip at all to have
>>> static access to name and version, but not to anything else? I've been
>>> assuming not, but I don't think anyone's pointed to any examples yet
>>> of the problems that pip is encountering due to the lack of static
>>> metadata -- would this actually be enough to solve them?
>>
>> The principle I am working on is that *all* metadata in a source wheel
>> should be statically available - that's not just for pip, but for all
>> other consumers, including distro packagers. What's not set in stone
>> is precisely what (subsets of) metadata are appropriate for source
>> wheels as opposed to (binary) wheels.
>
> A concrete example would be whether or not the numpy source wheel
> depends on pyopenblas. Depending on how numpy is built the binary
> wheel may or may not depend on pyopenblas. It doesn't make any sense
> to say that the numpy source release depends on pyopenblas so what
> should be the dependencies of the source wheel?
Well, I said this previously, but I don't have any objections to the idea
that binary wheels have additional dependencies - so the source wheel
doesn't depend on pyopenblas but the binary does. But as I understand it,
this is currently theoretical - there isn't yet a pyopenblas to validate
these speculations against. I say this not because I think the approach
is invalid, but because I think there are probably a lot of untested
questions that need answering.

Let's expand the scenario a bit. The user (presumably) still just says
"python -m pip install numpy". What happens then?

1. Assume there's a binary wheel that's compatible with the user's
   platform.
1a. If there are multiple compatible binary wheels, pip chooses the
    "most compatible", so we're safe to assume there's only one. [1]
2. Looking at the dependencies, say it depends on pyopenblas. So pip
   needs to install pyopenblas.
2a. If there's a compatible wheel for pyopenblas, pip installs that too.
2b. If there's no compatible pyopenblas wheel, pip falls back to a
    source wheel, builds it, and uses that. If the build fails, the
    whole numpy install fails.
3. If there's no compatible numpy binary wheel, pip gets the source
   wheel and builds it. There's no user interaction possible here [2],
   so the build uses whatever defaults the numpy build process
   identifies as "most appropriate" for the user's platform. This may
   be simply a lowest common denominator, or it may do some form of
   introspection of the user's system to get the best possible build.
   Either way, a wheel is generated that's known to work on the user's
   system, so there should be no additional dependencies injected at
   this point, and pip will use that wheel directly.
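The flow in those steps can be sketched roughly as follows. This is a
simplification for discussion only - the index is a plain dict, the
compatibility test is a platform-string check, and every name here is
illustrative rather than pip's actual internals:

```python
# Hedged sketch of steps 1-3 above. "index" maps a project name to its
# available files; "installed" records what got installed and how.
def install(project, index, installed, platform):
    if project in installed:
        return
    entry = index[project]
    # Step 1/1a: find compatible binary wheels, keep the "most compatible".
    compatible = [w for w in entry["wheels"]
                  if w["platform"] in (platform, "any")]
    if compatible:
        wheel = max(compatible, key=lambda w: w["platform"] == platform)
        # Step 2: dependencies are read from the chosen wheel's metadata,
        # and each is resolved the same way (2a: wheel, 2b: source build).
        for dep in wheel["requires"]:
            install(dep, index, installed, platform)
        installed[project] = "binary wheel"
    else:
        # Step 3: no compatible binary wheel - build from source,
        # non-interactively, targeting this system (so no new deps appear).
        installed[project] = "built from source wheel"
```

Note that the source-wheel branch adds no further dependencies - that is
exactly the constraint discussed below.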
The only constraint here is that a binary numpy wheel built with the
default options on a given machine from a numpy source wheel cannot have
extra dependencies that aren't known to be already satisfied by the
user's system, because by the time pip generates a wheel from the source
wheel, it's finished doing dependency resolution, so any new
dependencies won't get checked. I don't see it as a problem for any
hypothetical new build system to conform to this constraint - by
default, a built wheel must work on the system it's built on. All it
means is that building binaries with additional dependencies must be
done manually, supplying options describing your intent.

[1] Dependencies are *not* considered as part of the compatibility
matching, so it's correct that this step happens before the dependency
checks. Maybe you're assuming that if there are two wheels, one
depending on pyopenblas and one not, then if the user doesn't have
pyopenblas installed, the wheel that doesn't depend on it will be used?
But that's not how pip works.

[2] When pip runs installs, it does so non-interactively. Whatever
command pip uses to build a wheel ("python setup.py bdist_wheel" at the
moment) must run without user interaction and produce a wheel that is
compatible with the user's environment.

So unless I'm mistaken about what you're saying, I don't see any issue
here. Unless you're saying that you're not willing to work under some of
the constraints I describe above - but in that case, you need pip's
compatibility matching, dependency resolution, or automated wheel build
processes to change. That's fine, but to move the discussion forwards,
we'd then need to understand (and agree with) whatever changes you need
in pip. At the moment, I'm not aware that anyone has asked for
substantive changes to pip's behaviour in these areas as part of this
proposal.
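To make the constraint concrete: after resolution has finished, any
requirement that first appears in the freshly built wheel's metadata is
simply never seen. A toy check (real metadata uses full PEP 508
specifiers; this compares bare names only, purely for illustration):

```python
# Requirements the built wheel introduces that resolution never saw -
# if this list is non-empty, the install silently ends up broken,
# because pip does no further dependency checking at this point.
def unchecked_new_requirements(built_wheel_requires, resolved_names):
    return [req for req in built_wheel_requires
            if req not in resolved_names]
```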
> One possibility which I think is what Nathaniel is getting at is that
> there is a source release and then that could be used to generate
> different possible source wheels each of which would correspond to a
> particular configuration of numpy. Each source wheel would correspond
> to one binary wheel and have all static metadata but there still needs
> to be a separate source release that is used to generate the different
> source wheels.

That's possible, but what would these multiple source wheels be called?
They couldn't all be called "numpy", because how would the user say
which one they wanted? Pip can't decide. They can't be called numpy and
distinguished by versions, as then how would you decide whether "numpy
with openblas" is "newer" or "older" than "numpy with MKL"? That's the
issue with Christoph Gohlke's current means of versioning his MKL
builds. So you're looking at multiple PyPI projects, one for each
"flavour" of numpy. Or you're looking at changes to how PyPI and pip
define a "project". Neither of those options sounds particularly
straightforward to me.

> The step that turns a source wheel into a binary wheel would be
> analogous to the ./configure step in a typical makefile project.
> ./configure is used to specify the options corresponding to all the
> different ways of compiling and installing the project. After running
> ./configure the command "make" is unparametrised and performs the
> actual compilation: this step is analogous to converting a source
> wheel to a binary wheel.

But the Python (PyPI/pip) model is different from the autoconf "typical
makefile project" model. There's no configure step. If you're proposing
that we add one, then that's a pretty major change in structure and
would have some fairly wide-ranging impacts (on PyPI and pip, and also
on 3rd party projects like bandersnatch and devpi). I don't think we're
even close to understanding how we'd manage such a change.
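The versioning problem can be made concrete: a "newest wins" resolver
needs a total order over versions, but build flavours have no meaningful
order, so any scheme that encodes the flavour in the version string
makes the choice arbitrary. The strings below are hypothetical
illustrations, not real numpy releases:

```python
# Two hypothetical "flavoured" releases of the same numpy version,
# with the build configuration smuggled into the version string.
flavours = ["1.10.0+mkl", "1.10.0+openblas"]

# A resolver that picks "the newest" must still choose one - here
# plain string comparison decides, which says nothing about which
# BLAS library the user actually wants or has.
newest = max(flavours)
```

Whichever ordering rule you pick, it resolves a question ("which BLAS?")
that isn't a version question at all - which is why this points towards
separate project names or a change to the project model.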
> I think this satisfies all of the requirements for static metadata and
> one-to-one correspondence of source wheels and binary wheels. If numpy
> followed this then I imagine that there would be a single source wheel
> on PyPI corresponding to the one configuration that would be used
> consistently there. However numpy still needs to separately release
> the code in a form that is also usable in all of the many other
> contexts that it is already used. IOW they will need to continue to
> issue source releases in more or less the same form as today. It makes
> sense for PyPI to host the source release archives on the project page
> even if pip will simply ignore them.

So you're talking about numpy only supporting one configuration via
PyPI, and expecting any other configurations to be made available only
via other channels? I guess you could do that, but I hope you won't. It
feels to me like giving up before we've properly tried to understand the
issues.

Paul
_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig