On Fri, Jan 26, 2018 at 8:37 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 27 January 2018 at 13:46, Nathaniel Smith <n...@pobox.com> wrote:
>>
>> The advantages are:
>>
>> - it's a simpler way to record the information you want here, without
>> adding more special cases to dist-info: most code doesn't even have to
>> know what 'extras' are, just what packages are
>>
>> - it opens the door to lots of more advanced features, like
>> 'foo[test]' being a package that actually contains foo's tests, or
>> build variants like 'numpy[mkl]' being numpy built against the MKL
>> library, or maybe making it possible to track which version of numpy's
>> ABI different packages use. (The latter two cases need some kind of
>> provides: support, which is impossible right now because we don't want
>> to allow random-other-package to say 'provides-dist: cryptography';
>> but it would be okay if 'numpy[mkl]' said 'provides-dist: numpy',
>> because we know 'numpy[mkl]' and 'numpy' are maintained by the same
>> people.)
>>
>> I know there's a lot of precedent for this kind of clever use of
>> metadata-only packages in Debian (e.g. search for "metapackages"), and
>> I guess the RPM world probably has similar tricks.
>
> While I agree with this idea in principle, I'll note that RPM makes it
> relatively straightforward to have a single SRPM emit multiple RPMs, so
> defining a metapackage is just a few extra lines in a spec file. (I'm
> not sure how Debian's metapackages work, but I believe they're
> similarly simple on the publisher's side.)
>
> We don't currently have a comparable mechanism to readily allow a
> single source project to expand to multiple package index entries that
> all share a common sdist, but include different subsets in their
> respective wheel files (defining one would definitely be possible, it's
> just a tricky migration problem to work out).
Yeah, the migration is indeed the tricky part. Here's one possible
approach.

First, figure out what exactly an "extra" should become in the new
system. I think it's: if package $PACKAGE version $VERSION defines an
extra $EXTRA, then that corresponds to a wheel named "$PACKAGE[$EXTRA]"
(the brackets become part of the package name), with version $VERSION,
a Requires-Dist: $PACKAGE == $VERSION, and whatever requirements were
originally part of the extra.

Now, if we didn't have to worry about migration, we'd extend
setuptools/bdist_wheel so that when they see the current syntax for
defining an extra, they generate extra wheels following the formula
above. (So 'setup.py bdist_wheel' would generate N+1 wheels for a
package with N extras.) We'd teach PyPI that packages named like
"$PACKAGE[$EXTRA]" should be collected together with packages named
"$PACKAGE" (e.g. the same access controls apply to both, and you'd
probably want to display them together in the UI when their versions
match). We'd teach pip that square brackets are legal in package names.
And that'd be about it.

Of course, we do have to worry about migration, and in the first
instance what we care about is making pip's database of installed
packages properly record these new wheels. So my proposal is:

- Requirements like 'requests[security,socks]' need to be expanded to
'requests[security], requests[socks]'. More specifically, when pip
processes a requirement like '$PACKAGE[$EXTRA1,$EXTRA2,...] $OP1
$VERSION1, $OP2 $VERSION2, ...', it expands it into multiple packages
and applies the constraints to each of them: ['$PACKAGE[$EXTRA1] $OP1
$VERSION1, $OP2 $VERSION2, ...', '$PACKAGE[$EXTRA2] $OP1 $VERSION1,
$OP2 $VERSION2, ...', ...].

- When pip needs to find a wheel like 'requests[security]', it first
checks whether this exact wheel (with the brackets) is available on
PyPI (or whatever package sources it has available). If so, it uses
that.
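As a rough illustration, the expansion rule and the extra-wheel formula
from earlier could look something like the sketch below (hypothetical
helper names, not pip's actual code, and real wheels would of course
need full METADATA files rather than a dict):

```python
import re

def expand_requirement(req):
    # Expand '$PACKAGE[$EXTRA1,$EXTRA2,...] <constraints>' into one
    # requirement per extra, each carrying the same version constraints.
    # (Hypothetical helper sketching the proposed rule.)
    m = re.match(r"^\s*([A-Za-z0-9._-]+)\[([^\]]+)\]\s*(.*)$", req)
    if m is None:
        return [req.strip()]  # no extras: nothing to expand
    name, extras, constraints = m.groups()
    suffix = " " + constraints.strip() if constraints.strip() else ""
    return ["%s[%s]%s" % (name, extra.strip(), suffix)
            for extra in extras.split(",")]

def extra_wheel_metadata(package, version, extra, extra_requires):
    # Metadata for a synthesized '$PACKAGE[$EXTRA]' wheel: same version,
    # a pin back onto the base package, plus the extra's own
    # requirements.  (Sketch only.)
    return {
        "Name": "%s[%s]" % (package, extra),
        "Version": version,
        "Requires-Dist": ["%s == %s" % (package, version)]
                         + list(extra_requires),
    }
```

So e.g. expand_requirement("requests[security,socks] >= 2.18") gives
['requests[security] >= 2.18', 'requests[socks] >= 2.18'], and each of
those names can then be resolved independently.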
If not, then it falls back to looking for a plain 'requests' wheel, and
if it finds one, and that wheel has 'extra' metadata, then it *uses
that metadata to generate a wheel on the spot*, and then carries on as
if it had found it on PyPI.

- Special case: when hash-checking mode is enabled and pip ends up
doing this fallback, pip always checks the hash against the wheel it
found on PyPI – so 'requests[security] --hash=...' checks the hash of
requests.whl, not the auto-generated requests[security].whl.

(There is some question to discuss here about how sdists should be
handled: in many cases, possibly all of them, it doesn't really make
sense to have separate sdists for different square-bracket packages.
'requests[security]' will probably always be generated from the
requests source tree, and for build variants like 'numpy[mkl]' you
definitely want to build that from numpy.tar.gz, with some special flag
telling it to use the "mkl" configuration. So maybe when pip fails to
find a wheel it should always go straight to the unadorned sdist like
requests.tar.gz, instead of checking for requests[security].tar.gz. But
this isn't going to make-or-break a v1 implementation.)

If we implement just these two things, then I think that's enough for
pip to start immediately tracking all the proper metadata for existing
extras packages, and it also provides a smooth onramp to the eventual
features. Once this is working, we could enable uploading
square-bracket packages to PyPI, and pip would start automatically
picking them up where present. Then we could flip the switch for
setuptools to start generating such packages. We'd probably also want
to tweak PEP 517 so that the build backend is informed of exactly which
package pip is looking for, to handle the case where numpy.tar.gz is
expected to produce numpy[mkl].whl. And after that we could enable real
Provides-Dist: metadata...

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig