Hi all, I'm finding it impossible to keep track of that other thread, and I guess I'm probably not the only one, so I figured I'd try splitting a few of the more specific discussions out :-).
One thing that seems to be a key issue, but where I remain very confused, is the question of what pip actually needs from an sdist. (Not PyPI, just pip or other package install tools.)

Right now, IIUC, there are three times that pip install touches sdist-related metadata:

1) it uses the name+version that are embedded in the sdist filename to select an sdist from an index like PyPI

2) after unpacking this sdist it then calls 'setup.py egg_info' to get the full metadata for the wheel (or wheel equivalent) that this sdist will eventually produce. Specifically, it extracts the setup_requires and install_requires fields and uses them to go find other wheels/sdists that also need to be installed

3) eventually it actually builds the package, and this produces a wheel (or wheel equivalent) that has its own metadata (which often matches the metadata from egg_info in step (2), but not always)

Is that a correct description of current behavior? Is there anything that pip ever looks at besides name, version, dependencies?

Paul says that this is broken, and that pip gets lots of bug reports that "can be traced back to needing to run setup.py egg-info to get metadata" [1]. Since AFAICT the only metadata that pip actually touches is name, version, and dependencies, and it already knows the name and version before it runs egg_info, I assume that what this means is that it's crucial for pip to have static access to dependency information? OTOH in another email Paul says that name and version are the minimum he wants [2], so maybe I'm reading too much into this :-).

From the discussion so far, it sounds like the particularly crucial question is whether pip needs to statically know dependencies before building a wheel. Trying to reason through from first principles, I can't see any reason why it would.
It would be somewhat convenient if sdists did list their binary dependencies: if that were the case, then pip could take a strictly phased approach:

1) solve the complete dependency graph to find a set of packages to install / remove

2) for all the packages-to-be-installed that are sdists, turn them into wheels

3) install all the wheels

OTOH if sdists have only name and version statically, but not dependency information, then you need to do something like:

1) create a fake dependency graph that contains accurate information for all known wheels, and for each sdist add a fake node that has the right name and version number but pretends not to have any dependencies.

2) solve this graph to find the set of packages to install / remove

3) if any of the packages-to-be-installed are sdists, then fetch them, run egg_info or build them or whatever to get their real dependencies, add these to the graph, and go to step 1

4) else, we have wheels for everything; install them.

(This works because dependencies are constraints -- adding dependencies can only reduce the space of possible solutions, never enlarge it. Also, because by the time we decide to fetch and build any sdists, we already know that we're very likely to want to install them, so the performance penalty for building packages we turn out not to want is not high. And, crucially, we know that there exists some set of dependency metadata which would convince us to install these sdists, and dependency metadata is under the package author's control, so we already have established a trust route to the author of this package -- if they don't declare any dependencies, then we'll be installing and running arbitrary code of theirs, so running arbitrary code to check their dependencies doesn't require any additional trust.)

But there's often a large difference between what we work out from first principles and how things actually work :-). Is there anything I'm missing in the analysis above?
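For concreteness, here is a rough sketch of the second (iterative) loop. It is a toy model, not pip's actual code: there is one candidate per package name and no version constraints, so "solving" is just a transitive closure. The names solve, resolve, and probe_sdist are made up for illustration; probe_sdist stands in for "fetch the sdist and run egg_info or build it".

```python
from typing import Callable, Dict, Iterable, List, Set


def solve(roots: Iterable[str], deps: Dict[str, List[str]]) -> Set[str]:
    """Toy solver: return the transitive closure of roots over deps.

    A real resolver would also handle version constraints and conflicts.
    """
    selected: Set[str] = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in selected:
            continue
        if name not in deps:
            raise LookupError(f"no candidate for {name!r}")
        selected.add(name)
        stack.extend(deps[name])
    return selected


def resolve(
    roots: Iterable[str],
    wheel_deps: Dict[str, List[str]],
    sdist_names: Set[str],
    probe_sdist: Callable[[str], List[str]],  # hypothetical: egg_info or a build
) -> Set[str]:
    roots = list(roots)
    # Step 1: accurate nodes for wheels, fake dependency-free nodes for sdists.
    graph = {name: list(deps) for name, deps in wheel_deps.items()}
    for name in sdist_names:
        graph.setdefault(name, [])
    probed: Set[str] = set()
    while True:
        # Step 2: solve the graph as currently known.
        selected = solve(roots, graph)
        # Step 3: any selected sdist we have not inspected yet?
        pending = sorted(n for n in selected if n in sdist_names and n not in probed)
        if not pending:
            # Step 4: everything selected has real metadata; ready to install.
            return selected
        for name in pending:
            graph[name] = probe_sdist(name)  # replace the fake node with real deps
            probed.add(name)


# Made-up example data: "mypkg" and "scipy" exist only as sdists.
wheels = {"numpy": [], "six": [], "requests": ["six"]}
real_sdist_deps = {"mypkg": ["scipy", "requests"], "scipy": ["numpy"]}
print(sorted(resolve(["mypkg"], wheels, set(real_sdist_deps), real_sdist_deps.__getitem__)))
```

Note that an sdist is only probed once the solver has already selected it, which is the "we're very likely to want to install them" property described above.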
Do the relevant pip maintainers even read this mailing list? :-)

-n

[1] https://mail.python.org/pipermail/distutils-sig/2015-October/026960.html
[2] https://mail.python.org/pipermail/distutils-sig/2015-October/026942.html

--
Nathaniel J. Smith -- http://vorpus.org
_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig