Re: [Catalog-sig] V2 pre-PEP: transitioning to release file hosting on PYPI
On Tue, Mar 12, 2013 at 12:59 PM, M.-A. Lemburg m...@egenix.com wrote:
> I think we should establish a versioned API like that for PyPI to make progress easier. All major web APIs use versioning for this reason.

Why set up versioning for something we want to phase out? There will never be a simple-v3, so this is really overengineering the proposed change.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig
[Catalog-sig] A modest proposal for securing PyPI with TUF
Hello everyone,

I am pleased to announce our demonstration of PyPI and pip with TUF.

Firstly, we solicit your thoughts and comments on our design document for integrating PyPI with TUF:
https://docs.google.com/document/d/1sHMhgrGXNCvBZdmjVJzuoN5uMaUAUDWBmn3jo7vxjjw/edit?usp=sharing

Secondly, you may wish to test our demo of PyPI and pip with TUF:
https://github.com/dachshund/pip/wiki/pip-over-TUF

Thirdly, this is how little it takes to secure pip with TUF:
https://github.com/dachshund/pip/compare/develop...tuf

Finally, you may be interested to learn about how one might manually secure a PyPI package index with TUF:
https://github.com/dachshund/pip/wiki/PyPI-over-TUF

We are excited to be able to show this to you now, and in person at our lightning talk at PyCon this Friday. We think that there is great potential for the PyPI and TUF communities to work together to secure Python package management.

This is just the beginning, and there is some work left to do, but we are confident that we have demonstrated to you that PyPI could be secured with TUF in the very near future. We would be happy to discuss with you how we compare with other proposals.

We look forward to your questions and feedback!

Thanks,
Trishank
Re: [Catalog-sig] V2 pre-PEP: transitioning to release file hosting on PYPI
On 13.03.2013 07:28, Nick Coghlan wrote:
> On Tue, Mar 12, 2013 at 12:59 PM, M.-A. Lemburg m...@egenix.com wrote:
>> I think we should establish a versioned API like that for PyPI to make progress easier. All major web APIs use versioning for this reason.
>
> Why set up versioning for something we want to phase out? There will never be a simple-v3, so this is really overengineering the proposed change.

Who says that we want to phase out the /simple/ index?

FWIW, I don't think that two or three small changes to the PyPI server (see my email to Holger) warrant calling this over-engineering. This is about moving forward in a backwards-compatible and future-proof way.

--
Marc-Andre Lemburg
eGenix.com
Re: [Catalog-sig] A modest proposal for securing PyPI with TUF
On Tue, Mar 12, 2013 at 11:41 PM, Trishank Karthik Kuppusamy t...@students.poly.edu wrote:
> Hello everyone, I am pleased to announce our demonstration of PyPI and pip with TUF. Firstly, we solicit your thoughts and comments on our design document for integrating PyPI with TUF: https://docs.google.com/document/d/1sHMhgrGXNCvBZdmjVJzuoN5uMaUAUDWBmn3jo7vxjjw/edit?usp=sharing

Thanks for putting this together! Just a few notes regarding key management:

- the PSF board generally stays out of the technical details of running the python.org infrastructure, so it's likely that any root keys would be handled by the PSF infrastructure committee. A (2, 4) or (3, 5) trust configuration would likely be manageable at this level.
- at the target delegation level, PyPI supports the registration of new projects through the web service (see http://docs.python.org/2/distutils/packageindex.html). If my understanding of target delegation is correct, this means the simple and packages/source/letter delegations will need to be (1, 1) and online.
- higher levels of the target delegation hierarchy could conceivably be kept offline, but there seems little value in doing so if they're trusting an online (1, 1) key
- many PyPI packages are maintained by single developers, so (1, 1) or (1, n) is likely to be the only generally feasible level of signing at the project level.

With the current focus being on getting an improvement from the status quo that we can successfully deploy in a reasonable period of time, the target delegation side of things probably needs to be substantially simpler in the initial iteration. Yes, it leaves us open to certain vulnerabilities we would like to remove in the long run, but we need to be very cautious in the additional demands we place on the users uploading to PyPI. It may even mean the initial iteration allows projects to rely on a PyPI-provided signing key for their TUF metadata, using the existing upload mechanisms to add the files to PyPI.

Regards,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Catalog-sig] A modest proposal for securing PyPI with TUF
Hello Nick,

On 3/13/13 4:09 AM, Nick Coghlan wrote:
> - the PSF board generally stays out of the technical details of running the python.org infrastructure, so it's likely that any root keys would be handled by the PSF infrastructure committee. A (2, 4) or (3, 5) trust configuration would likely be manageable at this level.

Understood. We think a higher (t, n) [where t out of n signatures are needed to trust the metadata for a role] is better for the root role simply because its crucial metadata (the authorized keys for top-level roles) should change very rarely.

> - at the target delegation level, PyPI supports the registration of new projects through the web service (see http://docs.python.org/2/distutils/packageindex.html). If my understanding of target delegation is correct, this means the simple and packages/source/letter delegations will need to be (1, 1) and online.
> - higher levels of the target delegation hierarchy could conceivably be kept offline, but there seems little value in doing so if they're trusting an online (1, 1) key

Fortunately, the targets/simple and targets/packages/(version)/(letter)/ roles should not require (1, 1) online keys, as their metadata (simply target delegations and no actual target files) should also fluctuate fairly rarely. I should make this clearer in our design document.

> - many PyPI packages are maintained by single developers, so (1, 1) or (1, n) is likely to be the only generally feasible level of signing at the project level.

Yes, the package developers themselves could choose any (t, n) they like. In our design, we propose that PyPI could eventually delegate to stable packages which need little change (and use more security with more offline keys) and to unstable packages which need frequent change (and use less security with more online keys).

> With the current focus being on getting an improvement from the status quo that we can successfully deploy in a reasonable period of time, the target delegation side of things probably needs to be substantially simpler in the initial iteration. Yes, it leaves us open to certain vulnerabilities we would like to remove in the long run, but we need to be very cautious in the additional demands we place on the users uploading to PyPI. It may even mean the initial iteration allows projects to rely on a PyPI-provided signing key for their TUF metadata, using the existing upload mechanisms to add the files to PyPI.

I agree that there is a delicate problem of balancing security with usability here, especially in the beginning. You raised a very good issue there: on first migration, how would PyPI accommodate packages which have not had their target files delegated to their developers? We imagine that in this case, PyPI could assume initial responsibility for these packages, and later PyPI would delegate those packages to their respective developers.

Thanks for your input,
Trishank
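
[Editorial illustration] The (t, n) notation used above (t valid signatures out of n authorized keys) can be sketched roughly as follows. This is only a minimal sketch under loud assumptions: the names `sign` and `meets_threshold` are hypothetical, and HMAC-SHA256 is used as a stand-in for real public-key signatures purely so the example runs with the standard library. It is not TUF's actual API or signature scheme.

```python
import hashlib
import hmac


def sign(key: bytes, metadata: bytes) -> str:
    """Stand-in 'signature': an HMAC-SHA256 tag (NOT a real public-key signature)."""
    return hmac.new(key, metadata, hashlib.sha256).hexdigest()


def meets_threshold(metadata: bytes, signatures: dict, trusted_keys: dict, t: int) -> bool:
    """Return True if at least t of the n trusted keys produced a valid signature.

    signatures:   maps key id -> signature (hex string) attached to the metadata
    trusted_keys: maps key id -> key bytes; n is len(trusted_keys)
    """
    valid = 0
    for key_id, signature in signatures.items():
        key = trusted_keys.get(key_id)
        if key is None:
            continue  # signature from a key that is not authorized for this role
        if hmac.compare_digest(sign(key, metadata), signature):
            valid += 1  # dict keys are unique, so each key counts at most once
    return valid >= t
```

With four trusted keys and two valid signatures, `meets_threshold(..., t=2)` accepts the metadata and `t=3` rejects it, which is the (2, 4) versus (3, 4) distinction discussed above.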
[Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files
Hi all,

after some more discussions and hours spent by Carl Meyer (who is now co-authoring the PEP) and me, here is a new V3 pre-submit draft. It is now more ambitious than the previous draft, as should be obvious from the modified abstract (and Carl Meyer's and Philip's earlier interactions on this list). There are also more details of how the current link-scraping works, among other improvements and incorporations of feedback from discussions here.

We intend to submit this draft tonight to the PEP editors. Feedback now and later remains welcome. I am sure there are issues to be sorted and clarified, among them the versioning-API suggestion by Marc-Andre.

Thanks for everybody's support and feedback so far,
holger

PEP: XXX
Title: Transitioning to release-file hosting on PyPI
Version: $Revision$
Last-Modified: $Date$
Author: Holger Krekel hol...@merlinux.eu, Carl Meyer c...@oddbird.net
Discussions-To: catalog-sig@python.org
Status: Draft (PRE-submit V3)
Type: Process
Content-Type: text/x-rst
Created: 10-Mar-2013
Post-History:

Abstract
========

This PEP proposes a backward-compatible two-phase transition process to speed up, simplify and robustify installing from the pypi.python.org (PyPI) package index. To ease the transition and minimize client-side friction, **no changes to distutils or existing installation tools are required in order to benefit from the transition phases, which will result in faster, more reliable installs for most existing packages**.

The first transition phase implements easy and explicit means for a package maintainer to control which release file links are served to present-day installation tools. The first phase also includes the implementation of analysis tools for present-day packages, to support communication with package maintainers and the automated setting of default modes for controlling release file links.

The second transition phase will result in the current PyPI index serving only PyPI-hosted files by default. Externally hosted files will still be automatically discoverable through a second index. Present-day installation tools will be able to continue working by specifying this second index. New versions of installation tools shall default to only install packages from PyPI unless the user explicitly wishes to include non-PyPI sites.

Rationale
=========

.. _history:

History and motivations for external hosting
--------------------------------------------

When PyPI went online, it offered release registration but had no facility to host release files itself. When hosting was added, no automated downloading tool existed yet. When Philip Eby implemented automated downloading (through setuptools), he made the choice to allow people to use download hosts of their choice.

The finding of externally-hosted packages was implemented as follows:

#. The PyPI ``simple/`` index for a package contains all links found anywhere in that package's metadata for any release. Links in the Download-URL and Home-page metadata fields are given ``rel=download`` and ``rel=homepage`` attributes, respectively.

#. Any of these links whose target is a file whose name appears to be in the form of an installable source or binary distribution, with basename in the form packagename-version.ARCHIVEEXT, is considered a potential installation candidate.

#. Similarly, any links suffixed with an #egg=packagename-version fragment are considered an installation candidate.

#. Additionally, the ``rel=homepage`` and ``rel=download`` links are followed and, if HTML, are themselves scraped for release-file links in the above formats.

Today, most packages released on PyPI host their release files on PyPI, but a small percentage (XXX need updated data) rely on external hosting.

There are many reasons [2]_ why people have chosen external hosting. To cite just a few:

- release processes and scripts have been developed already and upload to external sites

- it takes too long to upload large files from some places in the world

- export restrictions e.g. for crypto-related software

- company policies which require offering open source packages through own sites

- problems with integrating uploading to PyPI into one's release process (because of release policies)

- desiring download statistics different from those maintained by PyPI

- perceived bad reliability of PyPI

- not aware that PyPI offers file-hosting

Irrespective of the present-day validity of these reasons, there clearly is a history of why people choose to host files externally, and for some time it was even the only way you could do things.

Problem
-------

**Today, python package installers (pip, easy_install, buildout, and others) often need to query many non-PyPI URLs even if there are no externally hosted files**. Apart from querying pypi.python.org's simple index pages, also all homepages and download pages ever specified with any release of a package are crawled by an
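
[Editorial illustration] The link-matching rules listed in the draft's History section (a basename of the form packagename-version.ARCHIVEEXT, or an #egg=packagename-version fragment) might be sketched as below. This is a minimal sketch, not the code setuptools actually uses: the function name `candidate_version` and the `ARCHIVE_EXTS` tuple are assumptions made for illustration, and the extension list is deliberately incomplete.

```python
import posixpath
import re
from urllib.parse import urlparse

# Assumed, non-exhaustive set of archive extensions for this sketch.
ARCHIVE_EXTS = (".tar.gz", ".tar.bz2", ".tgz", ".zip")


def candidate_version(url, package):
    """Return the version string if url looks like a release file for package, else None."""
    parsed = urlparse(url)

    # Rule: a link suffixed with an "#egg=packagename-version" fragment.
    match = re.fullmatch(
        r"egg=" + re.escape(package) + r"-(.+)", parsed.fragment, flags=re.IGNORECASE
    )
    if match:
        return match.group(1)

    # Rule: a basename of the form packagename-version.ARCHIVEEXT.
    basename = posixpath.basename(parsed.path)
    prefix = package.lower() + "-"
    for ext in ARCHIVE_EXTS:
        if basename.lower().endswith(ext):
            stem = basename[: -len(ext)]
            if stem.lower().startswith(prefix) and len(stem) > len(prefix):
                return stem[len(prefix):]
    return None
```

Every link that yields a non-None result here would count as an installation candidate, which is why any homepage or download page that happens to contain such a link ends up influencing what gets installed.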
Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files
On Wed, Mar 13, 2013 at 7:21 AM, holger krekel hol...@merlinux.eu wrote:
> Hi all, after some more discussions and hours spent by Carl Meyer (who is now co-authoring the PEP) and me, here is a new V3 pre-submit draft. It is now more ambitious than the previous draft, as should be obvious from the modified abstract (and Carl Meyer's and Philip's earlier interactions on this list). There are also more details of how the current link-scraping works, among other improvements and incorporations of feedback from discussions here. We intend to submit this draft tonight to the PEP editors. Feedback now and later remains welcome. I am sure there are issues to be sorted and clarified, among them the versioning-API suggestion by Marc-Andre. Thanks for everybody's support and feedback so far, holger

Looks good to me! Setuptools' two releases will probably look like this:

1. Default to externals index, warn when fetching URLs that are not on the same host as the index
2. Default to externals index, reject URLs that are not on the same host as the index unless --allow-hosts is configured (IOW, default allow-hosts to equal index-url host)

That way, external URLs can still be discovered by the user, but the default configuration is still secure.
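
[Editorial illustration] The second step above (reject URLs whose host differs from the index host unless allow-hosts is configured) could look roughly like this. It is a sketch under assumptions: `host_allowed` is a hypothetical helper, not setuptools' real implementation, and treating allow-hosts entries as glob patterns is an assumption of this sketch.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse


def host_allowed(url, index_url, allow_hosts=None):
    """Return True if url's host is permitted.

    When allow_hosts is None, it defaults to the host of index_url, i.e.
    only files served from the index host itself are accepted.
    """
    host = urlparse(url).hostname
    if not host:
        return False  # no host to check, so refuse rather than guess
    if allow_hosts is None:
        allow_hosts = [urlparse(index_url).hostname or ""]
    return any(fnmatch(host, pattern) for pattern in allow_hosts)
```

For the first of the two releases, the same check would only trigger a warning instead of a rejection.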
Re: [Catalog-sig] pre-PEP: transition to release-file hosting at pypi site
On 03/12/2013 03:57 PM, holger krekel wrote:
> Nobody should be led to think that PYPI is a trusted or reviewed source of software even if we got rid of external hosting completely.

Amen. I still boggle at the amount of "sky is falling" stuff here over MITM / external links / whatever, given the potential damage from explicitly malicious uploads (trojans, viruses, whatever). Package signing might help here, but only for consumers who are willing to think hard enough about the problem to manage a web of trust (frankly, a vanishingly small minority).

And then there are these problems:

- Backward-incompatible releases (even those which make appropriate signals in their version numbers).
- Removal of distributions / releases / projects.
- Re-upload of new distributions which silently replace previous distributions *of the same release* (Yes, Virginia, there are people out there who do this).

which are deal-killers for the folks who want always-on, reliable, repeatable, automatic installation from PyPI (instead of creating their own indexes). Adding HTTPS or removing external links does nothing to mitigate those issues.

Tres.
--
Tres Seaver +1 540-429-0999 tsea...@palladion.com
Palladion Software "Excellence by Design" http://palladion.com
Re: [Catalog-sig] pre-PEP: transition to release-file hosting at pypi site
On Mar 13, 2013, at 12:54 PM, Tres Seaver tsea...@palladion.com wrote:
> On 03/12/2013 03:57 PM, holger krekel wrote:
>> Nobody should be led to think that PYPI is a trusted or reviewed source of software even if we got rid of external hosting completely.
>
> Amen. I still boggle at the amount of "sky is falling" stuff here over MITM / external links / whatever, given the potential damage from explicitly malicious uploads (trojans, viruses, whatever). Package signing might help here, but only for consumers who are willing to think hard enough about the problem to manage a web of trust (frankly, a vanishingly small minority).

Really now? Let's see: I can easily protect against malicious uploads by only installing from trusted authors. I cannot easily prevent a MITM or a compromised external host if the tools don't protect me against it. Without the tooling and infrastructure moving to close this gap, the only way to do it is to not use that tooling or infrastructure at all. Namely, even if the author of the package is myself, I cannot be secure installing it using the current toolchain and infrastructure unless I bend over backwards to make sure that no installable link appears anywhere in my long description, and I don't have a homepage, and I don't have a download url.

> And then there are these problems:
>
> - Backward-incompatible releases (even those which make appropriate signals in their version numbers).
> - Removal of distributions / releases / projects.
> - Re-upload of new distributions which silently replace previous distributions *of the same release* (Yes, Virginia, there are people out there who do this).
>
> which are deal-killers for the folks who want always-on, reliable, repeatable, automatic installation from PyPI (instead of creating their own indexes). Adding HTTPS or removing external links does nothing to mitigate those issues.

Yes, there are other problems, so let's just throw our hands in the air and say fuck it instead of iteratively working to secure the system.

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files
On Mar 13, 2013, at 10:26 AM, PJ Eby p...@telecommunity.com wrote:
> Looks good to me! Setuptools' two releases will probably look like this:
>
> 1. Default to externals index, warn when fetching URLs that are not on the same host as the index
> 2. Default to externals index, reject URLs that are not on the same host as the index unless --allow-hosts is configured (IOW, default allow-hosts to equal index-url host)
>
> That way, external URLs can still be discovered by the user, but the default configuration is still secure.

For the record, I support the PEP and these 2 steps sound OK to me. My only suggestion is an additional rel attribute for indexes to indicate "this is an index-hosted file", in case the index domain and the package host domain differ (as is the case with Crate).

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Catalog-sig] pre-PEP: transition to release-file hosting at pypi site
On 03/13/2013 01:06 PM, Donald Stufft wrote:
> Really now? Let's see: I can easily protect against malicious uploads by only installing from trusted authors

How do you know who to trust? What if an author you trust adds a dependency on a package by an author you have no knowledge of, or one you actively distrust? What if an author you trust commits one of the other changes I outlined (removes a release / distribution, makes backward-incompatible changes, re-uploads a changed distribution over an existing one)?

The only way to implement "only install from trusted authors" is to run your own index, and explicitly review / curate the package set maintained there. In that scenario, you run a script from time to time which looks for new versions of your packages on PyPI and puts them into a queue for review.

Bob, a casual reviewer, might install the new version from PyPI into a fresh virtualenv and test it there before pushing it into the curated index.

Carol, more paranoid^Wsecurity-minded, downloads the package, verifies its signature, unpacks the tarball, diffs it against the curated version, compares that diff against the changelog, looks at new / changed dependencies, and installs it into a hardened sandbox for testing. Only after that kind of review does she push the newly-reviewed distribution into the curated index.

Adding an entirely new package to the curated index is a similar process, but requires more effort from either Bob or Carol.

Tres.
--
Tres Seaver +1 540-429-0999 tsea...@palladion.com
Palladion Software "Excellence by Design" http://palladion.com
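
[Editorial illustration] The periodic "look for new versions and queue them for review" script described above could have a core along these lines. This is only a sketch: `pending_review` is a hypothetical helper, the two dictionaries are assumed inputs, and how the upstream version list is actually fetched from PyPI is deliberately left out.

```python
def pending_review(curated, upstream):
    """Return sorted (package, version) pairs that exist upstream but are not yet curated.

    curated:  maps package name -> set of versions already approved in the private index
    upstream: maps package name -> iterable of versions currently listed on PyPI
    Packages absent from the curated index are ignored here, since adding a
    brand-new package is described above as a separate, heavier process.
    """
    queue = []
    for package, approved in curated.items():
        for version in upstream.get(package, ()):
            if version not in approved:
                queue.append((package, version))
    return sorted(queue)
```

Each queued pair would then go through whichever review Bob or Carol applies before the distribution is pushed into the curated index.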
Re: [Catalog-sig] pre-PEP: transition to release-file hosting at pypi site
On Mar 13, 2013, at 1:21 PM, Tres Seaver tsea...@palladion.com wrote:
> On 03/13/2013 01:06 PM, Donald Stufft wrote:
>> Really now? Let's see: I can easily protect against malicious uploads by only installing from trusted authors
>
> How do you know who to trust? What if an author you trust adds a dependency on a package by an author you have no knowledge of, or one you actively distrust? What if an author you trust commits one of the other changes I outlined (removes a release / distribution, makes backward-incompatible changes, re-uploads a changed distribution over an existing one)?
>
> The only way to implement "only install from trusted authors" is to run your own index, and explicitly review / curate the package set maintained there. In that scenario, you run a script from time to time which looks for new versions of your packages on PyPI and puts them into a queue for review.
>
> Bob, a casual reviewer, might install the new version from PyPI into a fresh virtualenv and test it there before pushing it into the curated index.
>
> Carol, more paranoid^Wsecurity-minded, downloads the package, verifies its signature, unpacks the tarball, diffs it against the curated version, compares that diff against the changelog, looks at new / changed dependencies, and installs it into a hardened sandbox for testing. Only after that kind of review does she push the newly-reviewed distribution into the curated index.
>
> Adding an entirely new package to the curated index is a similar process, but requires more effort from either Bob or Carol.

Threat models are a thing. The way it *should* work in PyPI is: you ask for X, you get X, and it was not modified in transit (and ideally not on the repository either, but that is more difficult). PyPI is not and will never be a curated index. However, if I trust Author A, then I implicitly trust his actions. I trust that he won't do your stated issues.

Now is a curated index *more secure*? Well, again it depends on what your threat model is. PyPI isn't going to protect you from a malicious or incompetent author. For the threat model that PyPI is able to deliver on, your system is no more or less secure. In fact, without the sort of things you dismiss here, your proposal is also just as insecure unless you only ever access it on a protected network which you can be sure no attacker has gained access to.

Even your 3 issues are far less concerning than the fact that a MITM on either PyPI (fixed now with pip 1.3) or an external url allows a random guy at PyCon to execute arbitrary code on your machine if you install a package from PyPI at PyCon, or at a coffee shop, or on any wifi ever that could have someone else on it.

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Catalog-sig] pre-PEP: transition to release-file hosting at pypi site
On 14 March 2013 05:54, Tres Seaver tsea...@palladion.com wrote:
> On 03/12/2013 03:57 PM, holger krekel wrote:
>> Nobody should be led to think that PYPI is a trusted or reviewed source of software even if we got rid of external hosting completely.
>
> Amen. I still boggle at the amount of "sky is falling" stuff here over MITM / external links / whatever, given the potential damage from explicitly malicious uploads (trojans, viruses, whatever). Package signing might help here, but only for consumers who are willing to think hard enough about the problem to manage a web of trust (frankly, a vanishingly small minority).

Well yes, HTTPS and external links are problems which it is necessary to solve, and solving them is not sufficient to make 'pypi secure' - but that doesn't mean we should do a poor job solving them.

-Rob

--
Robert Collins rbtcoll...@hp.com
Distinguished Technologist
HP Cloud Services
Re: [Catalog-sig] A modest proposal for securing PyPI with TUF
On Wed, Mar 13, 2013 at 5:13 AM, Trishank Karthik Kuppusamy t...@students.poly.edu wrote:
> On 3/13/13 4:09 AM, Nick Coghlan wrote:
>> With the current focus being on getting an improvement from the status quo that we can successfully deploy in a reasonable period of time, the target delegation side of things probably needs to be substantially simpler in the initial iteration. Yes, it leaves us open to certain vulnerabilities we would like to remove in the long run, but we need to be very cautious in the additional demands we place on the users uploading to PyPI. It may even mean the initial iteration allows projects to rely on a PyPI-provided signing key for their TUF metadata, using the existing upload mechanisms to add the files to PyPI.
>
> I agree that there is a delicate problem of balancing security with usability here, especially in the beginning. You raised a very good issue there: on first migration, how would PyPI accommodate packages which have not had their target files delegated to their developers? We imagine that in this case, PyPI could assume initial responsibility for these packages, and later PyPI would delegate those packages to their respective developers.
>
> Thanks for your input, Trishank

With all the different kinds of metadata, it's interesting to note that currently TUF seems to only be concerned with the available file names and their integrity. (Some of us will think of PEP 426 PKG-INFO first when we hear the word metadata.)

It looks like the D metadata lists all the filenames for Django, and then Django lists them again with hashes and signatures. Why all the lists? Does every Django release re-assert all the versions of Django that are available on the index?

How might I deal with producing the official source distribution myself and having a friend produce the official Windows build of a package?

As an aside, PyPI has been doubling in size every 1.5 - 2 years.

Thanks,
Daniel Holth
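
[Editorial illustration] The "file names and their integrity" part of the discussion can be pictured as checking a downloaded file against a signed metadata entry that records its length and hash. The sketch below assumes a simplified entry shape (`{"length": ..., "hashes": {"sha256": ...}}`) and a hypothetical function name `matches_target`; it does not cover signatures or delegations and is not TUF's actual client code.

```python
import hashlib


def matches_target(data: bytes, target: dict) -> bool:
    """Return True if data has the length and SHA-256 digest recorded in target.

    target is an assumed, simplified metadata entry of the form
    {"length": <int>, "hashes": {"sha256": <hex digest>}}.
    """
    if len(data) != target["length"]:
        return False  # cheap check first: wrong size means wrong file
    return hashlib.sha256(data).hexdigest() == target["hashes"]["sha256"]
```

A client would only trust such an entry after the metadata containing it has itself passed signature verification.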
Re: [Catalog-sig] A modest proposal for securing PyPI with TUF
We may have something unclear in the doc. We definitely don't just worry about package names. (In between meetings, will send a longer response in a bit.) Thanks, Justin On Wed, Mar 13, 2013 at 2:15 PM, Daniel Holth dho...@gmail.com wrote: On Wed, Mar 13, 2013 at 5:13 AM, Trishank Karthik Kuppusamy t...@students.poly.edu wrote: Hello Nick, On 3/13/13 4:09 AM, Nick Coghlan wrote: - the PSF board generally stays out of the technical details of running the python.org infrastructure, so it's likely that any root keys would be handled by the PSF infrastructure committee. A (2, 4) or (3, 5) trust configuration would likely be manageable at this level. Understood. We think a higher (t, n) [where t out of n signatures are needed to trust the metadata for a role] is better for the root role simply because its crucial metadata (the authorized keys for top-level roles) should change very rarely. - at the target delegation level, PyPI supports the registration of new projects through the web service (see http://docs.python.org/2/distutils/packageindex.html). If my understanding of target delegation is correct, this means the simple and packages/source/letter delegations will need to be (1, 1) and online. - higher levels of the target delegation hierarchy could conceivably be kept offline, but there seems little value in doing so if they're trusting on online (1, 1) key Fortunately, the targets/simple and targets/packages/(version)/(letter)/ roles should not require (1, 1) online keys, as their metadata (simply target delegations and no actual target files) should also fluctuate fairly rarely. I should make this clearer in our design document. - many PyPI packages are maintained by single developers, so (1, 1) or (1, n) is likely to be the only generally feasible level of signing at the project level. Yes, the package developers themselves could choose any (t, n) they like. 
In our design, we propose that PyPI could eventually delegate to stable packages which need little change (and use more security with more offline keys) and to unstable packages which need frequent change (and use less security with more online keys). With the current focus being on getting an improvement from the status quo that we can successfully deploy in a reasonable period of time, the target delegation side of things probably needs to be substantially simpler in the initial iteration. Yes, it leaves us open to certain vulnerabilities we would like to remove in the long run, but we need to be very cautious in the additional demands we place on the users uploading to PyPI. It may even mean the initial iteration allows projects to rely on a PyPI provided signing key for their TUF metadata, using the existing upload mechanisms to add the files to PyPI. I agree that there is a delicate problem of balancing security with usability here, especially in the beginning. You raised a very good issue there: on first migration, how would PyPI accommodate packages which have not had their target files delegated to their developers? We imagine that in this case, PyPI could assume initial responsibility for these packages, and later PyPI would delegate those packages to their respective developers. Thanks for your input, Trishank With all the different kinds of metadata, It's interesting to note that currently TUF seems to only be concerned with the available file names and their integrity. (Some of us will think of PEP 426 PKG-INFO first when we hear the word metadata.) It looks like the D metadata lists all the filenames for Django, and then Django lists them again with hashes and signatures. Why all the lists? Does every Django release re-assert all the versions of Django that are available on the index? How might I deal with producing the official source distribution myself and having a friend produce the official Windows build of a package? 
As an aside PyPI has been doubling in size every 1.5 - 2 years. Thanks Daniel Holth
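To make the (t, n) notation used in this exchange concrete: metadata for a role is trusted only when at least t of the n keys authorized for that role have produced a valid signature over it. The following is a minimal sketch of that counting rule only, not TUF's actual code. The function names, the key IDs, and the use of HMAC-SHA256 as a stand-in for real public-key signatures are all assumptions made for illustration.

```python
import hashlib
import hmac


def sign(key: bytes, payload: bytes) -> str:
    """Stand-in signature: HMAC-SHA256 (an assumption; real roles use public-key signatures)."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def meets_threshold(payload, signatures, authorized_keys, threshold):
    """Return True if at least `threshold` distinct authorized keys signed `payload`.

    signatures: dict mapping key ID -> signature string
    authorized_keys: dict mapping key ID -> key bytes (the n keys of the role)
    """
    valid = set()
    for key_id, sig in signatures.items():
        key = authorized_keys.get(key_id)
        if key is None:
            continue  # signature from a key the role does not authorize
        if hmac.compare_digest(sign(key, payload), sig):
            valid.add(key_id)
    return len(valid) >= threshold


# Hypothetical (2, 4) root role: four keys, any two must sign.
root_keys = {"k1": b"a", "k2": b"b", "k3": b"c", "k4": b"d"}
metadata = b'{"role": "root", "version": 1}'

one_sig = {"k1": sign(root_keys["k1"], metadata)}
two_sigs = {"k1": sign(root_keys["k1"], metadata),
            "k3": sign(root_keys["k3"], metadata)}

print(meets_threshold(metadata, one_sig, root_keys, 2))   # False
print(meets_threshold(metadata, two_sigs, root_keys, 2))  # True
```

With (1, 1), a single online key is enough, which is why that configuration is convenient for frequently changing roles but offers the least protection if the key is compromised.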
Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files
On 13.03.2013 12:21, holger krekel wrote:

Hi all,

after some more discussions and hours spent by Carl Meyer (who is now co-authoring the PEP) and me, here is a new V3 pre-submit draft. It is now more ambitious than the previous draft, as should be obvious from the modified abstract (and Carl Meyer's and Philip's earlier interactions on this list). There are also more details of how the current link-scraping works, among other improvements and incorporations of feedback from discussions here. We intend to submit this draft tonight to the PEP editors. Feedback now and later remains welcome. I am sure there are issues to be sorted and clarified, among them the versioning-API suggestion by Marc-Andre.

Thanks for everybody's support and feedback so far,
holger

PEP: XXX
Title: Transitioning to release-file hosting on PyPI
Version: $Revision$
Last-Modified: $Date$
Author: Holger Krekel hol...@merlinux.eu, Carl Meyer c...@oddbird.net
Discussions-To: catalog-sig@python.org
Status: Draft (PRE-submit V3)
Type: Process
Content-Type: text/x-rst
Created: 10-Mar-2013
Post-History:

Abstract

This PEP proposes a backward-compatible two-phase transition process to speed up, simplify and robustify installing from the pypi.python.org (PyPI) package index. To ease the transition and minimize client-side friction, **no changes to distutils or existing installation tools are required in order to benefit from the transition phases, which is to result in faster, more reliable installs for most existing packages**.

The first transition phase implements easy and explicit means for a package maintainer to control which release file links are served to present-day installation tools. The first phase also includes the implementation of analysis tools for present-day packages, to support communication with package maintainers and the automated setting of default modes for controlling release file links.

The second transition phase will result in the current PYPI index serving only PYPI-hosted files by default. Externally hosted files will still be automatically discoverable through a second index. Present-day installation tools will be able to continue working by specifying this second index. New versions of installation tools shall default to only installing packages from PYPI unless the user explicitly wishes to include non-PYPI sites.

I must say, I don't like this change in motivation compared to V1 and V2.

The original goal of the discussion was to make PyPI more secure and the installation process faster and more reliable by moving away from crawling arbitrary external web pages.

Both can be had by:

* limiting the crawling to specific URLs defined by the package author, with added hashes to make sure that the URLs and their target content are not modified (this is the "securing external downloads" part - see here for an example: https://pypi.python.org/pypi/egenix-pyopenssl/0.13.1.1.0.1.5), and

* adding a way for the package authors to say "PyPI, please go ahead and cache/copy my distribution files" (this is the "increase download reliability" part - can be had by doing opt-in CDN caching/proxying of external links via PyPI)

Now, with V3 of the proposal, you are moving towards a system that basically says "do it this way, or stay out of our eco system", which, in my book, is not what the Python eco system is all about.

Your V2 was much more inviting in this respect.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Mar 13 2013)
Python Projects, Consulting and Support ... http://www.egenix.com/
mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/
mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/

: Try our mxODBC.Connect Python Database Interface for free ! ::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/
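The "URL plus hash" idea in the message above can be sketched as follows: the link published on the index carries a digest in its fragment, and the installer refuses the download if the bytes it fetched do not match. This is only an illustration under assumptions; the `#<algorithm>=<hexdigest>` fragment layout, the host name, and the function names are chosen for the example and are not taken from the PEP draft.

```python
import hashlib
from urllib.parse import urldefrag


def split_hashed_url(url):
    """Split 'https://host/file#sha256=<hex>' into (url, algorithm, hexdigest)."""
    base, fragment = urldefrag(url)
    algorithm, _, expected = fragment.partition("=")
    if not algorithm or not expected:
        raise ValueError("URL carries no '#<algorithm>=<digest>' fragment")
    return base, algorithm, expected


def content_matches(url, content):
    """Check downloaded bytes against the digest embedded in the link."""
    _, algorithm, expected = split_hashed_url(url)
    actual = hashlib.new(algorithm, content).hexdigest()
    return actual == expected.lower()


# Hypothetical link and payload; no network access is performed here.
payload = b"pretend this is a release tarball"
link = ("https://downloads.example.com/demo-1.0.tar.gz#sha256="
        + hashlib.sha256(payload).hexdigest())

print(content_matches(link, payload))          # True
print(content_matches(link, payload + b"!"))   # False
```

Note that this only protects the file content if the index page carrying the link is itself fetched over a trusted channel; the hash pins the target, not the page that lists it.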
Re: [Catalog-sig] A modest proposal for securing PyPI with TUF
We use the simple directory and filenames because that is what pip uses. You have a nice suggestion to include other metadata in the TUF metadata. We certainly could do this if desirable. This required a redesign of the PyPI API and we weren't sure if this was wanted. Our current doc / prototype is trying to minimize the changes needed all around. Thanks, Justin On Wed, Mar 13, 2013 at 2:15 PM, Daniel Holth dho...@gmail.com wrote: On Wed, Mar 13, 2013 at 5:13 AM, Trishank Karthik Kuppusamy t...@students.poly.edu wrote: Hello Nick, On 3/13/13 4:09 AM, Nick Coghlan wrote: - the PSF board generally stays out of the technical details of running the python.org infrastructure, so it's likely that any root keys would be handled by the PSF infrastructure committee. A (2, 4) or (3, 5) trust configuration would likely be manageable at this level. Understood. We think a higher (t, n) [where t out of n signatures are needed to trust the metadata for a role] is better for the root role simply because its crucial metadata (the authorized keys for top-level roles) should change very rarely. - at the target delegation level, PyPI supports the registration of new projects through the web service (see http://docs.python.org/2/distutils/packageindex.html). If my understanding of target delegation is correct, this means the simple and packages/source/letter delegations will need to be (1, 1) and online. - higher levels of the target delegation hierarchy could conceivably be kept offline, but there seems little value in doing so if they're trusting on online (1, 1) key Fortunately, the targets/simple and targets/packages/(version)/(letter)/ roles should not require (1, 1) online keys, as their metadata (simply target delegations and no actual target files) should also fluctuate fairly rarely. I should make this clearer in our design document. 
- many PyPI packages are maintained by single developers, so (1, 1) or (1, n) is likely to be the only generally feasible level of signing at the project level. Yes, the package developers themselves could choose any (t, n) they like. In our design, we propose that PyPI could eventually delegate to stable packages which need little change (and use more security with more offline keys) and to unstable packages which need frequent change (and use less security with more online keys). With the current focus being on getting an improvement from the status quo that we can successfully deploy in a reasonable period of time, the target delegation side of things probably needs to be substantially simpler in the initial iteration. Yes, it leaves us open to certain vulnerabilities we would like to remove in the long run, but we need to be very cautious in the additional demands we place on the users uploading to PyPI. It may even mean the initial iteration allows projects to rely on a PyPI provided signing key for their TUF metadata, using the existing upload mechanisms to add the files to PyPI. I agree that there is a delicate problem of balancing security with usability here, especially in the beginning. You raised a very good issue there: on first migration, how would PyPI accommodate packages which have not had their target files delegated to their developers? We imagine that in this case, PyPI could assume initial responsibility for these packages, and later PyPI would delegate those packages to their respective developers. Thanks for your input, Trishank With all the different kinds of metadata, It's interesting to note that currently TUF seems to only be concerned with the available file names and their integrity. (Some of us will think of PEP 426 PKG-INFO first when we hear the word metadata.) It looks like the D metadata lists all the filenames for Django, and then Django lists them again with hashes and signatures. Why all the lists? 
Does every Django release re-assert all the versions of Django that are available on the index? How might I deal with producing the official source distribution myself and having a friend produce the official Windows build of a package? As an aside PyPI has been doubling in size every 1.5 - 2 years. Thanks Daniel Holth
Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files
On Mar 13, 2013, at 2:57 PM, M.-A. Lemburg m...@egenix.com wrote: On 13.03.2013 12:21, holger krekel wrote: Hi all, after some more discussions and hours spend by Carl Meyer (who is now co-authoring the PEP) and me, here is a new V3 pre-submit draft. It is now more ambitious than the previous draft as should be obvious from the modified abstract (and Carl Meyers and Philip's earlier interactions on this list). There also are more details of how the current link-scraping works among other improvements and incorporations of feedback from discussions here. We intend to submit this draft tonight to the PEP editors. Feedback now and later remains welcome. I am sure there are issues to be sorted and clarified, among them the versioning-API suggestion by Marc-Andre. Thanks for everybody's support and feedback so far, holger PEP: XXX Title: Transitioning to release-file hosting on PyPI Version: $Revision$ Last-Modified: $Date$ Author: Holger Krekel hol...@merlinux.eu, Carl Meyer c...@oddbird.net Discussions-To: catalog-sig@python.org Status: Draft (PRE-submit V3) Type: Process Content-Type: text/x-rst Created: 10-Mar-2013 Post-History: Abstract This PEP proposes a backward-compatible two-phase transition process to speed up, simplify and robustify installing from the pypi.python.org (PyPI) package index. To ease the transition and minimize client-side friction, **no changes to distutils or existing installation tools are required in order to benefit from the transition phases, which is to result in faster, more reliable installs for most existing packages**. The first transition phase implements easy and explicit means for a package maintainter to control which release file links are served to present-day installation tools. The first phase also includes the implementation of analysis tools for present-day packages, to support communication with package maintainers and the automated setting of default modes for controling release file links. 
The second transition phase will result in the current PYPI index to only serve PYPI-hosted files by default. Externally hosted files will still be automatically discoverable through a second index. Present-day installation tools will be able to continue working by specifying this second index. New versions of installation tools shall default to only install packages from PYPI unless the user explicitely wishes to include non-PYPI sites. I must say, don't like this change in motivation compared to V1 and V2. The original of the discussion was to make PyPI more secure and the installation process faster and more reliable by moving away from crawling arbitrary external web pages. Both can be had by: * limiting the crawling to package author defined specific URLs, with added hashes to make sure that the URLs and their target content is not modified (this is the securing external downloads part - see here for an example: https://pypi.python.org/pypi/egenix-pyopenssl/0.13.1.1.0.1.5), and * adding a way for the package authors to say PyPI, please go ahead and cache/copy my distributions files (this is the increase download reliability part - can be had by doing opt-in CDN caching/proxying of external links via PyPI) Now, with V3 of the proposal, you are moving towards a system that basically says do it this way, or stay out of our eco system, which, in my book, is not what the Python eco system is all about. I don't see how? The -with-externals index will still contain all the existing links, and indeed PJ Elby has already stated that setuptools will move to support this index by default but with proper warnings to people so they know they are installing a package off site. This allows existing tools to be moved to a secure by default position. Allows future tools to choose if they want to enable the existing behavior through use of -with-externals (hopefully with a warning or opt-in sort of thing like laid out by PJE, but it's certainly not required). 
And even allows users of existing tools to opt into the old behavior via the -i option. Maybe i'm missing it but in what way does this force authors to do it this way or stay out of our eco system since all the same options are available as there are today? Your V2 was much more inviting in this respect. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 13 2013) Python Projects, Consulting and Support ... http://www.egenix.com/ mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ : Try our mxODBC.Connect Python Database Interface for free ! :: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO
Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files
On 13.03.2013 20:08, Donald Stufft wrote: On Mar 13, 2013, at 2:57 PM, M.-A. Lemburg m...@egenix.com wrote: On 13.03.2013 12:21, holger krekel wrote: [V3 proposal] I must say, don't like this change in motivation compared to V1 and V2. The original of the discussion was to make PyPI more secure and the installation process faster and more reliable by moving away from crawling arbitrary external web pages. Both can be had by: * limiting the crawling to package author defined specific URLs, with added hashes to make sure that the URLs and their target content is not modified (this is the securing external downloads part - see here for an example: https://pypi.python.org/pypi/egenix-pyopenssl/0.13.1.1.0.1.5), and * adding a way for the package authors to say PyPI, please go ahead and cache/copy my distributions files (this is the increase download reliability part - can be had by doing opt-in CDN caching/proxying of external links via PyPI) Now, with V3 of the proposal, you are moving towards a system that basically says do it this way, or stay out of our eco system, which, in my book, is not what the Python eco system is all about. I don't see how? The -with-externals index will still contain all the existing links, and indeed PJ Elby has already stated that setuptools will move to support this index by default but with proper warnings to people so they know they are installing a package off site. This allows existing tools to be moved to a secure by default position. Allows future tools to choose if they want to enable the existing behavior through use of -with-externals (hopefully with a warning or opt-in sort of thing like laid out by PJE, but it's certainly not required). And even allows users of existing tools to opt into the old behavior via the -i option. Maybe i'm missing it but in what way does this force authors to do it this way or stay out of our eco system since all the same options are available as there are today? 
The proposal marks all external links as evil, and instead of making external links more secure, the user is left with the option to either not enable external links at all, or to let the devil in :-)

That's not nice. It's also security theater.

The real problem is unreviewed code getting executed by users, or worse, automated build systems. Yet, we let users believe that everything is secured on PyPI.

Taking an extreme position, it would probably be better to just leave everything as it is and instead educate users about the risk they are taking with a pip install AngryBirds, signed with keys issued by the PSF on the official PyPI server, delivered straight to your drive via the latest in crypto technology, only to wipe your notebook...

But then, I don't like extreme positions, so I would rather incrementally improve the situation both from the server and the client side, both addressing user and author concerns, and keeping the Python eco system a friendly place to be.

Your V2 was much more inviting in this respect.

--
Marc-Andre Lemburg
eGenix.com
Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files
On Wed, Mar 13, 2013 at 3:33 PM, M.-A. Lemburg m...@egenix.com wrote: On 13.03.2013 20:08, Donald Stufft wrote: On Mar 13, 2013, at 2:57 PM, M.-A. Lemburg m...@egenix.com wrote: On 13.03.2013 12:21, holger krekel wrote: [V3 proposal] I must say, don't like this change in motivation compared to V1 and V2. The original of the discussion was to make PyPI more secure and the installation process faster and more reliable by moving away from crawling arbitrary external web pages. Both can be had by: * limiting the crawling to package author defined specific URLs, with added hashes to make sure that the URLs and their target content is not modified (this is the securing external downloads part - see here for an example: https://pypi.python.org/pypi/egenix-pyopenssl/0.13.1.1.0.1.5), and * adding a way for the package authors to say PyPI, please go ahead and cache/copy my distributions files (this is the increase download reliability part - can be had by doing opt-in CDN caching/proxying of external links via PyPI) Now, with V3 of the proposal, you are moving towards a system that basically says do it this way, or stay out of our eco system, which, in my book, is not what the Python eco system is all about. I don't see how? The -with-externals index will still contain all the existing links, and indeed PJ Elby has already stated that setuptools will move to support this index by default but with proper warnings to people so they know they are installing a package off site. This allows existing tools to be moved to a secure by default position. Allows future tools to choose if they want to enable the existing behavior through use of -with-externals (hopefully with a warning or opt-in sort of thing like laid out by PJE, but it's certainly not required). And even allows users of existing tools to opt into the old behavior via the -i option. 
Maybe i'm missing it but in what way does this force authors to do it this way or stay out of our eco system since all the same options are available as there are today? The proposal marks all external links as evil, and instead of making external links more secure, the user is left with the option to either not enable external links at all, or to let the devil in :-) That's not nice. It's also security theater. The real problem is unreviewed code getting executed by users, or worse, automated build systems. Yet, we let users believe that everything is secured on PyPI. Taking an extreme position, it would probably be better just leave everything as it is and instead educate users about the risk they are taking with a pip install AngryBirds, signed with keys issued by the PSF on the official PyPI server, delivered straight to your drive via the latest in crypto technology, only to wipe your notebook... But then, I don't like extreme positions, so would rather like to incrementally improve the situation both from the server and the client side, both addressing user and author concerns, and keeping the Python eco system a friendly place to be. Your V2 was much more inviting in this respect. Perhaps it would be better to decide whether it is reliability theater and concentrate on consistency rather than whether the code actually does what you want. It is nice to have a system that at least prevents targeted third party bad-package attacks. ___ Catalog-SIG mailing list Catalog-SIG@python.org http://mail.python.org/mailman/listinfo/catalog-sig
Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files
On Mar 13, 2013, at 3:33 PM, M.-A. Lemburg m...@egenix.com wrote: On 13.03.2013 20:08, Donald Stufft wrote: On Mar 13, 2013, at 2:57 PM, M.-A. Lemburg m...@egenix.com wrote: On 13.03.2013 12:21, holger krekel wrote: [V3 proposal] I must say, don't like this change in motivation compared to V1 and V2. The original of the discussion was to make PyPI more secure and the installation process faster and more reliable by moving away from crawling arbitrary external web pages. Both can be had by: * limiting the crawling to package author defined specific URLs, with added hashes to make sure that the URLs and their target content is not modified (this is the securing external downloads part - see here for an example: https://pypi.python.org/pypi/egenix-pyopenssl/0.13.1.1.0.1.5), and * adding a way for the package authors to say PyPI, please go ahead and cache/copy my distributions files (this is the increase download reliability part - can be had by doing opt-in CDN caching/proxying of external links via PyPI) Now, with V3 of the proposal, you are moving towards a system that basically says do it this way, or stay out of our eco system, which, in my book, is not what the Python eco system is all about. I don't see how? The -with-externals index will still contain all the existing links, and indeed PJ Elby has already stated that setuptools will move to support this index by default but with proper warnings to people so they know they are installing a package off site. This allows existing tools to be moved to a secure by default position. Allows future tools to choose if they want to enable the existing behavior through use of -with-externals (hopefully with a warning or opt-in sort of thing like laid out by PJE, but it's certainly not required). And even allows users of existing tools to opt into the old behavior via the -i option. 
Maybe I'm missing it but in what way does this force authors to do it this way or stay out of our eco system since all the same options are available as there are today?

The proposal marks all external links as evil, and instead of making external links more secure, the user is left with the option to either not enable external links at all, or to let the devil in :-)

It doesn't mark them as evil, it marks them as requiring users to opt into them. Authors are free to not publish their packages directly to PyPI and users are free to opt in to installing the external URLs that the authors have chosen to publish. Furthermore, it gives package authors complete control over what URLs appear on their simple index page. ISTM that this is even friendlier than before because now both sides have explicitly decided to use those URLs, instead of it being completely implicit on one side, and partially implicit on the other.

That's not nice. It's also security theater.

It's not security theater, it moves the defaults to more secure. Further work can (and will) be done to ensure that for those users and authors who opt into the external URLs it's still secure, while again requiring both sides to explicitly opt into it.

The real problem is unreviewed code getting executed by users, or worse, automated build systems. Yet, we let users believe that everything is secured on PyPI.

We? I don't think anyone's ever said that *everything is secured on PyPI*. The best the PyPI infrastructure and tooling can do (security wise) is to try to make as sure as possible that when you ask for foo==X.Y, that is what you get. PyPI currently can't make that claim for external links. On top of that, many users (and I'd wager most users) are not aware that when they install something it reaches outwardly to other hosts. This proposal makes it so they *are* aware, so they opt into potentially lowering their uptime and they opt into exposing details to external hosts (which may or may not be SSL secured).

Taking an extreme position, it would probably be better to just leave everything as it is and instead educate users about the risk they are taking with a pip install AngryBirds, signed with keys issued by the PSF on the official PyPI server, delivered straight to your drive via the latest in crypto technology, only to wipe your notebook... But then, I don't like extreme positions, so would rather like to incrementally improve the situation both from the server and the client side, both addressing user and author concerns, and keeping the Python eco system a friendly place to be. Your V2 was much more inviting in this respect.

This gives _all_ the abilities of the current system (besides spidering random URLs) with *more* control given to the authors as to what exists on their various index pages. This is a net win for everyone involved. The only loss is that people trying to install projects that choose to host externally to PyPI will be told to explicitly allow it (as mentioned by PJ Eby).
Re: [Catalog-sig] A modest proposal for securing PyPI with TUF
On 03/13/2013 02:15 PM, Daniel Holth wrote:

With all the different kinds of metadata, it's interesting to note that currently TUF seems to only be concerned with the available file names and their integrity. (Some of us will think of PEP 426 PKG-INFO first when we hear the word metadata.)

Yes, you are right that the many different kinds of metadata in this discussion (TUF metadata, PyPI metadata) make things a little confusing sometimes! :))

My understanding of PEP 426 is that the distribution metadata is specified by the developer with the setup.py script. To take the running Django example, since the Django developers will sign everything under the Django role with their own keys that the D role will talk about, setup.py, as well as the generated PKG-INFO, will be signed by the Django developers. This means that pip + TUF will be able to verify these distribution metadata indirectly via the source distribution package. Does this answer your question?

It looks like the D metadata lists all the filenames for Django, and then Django lists them again with hashes and signatures. Why all the lists? Does every Django release re-assert all the versions of Django that are available on the index?

Good observation. For D, you are talking about the "paths" attribute here:

https://updateframework.com/pypi/repository/metadata/targets/packages/source/D.txt

For Django, you are talking about the "targets" attribute here:

https://updateframework.com/pypi/repository/metadata/targets/packages/source/D/Django.txt

Why is "paths" in D listing all the targets that Django already talks about? Presently, this is because our target delegation tool (signercli.py) is being paranoid and making sure that D is explicitly delegating only targets matching these paths. However, the TUF specification allows for D to simply say, "I delegate any target whatsoever under Django", by setting "paths" to packages/source/D/Django/**:

https://www.updateframework.com/browser/specs/tuf-spec.txt#L525

How might I deal with producing the official source distribution myself and having a friend produce the official Windows build of a package?

There are a few solutions. You could have your friend produce the official Windows build for a package, and then you could sign it, implicitly trusting your friend but not publishing that trust. A more secure solution would have you delegate that target to your friend.

As an aside PyPI has been doubling in size every 1.5 - 2 years.

Exponential growth strikes again! We have anticipated this, and we have a few solutions to curb the growth of TUF metadata. Since TUF metadata is simply text, GZIP compression would go a long way. Alternatively, we could implement delta updates of TUF metadata. The more difficult problem is how to ensure that the target delegation structure scales with PyPI growth. A good design will keep this in mind and plan accordingly.

Speaking of which, it may be the case that our design document for integrating PyPI with TUF may not be terribly easy to understand. (After all, you do need to understand TUF first, but TUF is fairly easy once you understand its main ideas.) I plan to publish a friendlier document which introduces TUF at a very high level and instead discusses more pragmatic issues (such as workflows).
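The difference between an explicit "paths" listing and a single wildcard delegation, as discussed in the reply above, can be illustrated with a small sketch. It uses Python's `fnmatch` as a stand-in for whatever pattern matching a TUF client really performs (an assumption; the specification's exact matching rules may differ), and the release file names are invented for the example.

```python
from fnmatch import fnmatchcase


def is_delegated(target_path, delegated_paths):
    """Return True if `target_path` matches any pattern the delegator lists."""
    return any(fnmatchcase(target_path, pattern) for pattern in delegated_paths)


# Explicit listing, in the style the prototype tool emits (file names are made up).
explicit = [
    "packages/source/D/Django/Django-1.4.tar.gz",
    "packages/source/D/Django/Django-1.5.tar.gz",
]
# Single wildcard delegation covering everything under Django/.
wildcard = ["packages/source/D/Django/**"]

new_release = "packages/source/D/Django/Django-1.6.tar.gz"
other_project = "packages/source/D/Durus/Durus-3.9.tar.gz"

print(is_delegated(new_release, explicit))    # False: D would need re-signing first
print(is_delegated(new_release, wildcard))    # True
print(is_delegated(other_project, wildcard))  # False
```

The trade-off is the one described in the thread: the explicit list forces the delegating role to be re-signed for every new file, while the wildcard lets the project role add releases on its own authority.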
Re: [Catalog-sig] A modest proposal for securing PyPI with TUF
Speaking of which, it may be the case that our design document for integrating PyPI with TUF may not be terribly easy to understand. (After all, you do need to understand TUF first, but TUF is fairly easy once you understand its main ideas.) I plan to publish a friendlier document which introduces TUF at a very high level and instead discusses more pragmatic issues (such as workflows).

Feel free to chime in if you'd rather see something else or want us to focus on clarifying a specific topic.

Thanks,
Justin
Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files
On 03/13/2013 01:33 PM, M.-A. Lemburg wrote:

The proposal marks all external links as evil,

I'm sorry the text of the PEP gave you that impression. I can see how you'd have gotten it from some of the comments here on catalog-sig, but we went to some lengths to avoid it in the PEP text, and plan to further revise the text to try harder to avoid that implication.

In the proposed PEP, we are attempting to balance two things that I believe to be true:

1) There are good and valid reasons for some package owners to prefer external hosting, and it is good for automated installers to easily be able to install such packages (on user request).

2) Installing non-PyPI-hosted packages should not be the *default* behavior of installer tools, for many reasons, among them because that is unusual and surprising behavior to many newcomers to the Python ecosystem, and often leads to concerns on their part about the stability of the ecosystem.

These are the axioms, if you will, of this proposal, and while I'd guess many people in this discussion are at least slightly uncomfortable with one or the other of them, I think accepting both is the most likely path to a compromise everyone can live with.

I think we can find a solution that embraces both these axioms and maintains good backwards-compatibility and usability. Holger and I had a long talk this evening about that, and here are some of our thoughts:

A) You mentioned opt-in PyPI caching of externally-hosted files as a means to improve reliability. We basically agree, but implementing this on the PyPI side adds complexity to the PyPI implementation that we are hesitant to propose. Rather, we propose that this is better handled by a client-side tool that you point at a PyPI release with externally-hosted files, and it simply copies those release files onto PyPI. This has essentially the same effect. We envision this being a simple enough tool that it could reasonably be run for every release of a project in an ongoing way, not just as a one-time project-wide migration. We plan to change the line in the PEP that says the existence of this tool is NOT REQUIRED to begin the phase 2 transition to instead say that the existence of this tool IS REQUIRED before the phase 2 transition begins. (Holger already has a partial implementation of this tool.)

B) We also plan to change the PEP to say even more strongly that installer tools should provide an easy option for installing externally-hosted projects, and that our definition of easy includes the ability for an installer to automatically tell a user what options they can use to install a specific externally-hosted package that the tool is refusing to install by default.

C) To make that latter part of (B) easier, we also propose that the basic simple index include a link with a distinct rel attribute that points to the -with-externals index page for that project, only for a package that has external links. This way even tools using the no-externals index by default can notify users of the existence of external links for a project when they try to install it.

There's also another possible change, a bit more significant, that we discussed that I'd be curious to hear your thoughts on. The initial motivation for separating external links from the main simple/ index was twofold:

1) Allow future tools to distinguish between internal and external links without every tool needing to implement host-comparison algorithms (which may break indexes that host internal files on a CDN), and

2) Allow today's installers, without upgrade, to automatically migrate eventually to no-external-installs-by-default.
Some things have caused us to re-evaluate these points: - PyPI can automatically tag internal/external links in the simple index with rel=internal and rel=external, which gives future tools a more reliable marker than host-comparison. So this takes care of #1. - It may be that giving up #2 is acceptable in the interest of better backward-compatibility. Old tools will still gain most of the benefits of this PEP due to the eventual elimination of automatic link-scraping (both from metadata and external pages) and the move to explicit submission of external links, only for those projects that want them. And with a separate index, old tools would not be able to provide a useful error message to users trying to install an externally-hosted package that is no longer listed in the main simple/ index, which would be a bad usability breakage. Given that, we are thinking of perhaps simplifying the PEP to eliminate the separate -with-externals index, and to list external links in the main simple/ index, clearly marked with rel=external. The PEP would still recommend that future installer tools not follow rel=external links without specific user authorization. Old tools still get many of the benefits, without the breakage. and instead of making external links more secure, the user is left with the option to either not enable external links at all, or to let the devil in
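To make the rel=internal / rel=external idea above concrete, here is a minimal sketch of how an installer might sort the links on a simple index page and refuse external ones unless the user opts in. It is only an illustration: the sample page, the file names, the allow_external flag and the decision to ignore unmarked links are all assumptions, not anything the PEP draft specifies.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collect href values from <a> tags, grouped by their rel attribute."""

    def __init__(self):
        super().__init__()
        self.links = {"internal": [], "external": [], "unmarked": []}

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel may hold several space-separated tokens, or be missing.
        rel_tokens = (attr_map.get("rel") or "").split()
        if "internal" in rel_tokens:
            self.links["internal"].append(href)
        elif "external" in rel_tokens:
            self.links["external"].append(href)
        else:
            self.links["unmarked"].append(href)


def installable_links(page_html, allow_external=False):
    """Return (links the tool may follow, external links it found).

    Unmarked links are never followed here; that is a choice made for
    this sketch only.
    """
    parser = RelLinkParser()
    parser.feed(page_html)
    parser.close()
    allowed = list(parser.links["internal"])
    if allow_external:
        allowed.extend(parser.links["external"])
    return allowed, parser.links["external"]


# Hypothetical index page for a project named "example".
SAMPLE_PAGE = """
<html><body>
<a rel="internal" href="../../packages/source/e/example/example-1.0.tar.gz">example-1.0.tar.gz</a>
<a rel="external" href="http://downloads.example.com/example-1.1.tar.gz">example-1.1.tar.gz</a>
</body></html>
"""

allowed, external = installable_links(SAMPLE_PAGE)
if external:
    print("Skipped %d external link(s); opt in to use them." % len(external))
```

Because the external links are still returned to the caller, a tool that refuses them by default can tell the user that they exist and which option would enable them.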
Re: [Catalog-sig] A modest proposal for securing PyPI with TUF
On Wed, Mar 13, 2013 at 8:11 PM, Trishank Karthik Kuppusamy t...@students.poly.edu wrote: On 03/13/2013 02:15 PM, Daniel Holth wrote: With all the different kinds of metadata, it's interesting to note that currently TUF seems to only be concerned with the available file names and their integrity. (Some of us will think of PEP 426 PKG-INFO first when we hear the word metadata.) Yes, you are right that the many different kinds of metadata in this discussion (TUF metadata, PyPI metadata) make things a little confusing sometimes! :)) My understanding of PEP 426 is that the distribution metadata is specified by the developer with the setup.py script. To take the running Django example, since the Django developers will sign everything under the Django role with their own keys that the D role will talk about, setup.py, as well as the generated PKG-INFO, will be signed by the Django developers. This means that pip + TUF will be able to verify this distribution metadata indirectly via the source distribution package. Does this answer your question? Thanks, yes. The individual .tar.gz distributions do contain PKG-INFO but we would eventually like to expose it in a more efficient way. Then to be suitably paranoid you would also have to check that it matched the package you downloaded! :( Also note that on http://crate.io the simple index works the same way as on PyPI, except that the actual packages are on a different (CDN) host. Thanks, Daniel It looks like the D metadata lists all the filenames for Django, and then Django lists them again with hashes and signatures. Why all the lists? Does every Django release re-assert all the versions of Django that are available on the index? Good observation.
For D, you are talking about the paths attribute here: https://updateframework.com/pypi/repository/metadata/targets/packages/source/D.txt For Django, you are talking about the targets attribute here: https://updateframework.com/pypi/repository/metadata/targets/packages/source/D/Django.txt Why is paths in D listing all the targets that Django already talks about? Presently, this is because our target delegation tool (signercli.py) is being paranoid and making sure that D is explicitly delegating only targets matching these paths. However, the TUF specification allows for D to simply say, "I delegate any target whatsoever under Django," by setting paths to packages/source/D/Django/**: https://www.updateframework.com/browser/specs/tuf-spec.txt#L525 How might I deal with producing the official source distribution myself and having a friend produce the official Windows build of a package? There are a few solutions. You could have your friend produce the official Windows build for a package, and then you could sign it, implicitly trusting your friend but not publishing that trust. A more secure solution would have you delegate that target to your friend. As an aside, PyPI has been doubling in size every 1.5 to 2 years. Exponential growth strikes again! We have anticipated this, and we have a few solutions to curb the growth of TUF metadata. Since TUF metadata is simply text, GZIP compression would go a long way. Alternatively, we could implement delta updates of TUF metadata. The more difficult problem is how to ensure that the target delegation structure scales with PyPI growth. A good design will keep this in mind and plan accordingly. Speaking of which, it may be the case that our design document for integrating PyPI with TUF may not be terribly easy to understand. (After all, you do need to understand TUF first, but TUF is fairly easy once you understand its main ideas.) 
I plan to publish a friendlier document that introduces TUF at a very high level and focuses instead on more pragmatic issues (such as workflows).
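The difference between D listing every Django target explicitly and D delegating everything under packages/source/D/Django/** can be sketched in a few lines. This is a rough illustration only: it uses Python's fnmatch for the pattern match, which is an assumption and may not follow the exact pattern semantics of the TUF specification, and the file names are made up for the example.

```python
from fnmatch import fnmatchcase


def is_delegated(target_path, delegated_paths):
    """Return True if target_path matches any delegated path or pattern."""
    return any(fnmatchcase(target_path, pattern) for pattern in delegated_paths)


# Explicit delegation: D names each target it hands to the Django role.
# (Illustrative file names, not copied from the real metadata.)
explicit_paths = [
    "packages/source/D/Django/Django-1.4.tar.gz",
    "packages/source/D/Django/Django-1.5.tar.gz",
]

# Wildcard delegation: D hands over anything under the Django directory.
wildcard_paths = ["packages/source/D/Django/**"]

new_release = "packages/source/D/Django/Django-1.6.tar.gz"
print(is_delegated(new_release, explicit_paths))   # needs a D update first
print(is_delegated(new_release, wildcard_paths))   # already covered
```

With explicit paths, the D metadata has to be re-signed for every new Django file; with the wildcard, only the Django role's own metadata changes, which is one way to keep the metadata for D from growing with every release.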
[Catalog-sig] ResponseNotReady error while trying to do fresh sync
Hello, I'm maintaining e.pypi.python.org (with Aron Xu). We ran into some issues with our network-attached storage, so we decided to do a fresh sync of PyPI. While doing that, we hit an httplib.ResponseNotReady exception similar to the one in this mail: http://mail.python.org/pipermail/catalog-sig/2013-February/005224.html For now, we have skipped all packages with that issue and finished the sync, but some files will be missing. The three packages that cause the exception are listed below: https://pypi.python.org/simple/iterator/ https://pypi.python.org/simple/nester_test_ling/ https://pypi.python.org/simple/nesterswe/ Please notify us when it gets fixed, so that we can update and complete the sync. Best Regards, Qijiang Fan
Re: [Catalog-sig] A modest proposal for securing PyPI with TUF
On 3/13/13 9:19 PM, Daniel Holth wrote: Thanks, yes. The individual .tar.gz distributions do contain PKG-INFO but we would eventually like to expose it in a more efficient way. Then to be suitably paranoid you would also have to check that it matched the package you downloaded! :( Great, glad we could help. Well, at least the paranoid would just need an extra download :)) Also note that on http://crate.io the simple index works the same way as on pypi, except that the actual packages are on a different (CDN) host. Got it. I'll take a look at crate.io to see how it works. Conceivably, the TUF metadata and the PyPI files could live in separate locations altogether and we would just have to check that the TUF metadata matches the PyPI files.
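If the metadata and the files do live on different hosts, the check that "the TUF metadata matches the PyPI files" comes down to comparing what was downloaded against what the signed metadata promised. Here is a minimal sketch of that comparison, assuming the metadata entry records a length and a SHA-256 digest; the field names and the payload are hypothetical, and verifying the signature on the metadata itself is out of scope here.

```python
import hashlib


def matches_metadata(data, expected_length, expected_sha256):
    """Check downloaded bytes against the length and digest from metadata."""
    # Compare the length first: it is cheap and rejects truncated or
    # padded downloads before any hashing is done.
    if len(data) != expected_length:
        return False
    return hashlib.sha256(data).hexdigest() == expected_sha256


# Stand-in for a file fetched from a separate (CDN) host.
payload = b"example package contents"

# Stand-in for the entry the trusted metadata would carry for that file.
entry = {"length": len(payload), "sha256": hashlib.sha256(payload).hexdigest()}

print(matches_metadata(payload, entry["length"], entry["sha256"]))
print(matches_metadata(payload + b"tampered", entry["length"], entry["sha256"]))
```

The same comparison works no matter where the file was fetched from, which is why the location of the files matters less than the integrity of the metadata that describes them.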