Re: [Catalog-sig] V2 pre-PEP: transitioning to release file hosting on PYPI

2013-03-13 Thread Nick Coghlan
On Tue, Mar 12, 2013 at 12:59 PM, M.-A. Lemburg m...@egenix.com wrote:
 I think we should establish a versioned API like that for PyPI
 to make progress easier. All major web APIs use versioning
 for this reason.

Why set up versioning for something we want to phase out? There will
never be a simple-v3, so this is really overengineering the proposed
change.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


[Catalog-sig] A modest proposal for securing PyPI with TUF

2013-03-13 Thread Trishank Karthik Kuppusamy

Hello everyone,

I am pleased to announce our demonstration of PyPI and pip with TUF.

Firstly, we solicit your thoughts and comments on our design document 
for integrating PyPI with TUF:


https://docs.google.com/document/d/1sHMhgrGXNCvBZdmjVJzuoN5uMaUAUDWBmn3jo7vxjjw/edit?usp=sharing

Secondly, you may wish to test our demo of PyPI and pip with TUF:

https://github.com/dachshund/pip/wiki/pip-over-TUF

Thirdly, this is how little it takes to secure pip with TUF:

https://github.com/dachshund/pip/compare/develop...tuf

Finally, you may be interested to learn about how one might manually 
secure a PyPI package index with TUF:


https://github.com/dachshund/pip/wiki/PyPI-over-TUF

We are excited to be able to show this to you now, and in person at our 
lightning talk at PyCon this Friday.


We think that there is great potential for the PyPI and TUF community to 
work together to secure Python package management. This is just the 
beginning, and there is some work left to do, but we are confident that 
we have demonstrated to you that PyPI could be secured with TUF in the 
very near future. We would be happy to discuss with you how we compare 
with other proposals.


We look forward to your questions and feedback!

Thanks,
Trishank



Re: [Catalog-sig] V2 pre-PEP: transitioning to release file hosting on PYPI

2013-03-13 Thread M.-A. Lemburg
On 13.03.2013 07:28, Nick Coghlan wrote:
 On Tue, Mar 12, 2013 at 12:59 PM, M.-A. Lemburg m...@egenix.com wrote:
 I think we should establish a versioned API like that for PyPI
 to make progress easier. All major web APIs use versioning
 for this reason.
 
 Why set up versioning for something we want to phase out? There will
 never be a simple-v3, so this is really overengineering the proposed
 change.

Who says that we want to phase out the /simple/ index ?

FWIW, I don't think that two or three small changes to the PyPI
server (see my email to Holger) warrant calling this over-engineering.
This is about moving forward in a backwards-compatible and future-proof
way.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Catalog-sig] A modest proposal for securing PyPI with TUF

2013-03-13 Thread Nick Coghlan
On Tue, Mar 12, 2013 at 11:41 PM, Trishank Karthik Kuppusamy
t...@students.poly.edu wrote:
 Hello everyone,

 I am pleased to announce our demonstration of PyPI and pip with TUF.

 Firstly, we solicit your thoughts and comments on our design document for
 integrating PyPI with TUF:

 https://docs.google.com/document/d/1sHMhgrGXNCvBZdmjVJzuoN5uMaUAUDWBmn3jo7vxjjw/edit?usp=sharing

Thanks for putting this together!

Just a few notes regarding key management:
- the PSF board generally stays out of the technical details of
running the python.org infrastructure, so it's likely that any root
keys would be handled by the PSF infrastructure committee. A (2, 4) or
(3, 5) trust configuration would likely be manageable at this level.
- at the target delegation level, PyPI supports the registration of
new projects through the web service (see
http://docs.python.org/2/distutils/packageindex.html). If my
understanding of target delegation is correct, this means the simple
and packages/source/letter delegations will need to be (1, 1) and
online.
- higher levels of the target delegation hierarchy could conceivably
be kept offline, but there seems little value in doing so if they're
trusting an online (1, 1) key
- many PyPI packages are maintained by single developers, so (1, 1) or
(1, n) is likely to be the only generally feasible level of signing at
the project level.

With the current focus being on getting an improvement from the status
quo that we can successfully deploy in a reasonable period of time,
the target delegation side of things probably needs to be
substantially simpler in the initial iteration. Yes, it leaves us open
to certain vulnerabilities we would like to remove in the long run,
but we need to be very cautious in the additional demands we place on
the users uploading to PyPI. It may even mean the initial iteration
allows projects to rely on a PyPI provided signing key for their TUF
metadata, using the existing upload mechanisms to add the files to
PyPI.

Regards,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Catalog-sig] A modest proposal for securing PyPI with TUF

2013-03-13 Thread Trishank Karthik Kuppusamy

Hello Nick,

On 3/13/13 4:09 AM, Nick Coghlan wrote:


- the PSF board generally stays out of the technical details of
running the python.org infrastructure, so it's likely that any root
keys would be handled by the PSF infrastructure committee. A (2, 4) or
(3, 5) trust configuration would likely be manageable at this level.


Understood. We think a higher (t, n) [where t out of n signatures are 
needed to trust the metadata for a role] is better for the root role 
simply because its crucial metadata (the authorized keys for top-level 
roles) should change very rarely.
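
As a rough illustration of the (t, n) notation, the check below accepts role metadata only when at least t distinct authorized keys have produced a valid signature. It is a minimal sketch, not TUF's actual API: `meets_threshold` and the key ids are made up for the example, and the signature verification itself is assumed to have happened already.

```python
def meets_threshold(valid_signers, authorized_keys, threshold):
    """Return True if at least `threshold` distinct authorized keys signed.

    valid_signers   -- ids of keys whose signatures already verified
    authorized_keys -- the n key ids listed for the role
    threshold       -- t, the minimum number of signatures required
    """
    # Signatures from keys that are not authorized for the role do not count.
    distinct = set(valid_signers) & set(authorized_keys)
    return len(distinct) >= threshold


# Hypothetical (2, 4) root role: four authorized keys, two signatures needed.
root_keys = {"key-a", "key-b", "key-c", "key-d"}
accepted = meets_threshold({"key-a", "key-c"}, root_keys, threshold=2)
rejected = meets_threshold({"key-a", "key-x"}, root_keys, threshold=2)
```

A (1, 1) role is the degenerate case: one key, one signature, which is why it offers the least protection if that key is kept online.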



- at the target delegation level, PyPI supports the registration of
new projects through the web service (see
http://docs.python.org/2/distutils/packageindex.html). If my
understanding of target delegation is correct, this means the simple
and packages/source/letter delegations will need to be (1, 1) and
online.
- higher levels of the target delegation hierarchy could conceivably
be kept offline, but there seems little value in doing so if they're
trusting an online (1, 1) key


Fortunately, the targets/simple and 
targets/packages/(version)/(letter)/ roles should not require (1, 1) 
online keys, as their metadata (simply target delegations and no actual 
target files) should also fluctuate fairly rarely. I should make this 
clearer in our design document.



- many PyPI packages are maintained by single developers, so (1, 1) or
(1, n) is likely to be the only generally feasible level of signing at
the project level.


Yes, the package developers themselves could choose any (t, n) they 
like. In our design, we propose that PyPI could eventually delegate to 
stable packages which need little change (and use more security with 
more offline keys) and to unstable packages which need frequent change 
(and use less security with more online keys).



With the current focus being on getting an improvement from the status
quo that we can successfully deploy in a reasonable period of time,
the target delegation side of things probably needs to be
substantially simpler in the initial iteration. Yes, it leaves us open
to certain vulnerabilities we would like to remove in the long run,
but we need to be very cautious in the additional demands we place on
the users uploading to PyPI. It may even mean the initial iteration
allows projects to rely on a PyPI provided signing key for their TUF
metadata, using the existing upload mechanisms to add the files to
PyPI.


I agree that there is a delicate problem of balancing security with 
usability here, especially in the beginning.


You raised a very good issue there: on first migration, how would PyPI 
accommodate packages which have not had their target files delegated to 
their developers? We imagine that in this case, PyPI could assume 
initial responsibility for these packages, and later PyPI would delegate 
those packages to their respective developers.


Thanks for your input,
Trishank



[Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files

2013-03-13 Thread holger krekel
Hi all,

after some more discussions and hours spent by Carl Meyer (who is now
co-authoring the PEP) and me, here is a new V3 pre-submit draft.
It is now more ambitious than the previous draft, as should be obvious
from the modified abstract (and Carl Meyer's and Philip's earlier
interactions on this list).  There are also more details of how
the current link-scraping works, among other improvements and incorporations
of feedback from discussions here.

We intend to submit this draft tonight to the PEP editors.  

Feedback now and later remains welcome.  I am sure there are issues to 
be sorted and clarified, among them the versioning-API suggestion by 
Marc-Andre.

Thanks for everybody's support and feedback so far,
holger


PEP: XXX
Title: Transitioning to release-file hosting on PyPI
Version: $Revision$
Last-Modified: $Date$
Author: Holger Krekel hol...@merlinux.eu, Carl Meyer c...@oddbird.net
Discussions-To: catalog-sig@python.org
Status: Draft (PRE-submit V3)
Type: Process
Content-Type: text/x-rst
Created: 10-Mar-2013
Post-History:


Abstract
========

This PEP proposes a backward-compatible two-phase transition process to speed
up, simplify and robustify installing from the pypi.python.org (PyPI)
package index.  To ease the transition and minimize client-side
friction, **no changes to distutils or existing installation tools are
required in order to benefit from the transition phases, which is to
result in faster, more reliable installs for most existing packages**.

The first transition phase implements easy and explicit means for
a package maintainer to control which release file links are
served to present-day installation tools.  The first phase also
includes the implementation of analysis tools for present-day packages,
to support communication with package maintainers and the automated
setting of default modes for controlling release file links.

The second transition phase will result in the current PyPI index
serving only PyPI-hosted files by default.  Externally hosted files
will still be automatically discoverable through a second index.
Present-day installation tools will be able to continue working
by specifying this second index.  New versions of installation
tools shall default to only installing packages from PyPI unless
the user explicitly wishes to include non-PyPI sites.



Rationale
=========

.. _history:

History and motivations for external hosting
--------------------------------------------

When PyPI went online, it offered release registration but had no
facility to host release files itself.  When hosting was added, no
automated downloading tool existed yet.  When Philip Eby implemented
automated downloading (through setuptools), he made the choice to
allow people to use download hosts of their choice.  The finding of
externally-hosted packages was implemented as follows:

#. The PyPI ``simple/`` index for a package contains all links found
   anywhere in that package's metadata for any release. Links in the
   Download-URL and Home-page metadata fields are given
   ``rel=download`` and ``rel=homepage`` attributes, respectively.

#. Any of these links whose target is a file whose name appears to be
   in the form of an installable source or binary distribution, with
   basename in the form packagename-version.ARCHIVEEXT, is considered 
   a potential installation candidate.

#. Similarly, any links suffixed with an #egg=packagename-version
   fragment are considered an installation candidate.

#. Additionally, the ``rel=homepage`` and ``rel=download`` links are
   followed and, if HTML, are themselves scraped for release-file links
   in the above formats.
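
The candidate rules above can be sketched roughly as follows. This is only an illustration of rules 2 and 3, not the code setuptools actually uses: `is_candidate` is a hypothetical helper, and the extension list is an assumption, since the draft only says "ARCHIVEEXT".

```python
import posixpath
from urllib.parse import urlsplit

# Assumed set of archive extensions; the draft does not enumerate them.
ARCHIVE_EXTS = (".tar.gz", ".tar.bz2", ".tgz", ".zip", ".egg")


def is_candidate(url, package):
    """Rough check: does this link look like an installable release file?"""
    parts = urlsplit(url)
    prefix = package.lower() + "-"
    # Rule 3: an #egg=packagename-version fragment marks a candidate.
    if parts.fragment.lower().startswith("egg=" + prefix):
        return True
    # Rule 2: basename of the form packagename-version.ARCHIVEEXT.
    basename = posixpath.basename(parts.path).lower()
    return basename.startswith(prefix) and basename.endswith(ARCHIVE_EXTS)


links = [
    "http://example.com/dist/foo-1.0.tar.gz",
    "http://example.com/hg/foo/archive/tip.zip#egg=foo-dev",
    "http://example.com/foo/index.html",
]
candidates = [link for link in links if is_candidate(link, "foo")]
```

Rule 4 (following ``rel=homepage`` and ``rel=download`` links and scraping the resulting HTML) would apply the same check to every link found on those pages, which is where the extra network requests come from.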

Today, most packages released on PyPI host their release files on
PyPI, but a small percentage (XXX need updated data) rely on external
hosting.

There are many reasons [2]_ why people have chosen external
hosting. To cite just a few:

- release processes and scripts have been developed already and upload
  to external sites

- it takes too long to upload large files from some places in the
  world

- export restrictions e.g. for crypto-related software

- company policies which require offering open source packages
  through own sites

- problems with integrating uploading to PYPI into one's release
  process (because of release policies)

- desiring download statistics different from those maintained by PyPI

- perceived bad reliability of PYPI

- not aware that PyPI offers file-hosting

Irrespective of the present-day validity of these reasons, there
clearly is a history why people choose to host files externally and it
even was for some time the only way you could do things.


Problem
-------

**Today, python package installers (pip, easy_install, buildout, and
others) often need to query many non-PyPI URLs even if there are no
externally hosted files**.  Apart from querying pypi.python.org's
simple index pages, also all homepages and download pages ever
specified with any release of a package are crawled by an 

Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files

2013-03-13 Thread PJ Eby
On Wed, Mar 13, 2013 at 7:21 AM, holger krekel hol...@merlinux.eu wrote:
 Hi all,

 after some more discussions and hours spent by Carl Meyer (who is now
 co-authoring the PEP) and me, here is a new V3 pre-submit draft.
 It is now more ambitious than the previous draft, as should be obvious
 from the modified abstract (and Carl Meyer's and Philip's earlier
 interactions on this list).  There also are more details of how
 the current link-scraping works among other improvements and incorporations
 of feedback from discussions here.

 We intend to submit this draft tonight to the PEP editors.

 Feedback now and later remains welcome.  I am sure there are issues to
 be sorted and clarified, among them the versioning-API suggestion by
 Marc-Andre.

 Thanks for everybody's support and feedback so far,
 holger

Looks good to me!

Setuptools' two releases will probably look like this:

1. Default to externals index, warn when fetching URLs that are not
the same host as the index
2. Default to externals index, reject URLs that are not the same host
as the index unless --allow-hosts is configured  (IOW, default
allow-hosts to equal index-url host)

That way, external URLs can still be discovered by the user, but the
default configuration is still secure.
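
A minimal sketch of the second step, assuming glob-style host patterns as accepted by easy_install's --allow-hosts option. The function name `is_allowed` and the example URLs are hypothetical; this is not setuptools' implementation.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def is_allowed(url, index_url, allow_hosts=None):
    """Decide whether a download URL may be fetched.

    With no explicit allow_hosts, only the host of the index itself is
    accepted, i.e. allow-hosts defaults to the index-url host.
    """
    host = urlsplit(url).hostname or ""
    if allow_hosts is None:
        allow_hosts = [urlsplit(index_url).hostname or ""]
    # Each entry is treated as a glob pattern such as "*.example.com".
    return any(fnmatchcase(host, pattern) for pattern in allow_hosts)


INDEX = "https://pypi.python.org/simple/"
same_host = is_allowed("https://pypi.python.org/packages/foo-1.0.tar.gz", INDEX)
external = is_allowed("http://example.com/foo-1.0.tar.gz", INDEX)
opted_in = is_allowed("http://dl.example.com/foo-1.0.tar.gz", INDEX,
                      allow_hosts=["pypi.python.org", "*.example.com"])
```

The first release described above would log a warning where this sketch returns False; the second would refuse the download.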


Re: [Catalog-sig] pre-PEP: transition to release-file hosting at pypi site

2013-03-13 Thread Tres Seaver

On 03/12/2013 03:57 PM, holger krekel wrote:
 Nobody should be led to think that PYPI is a trusted or reviewed
 source of software even if we got rid of external hosting completely.

Amen.  I still boggle at the amount of "sky is falling" stuff here over
MITM / external links / whatever, given the potential damage from
explicitly malicious uploads (trojans, viruses, whatever).  Package
signing might help here, but only for consumers who are willing to think hard
enough about the problem to manage a web of trust (frankly, a vanishingly
small minority).

And then there are these problems:

- Backward-incompatible releases (even those which make appropriate
  signals in their version numbers).

- Removal of distributions / releases / projects.

- Re-upload of new distributions which silently replace previous
  distributions *of the same release* (Yes, Virginia, there are
  people out there who do this).

which are deal-killers for the folks who want always-on, reliable,
repeatable, automatic installation from PyPI (instead of creating their
own indexes).

Adding HTTPS or removing external links does nothing to mitigate those
issues.


Tres.
-- 
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   "Excellence by Design"   http://palladion.com


Re: [Catalog-sig] pre-PEP: transition to release-file hosting at pypi site

2013-03-13 Thread Donald Stufft

On Mar 13, 2013, at 12:54 PM, Tres Seaver tsea...@palladion.com wrote:

 On 03/12/2013 03:57 PM, holger krekel wrote:
  Nobody should be led to think that PYPI is a trusted or reviewed
  source of software even if we got rid of external hosting completely.
 
 Amen.  I still boggle at the amount of "sky is falling" stuff here over
 MITM / external links / whatever, given the potential damage from
 explicitly malicious uploads (trojans, viruses, whatever).  Package
 signing might help here, but only for consumers who are willing to think hard
 enough about the problem to manage a web of trust (frankly, a vanishingly
 small minority).

Really now? Let's see: I can easily protect against malicious uploads by only 
installing from trusted authors. I cannot easily prevent a MITM or a 
compromised external host if the tools don't protect me against it. Without the 
tooling and infrastructure moving to close this gap, the only way to do it is to 
not use that tooling or infrastructure at all. Namely, even if the author of the 
package is myself, I cannot be secure installing it using the current toolchain 
and infrastructure unless I bend over backwards to make sure that no 
installable link appears anywhere in my long description, and I don't have a 
homepage, and I don't have a download URL.

 
 And then there are these problems:
 
 - Backward-incompatible releases (even those which make appropriate
   signals in their version numbers).
 
 - Removal of distributions / releases / projects.
 
 - Re-upload of new distributions which silently replace previous
   distributions *of the same release* (Yes, Virginia, there are
   people out there who do this).
 
 which are deal-killers for the folks who want always-on, reliable,
 repeatable, automatic installation from PyPI (instead of creating their
 own indexes).
 
 Adding HTTPS or removing external links does nothing to mitigate those
 issues.

Yes there are other problems, so let's just throw our hands in the air and say 
fuck it instead of iteratively working to secure the system.

 
 
 Tres.
 - -- 
 ===
 Tres Seaver  +1 540-429-0999  tsea...@palladion.com
 Palladion Software   Excellence by Designhttp://palladion.com
 
 

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA





Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files

2013-03-13 Thread Donald Stufft
On Mar 13, 2013, at 10:26 AM, PJ Eby p...@telecommunity.com wrote:

 On Wed, Mar 13, 2013 at 7:21 AM, holger krekel hol...@merlinux.eu wrote:
 Hi all,
 
 after some more discussions and hours spent by Carl Meyer (who is now
 co-authoring the PEP) and me, here is a new V3 pre-submit draft.
 It is now more ambitious than the previous draft, as should be obvious
 from the modified abstract (and Carl Meyer's and Philip's earlier
 interactions on this list).  There also are more details of how
 the current link-scraping works among other improvements and incorporations
 of feedback from discussions here.
 
 We intend to submit this draft tonight to the PEP editors.
 
 Feedback now and later remains welcome.  I am sure there are issues to
 be sorted and clarified, among them the versioning-API suggestion by
 Marc-Andre.
 
 Thanks for everybody's support and feedback so far,
 holger
 
 Looks good to me!
 
 Setuptools' two releases will probably look like this:
 
 1. Default to externals index, warn when fetching URLs that are not
 the same host as the index
 2. Default to externals index, reject URLs that are not the same host
 as the index unless --allow-hosts is configured  (IOW, default
 allow-hosts to equal index-url host)
 
 That way, external URLs can still be discovered by the user, but the
 default configuration is still secure.


For the record, I support the PEP and these two steps sound OK to me.

My only suggestion is an additional rel attribute for indexes to indicate that 
a file is index-hosted, in case the index domain and the package host domain 
differ (as is the case with Crate).
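
To make the suggestion concrete, here is a small sketch of how a client might pick out index-hosted links by a rel attribute instead of comparing hostnames. The attribute value "internal", the helper names, and the sample page are all assumptions for illustration; nothing in this thread fixes the actual name.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, rel values) pairs from the <a> tags of an index page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attrs = dict(attrs)
            # rel may hold several space-separated values.
            rels = (attrs.get("rel") or "").split()
            self.links.append((attrs.get("href"), rels))


def index_hosted(html, marker="internal"):
    """Return hrefs the index has marked as hosted by itself."""
    collector = LinkCollector()
    collector.feed(html)
    return [href for href, rels in collector.links if marker in rels]


# Hypothetical page: files live on a different domain from the index.
PAGE = """
<a rel="internal" href="https://files.example.org/foo-1.0.tar.gz">foo-1.0</a>
<a rel="homepage" href="http://foo.example.com/">home</a>
<a href="http://mirror.example.net/foo-0.9.tar.gz">foo-0.9</a>
"""
trusted = index_hosted(PAGE)
```

With such a marker, the "same host as the index" test would no longer exclude an index that serves its files from a separate domain.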

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA





Re: [Catalog-sig] pre-PEP: transition to release-file hosting at pypi site

2013-03-13 Thread Tres Seaver

On 03/13/2013 01:06 PM, Donald Stufft wrote:
 Really now? Let's see: I can easily protect against malicious uploads
 by only installing from trusted authors

How do you know who to trust?  What if an author you trust adds a
dependency on a package by an author you have no knowledge of, or one
you actively distrust?  What if an author you trust commits one of the
other changes I outlined (removes a release / distribution, makes
backward-incompatible changes, re-uploads a changed distribution over an
existing one)?

The only way to implement "only install from trusted authors" is to run
your own index, and explicitly review / curate the package set maintained
there.   In that scenario, you run a script from time to time which looks
for new versions of your packages on PyPI and puts them into a queue for
review.

Bob, a casual reviewer, might install the new version from PyPI into a
fresh virtualenv and test it there before pushing it into the curated index.

Carol, more paranoid^Wsecurity-minded, downloads the package, verifies its
signature, unpacks the tarball, diffs it against the curated version,
compares that diff against the changelog, looks at new / changed
dependencies, and installs it into a hardened sandbox for testing.  Only
after that kind of review does she push the newly-reviewed distribution
into the curated index.

Adding an entirely new package to the curated index is a similar process,
but requires more effort from either Bob or Carol.
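
The periodic script described above could be as simple as the following sketch. It assumes the list of upstream versions has already been fetched somehow; `review_queue` and the sample data are hypothetical.

```python
def review_queue(curated, upstream):
    """Return sorted (package, version) pairs that still need review.

    curated  -- {package: set of versions already in the curated index}
    upstream -- {package: iterable of versions currently on PyPI}
    """
    queue = []
    # Only packages already in the curated index are tracked; adding an
    # entirely new package is a separate, manual decision.
    for package, reviewed in curated.items():
        for version in upstream.get(package, ()):
            if version not in reviewed:
                queue.append((package, version))
    return sorted(queue)


curated = {"foo": {"1.0", "1.1"}, "bar": {"0.3"}}
upstream = {"foo": ["1.0", "1.1", "1.2"], "bar": ["0.3"], "baz": ["2.0"]}
pending = review_queue(curated, upstream)
```

Here only foo 1.2 lands in the queue; baz is ignored because nobody has reviewed it into the curated set yet.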


Tres.
-- 
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   "Excellence by Design"   http://palladion.com


Re: [Catalog-sig] pre-PEP: transition to release-file hosting at pypi site

2013-03-13 Thread Donald Stufft

On Mar 13, 2013, at 1:21 PM, Tres Seaver tsea...@palladion.com wrote:

 On 03/13/2013 01:06 PM, Donald Stufft wrote:
  Really now? Let's see: I can easily protect against malicious uploads
  by only installing from trusted authors
 
 How do you know who to trust?  What if an author you trust adds a
 dependency on a package by an author you have no knowledge of, or one
 you actively distrust?  What if an author you trust commits one of the
 other changes I outlined (removes a release / distribution, makes
 backward-incompatible changes, re-uploads a changed distribution over an
 existing one?)
 
 The only way to implement only install from trusted authors is to run
 your own index, and explicitly review / curate the package set maintained
 there.   In that scenario, you run a script from time to time which looks
 for new versions of your packages on PyPI and puts them into a queue for
 review.
 
 Bob, a casual reviewer, might install the new version from PyPI into a
 fresh virtualenv and test it there before pushing it into the curated index.
 
 Carol, more paranoid^Wsecurity-minded, downloads the package, verifies its
 signature, unpacks the tarball, diffs it against the curated version,
 compares that diff against the changelog, looks at new / changed
 dependencies, and installs it into a hardened sandbox for testing.  Only
 after that kind of review does she push the newly-reviewed distribution
 into the curated index.
 
 Adding an entirely new package to the curated index is a similar process,
 but requires more effort from either Bob or Carol.
 
 
 Tres.
 - -- 
 ===
 Tres Seaver  +1 540-429-0999  tsea...@palladion.com
 Palladion Software   Excellence by Designhttp://palladion.com
 
 


Threat models are a thing. The way it *should* work in PyPI is: you ask for 
X, you get X, and it was not modified in transit (and ideally not on the 
repository either, but that is more difficult). PyPI is not and will never be a 
curated index. However, if I trust Author A, then I implicitly trust his actions. 
I trust that he won't cause the issues you listed.
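
The "you ask for X, you get X" property can be illustrated with a digest check like the sketch below. It assumes the expected SHA-256 digest reached the client over a channel the attacker does not control, which is exactly the part that HTTPS or signed metadata would have to provide; `matches_digest` is a made-up helper, not part of pip or PyPI.

```python
import hashlib
import hmac


def matches_digest(data, expected_sha256):
    """Return True if the downloaded bytes hash to the expected digest."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_sha256.lower())


# Stand-in for a release file and the digest published for it.
release = b"contents of foo-1.0.tar.gz"
published = hashlib.sha256(release).hexdigest()

intact = matches_digest(release, published)
tampered = matches_digest(release + b" plus a backdoor", published)
```

A check like this says nothing about whether the author is trustworthy; it only covers modification between the author's upload and the user's install.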

Now, is a curated index *more secure*? Well, again, it depends on what your threat 
model is. PyPI isn't going to protect you from a malicious or incompetent 
author. For the threat model that PyPI is able to deliver on, your system is no 
more or less secure. In fact, without the sort of things you dismiss here, your 
proposal is also just as insecure unless you only ever access it on a protected 
network which you can be sure no attacker has gained access to.

Even your 3 issues are far less concerning than the fact that a MITM on either PyPI 
(fixed now with pip 1.3) or an external URL allows a random guy at PyCon to 
execute arbitrary code on your machine if you install a package from PyPI at 
PyCon, or at a coffee shop, or on any wifi ever that could have someone else on 
it.

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA





Re: [Catalog-sig] pre-PEP: transition to release-file hosting at pypi site

2013-03-13 Thread Robert Collins
On 14 March 2013 05:54, Tres Seaver tsea...@palladion.com wrote:

 On 03/12/2013 03:57 PM, holger krekel wrote:
 Nobody should be led to think that PYPI is a trusted or reviewed
 source of software even if we got rid of external hosting completely.

 Amen.  I still boggle at the amount of "sky is falling" stuff here over
 MITM / external links / whatever, given the potential damage from
 explicitly malicious uploads (trojans, viruses, whatever).  Package
 signing might help here, but only for consumers who are willing to think hard
 enough about the problem to manage a web of trust (frankly, a vanishingly
 small minority).

Well, yes: solving the HTTPS and external-link problems is necessary,
but not sufficient, to make 'pypi secure' - but that doesn't
mean we should do a poor job of solving them.

-Rob
-- 
Robert Collins rbtcoll...@hp.com
Distinguished Technologist
HP Cloud Services


Re: [Catalog-sig] A modest proposal for securing PyPI with TUF

2013-03-13 Thread Daniel Holth
On Wed, Mar 13, 2013 at 5:13 AM, Trishank Karthik Kuppusamy
t...@students.poly.edu wrote:
 Hello Nick,


 On 3/13/13 4:09 AM, Nick Coghlan wrote:


 - the PSF board generally stays out of the technical details of
 running the python.org infrastructure, so it's likely that any root
 keys would be handled by the PSF infrastructure committee. A (2, 4) or
 (3, 5) trust configuration would likely be manageable at this level.


 Understood. We think a higher (t, n) [where t out of n signatures are needed
 to trust the metadata for a role] is better for the root role simply because
 its crucial metadata (the authorized keys for top-level roles) should change
 very rarely.


 - at the target delegation level, PyPI supports the registration of
 new projects through the web service (see
 http://docs.python.org/2/distutils/packageindex.html). If my
 understanding of target delegation is correct, this means the simple
 and packages/source/letter delegations will need to be (1, 1) and
 online.
 - higher levels of the target delegation hierarchy could conceivably
 be kept offline, but there seems little value in doing so if they're
 trusting an online (1, 1) key


 Fortunately, the targets/simple and targets/packages/(version)/(letter)/
 roles should not require (1, 1) online keys, as their metadata (simply
 target delegations and no actual target files) should also fluctuate fairly
 rarely. I should make this clearer in our design document.


 - many PyPI packages are maintained by single developers, so (1, 1) or
 (1, n) is likely to be the only generally feasible level of signing at
 the project level.


 Yes, the package developers themselves could choose any (t, n) they like. In
 our design, we propose that PyPI could eventually delegate to stable
 packages which need little change (and use more security with more offline
 keys) and to unstable packages which need frequent change (and use less
 security with more online keys).


 With the current focus being on getting an improvement from the status
 quo that we can successfully deploy in a reasonable period of time,
 the target delegation side of things probably needs to be
 substantially simpler in the initial iteration. Yes, it leaves us open
 to certain vulnerabilities we would like to remove in the long run,
 but we need to be very cautious in the additional demands we place on
 the users uploading to PyPI. It may even mean the initial iteration
 allows projects to rely on a PyPI provided signing key for their TUF
 metadata, using the existing upload mechanisms to add the files to
 PyPI.


 I agree that there is a delicate problem of balancing security with
 usability here, especially in the beginning.

 You raised a very good issue there: on first migration, how would PyPI
 accommodate packages which have not had their target files delegated to
 their developers? We imagine that in this case, PyPI could assume initial
 responsibility for these packages, and later PyPI would delegate those
 packages to their respective developers.

 Thanks for your input,
 Trishank

With all the different kinds of metadata, it's interesting to note
that currently TUF seems to only be concerned with the available file
names and their integrity. (Some of us will think of PEP 426
PKG-INFO first when we hear the word "metadata".)

It looks like the D metadata lists all the filenames for Django, and
then Django lists them again with hashes and signatures. Why all the
lists? Does every Django release re-assert all the versions of Django
that are available on the index?

How might I deal with producing the official source distribution
myself and having a friend produce the official Windows build of a
package?

As an aside, PyPI has been doubling in size every 1.5 to 2 years.

Thanks

Daniel Holth
___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] A modest proposal for securing PyPI with TUF

2013-03-13 Thread Justin Cappos
We may have something unclear in the doc.   We definitely don't just worry
about package names.

(In between meetings, will send a longer response in a bit.)

Thanks,
Justin




Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files

2013-03-13 Thread M.-A. Lemburg
On 13.03.2013 12:21, holger krekel wrote:
 Hi all,
 
 after some more discussions and hours spent by Carl Meyer (who is now
 co-authoring the PEP) and me, here is a new V3 pre-submit draft.  
 It is now more ambitious than the previous draft as should be obvious
 from the modified abstract (and Carl Meyer's and Philip's earlier
 interactions on this list).  There also are more details of how
 the current link-scraping works among other improvements and incorporations
 of feedback from discussions here.
 
 We intend to submit this draft tonight to the PEP editors.  
 
 Feedback now and later remains welcome.  I am sure there are issues to 
 be sorted and clarified, among them the versioning-API suggestion by 
 Marc-Andre.
 
 Thanks for everybody's support and feedback so far,
 holger
 
 
 PEP: XXX
 Title: Transitioning to release-file hosting on PyPI
 Version: $Revision$
 Last-Modified: $Date$
 Author: Holger Krekel hol...@merlinux.eu, Carl Meyer c...@oddbird.net
 Discussions-To: catalog-sig@python.org
 Status: Draft (PRE-submit V3)
 Type: Process
 Content-Type: text/x-rst
 Created: 10-Mar-2013
 Post-History:
 
 
 Abstract
 
 
 This PEP proposes a backward-compatible two-phase transition process to speed
 up, simplify and robustify installing from the pypi.python.org (PyPI)
 package index.  To ease the transition and minimize client-side
 friction, **no changes to distutils or existing installation tools are
 required in order to benefit from the transition phases, which is to
 result in faster, more reliable installs for most existing packages**.
 
 The first transition phase implements easy and explicit means for
 a package maintainer to control which release file links are 
 served to present-day installation tools.  The first phase also
 includes the implementation of analysis tools for present-day packages,
 to support communication with package maintainers and the automated
 setting of default modes for controlling release file links.   
 
 The second transition phase will result in the current PyPI index 
 serving only PyPI-hosted files by default.  Externally hosted files
 will still be automatically discoverable through a second index. 
 Present-day installation tools will be able to continue working
 by specifying this second index.  New versions of installation
 tools shall default to only install packages from PyPI unless
 the user explicitly wishes to include non-PyPI sites.

I must say, I don't like this change in motivation compared
to V1 and V2.

The original aim of the discussion was to make PyPI more secure
and the installation process faster and more reliable
by moving away from crawling arbitrary external web pages.

Both can be had by:

* limiting the crawling to specific URLs defined by the package
  author, with added hashes to make sure that the URLs and
  their target content are not modified (this is the "securing
  external downloads" part - see here for an example:
  https://pypi.python.org/pypi/egenix-pyopenssl/0.13.1.1.0.1.5),
  and

* adding a way for the package authors to say "PyPI, please go
  ahead and cache/copy my distribution files" (this is the
  "increase download reliability" part - can be had by doing
  opt-in CDN caching/proxying of external links via PyPI)
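
The first point could be sketched roughly as follows. This is only an illustration, assuming the author-defined link carries the expected digest in its URL fragment in the form `#<algorithm>=<hexdigest>`; the helper names and the example URL are hypothetical and are not taken from any existing tool.

```python
import hashlib
from urllib.parse import urlsplit

# Algorithms accepted in this sketch (an assumption, not a PyPI rule).
ALLOWED_ALGORITHMS = {"md5", "sha1", "sha256", "sha512"}


def expected_digest(url):
    """Split '...file.tar.gz#sha256=abc...' into (algorithm, hexdigest)."""
    fragment = urlsplit(url).fragment
    algorithm, sep, hexdigest = fragment.partition("=")
    if not sep or algorithm not in ALLOWED_ALGORITHMS:
        raise ValueError("link has no usable hash fragment")
    return algorithm, hexdigest.lower()


def content_matches(url, content):
    """Return True if the downloaded bytes match the digest pinned in the link."""
    algorithm, hexdigest = expected_digest(url)
    return hashlib.new(algorithm, content).hexdigest() == hexdigest


# Hypothetical external download link with a pinned digest.
payload = b"example release file"
link = "https://example.com/pkg-1.0.tar.gz#sha256=" + hashlib.sha256(payload).hexdigest()

print(content_matches(link, payload))
print(content_matches(link, payload + b"tampered"))
```

An installer following this idea would refuse any external file whose content does not match the digest the author registered, so a modified download is detected even though the file is hosted elsewhere.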

Now, with V3 of the proposal, you are moving towards a system
that basically says "do it this way, or stay out of our
ecosystem", which, in my book, is not what the Python ecosystem
is all about.

Your V2 was much more inviting in this respect.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Mar 13 2013)
 Python Projects, Consulting and Support ...   http://www.egenix.com/
 mxODBC.Zope/Plone.Database.Adapter ...   http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


: Try our mxODBC.Connect Python Database Interface for free ! ::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/


Re: [Catalog-sig] A modest proposal for securing PyPI with TUF

2013-03-13 Thread Justin Cappos
We use the simple directory and filenames because that is what pip uses.

You have a nice suggestion to include other metadata in the TUF metadata.
We certainly could do this if desirable.   This would require a redesign of the
PyPI API and we weren't sure if this was wanted.   Our current doc /
prototype is trying to minimize the changes needed all around.

Thanks,
Justin




Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files

2013-03-13 Thread Donald Stufft

On Mar 13, 2013, at 2:57 PM, M.-A. Lemburg m...@egenix.com wrote:

 
 Now, with V3 of the proposal, you are moving towards a system
 that basically says "do it this way, or stay out of our
 ecosystem", which, in my book, is not what the Python ecosystem
 is all about.
 

I don't see how? The -with-externals index will still contain all the existing 
links, and indeed PJ Eby has already stated that setuptools will move to 
support this index by default, but with proper warnings to people so they know 
they are installing a package from off-site.

This allows existing tools to be moved to a secure-by-default position. It allows 
future tools to choose whether they want to enable the existing behavior through 
use of -with-externals (hopefully with a warning or opt-in sort of thing like that 
laid out by PJE, but it's certainly not required). And it even allows users of 
existing tools to opt into the old behavior via the -i option.

Maybe I'm missing it, but in what way does this force authors to "do it this way 
or stay out of our ecosystem", since all the same options are available as 
there are today?

 Your V2 was much more inviting in this respect.
 

Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files

2013-03-13 Thread M.-A. Lemburg
On 13.03.2013 20:08, Donald Stufft wrote:
 
 
 Maybe I'm missing it, but in what way does this force authors to "do it this 
 way or stay out of our ecosystem", since all the same options are available 
 as there are today?

The proposal marks all external links as evil, and instead of
making external links more secure, the user is left with the option
to either not enable external links at all, or to let the
devil in :-)

That's not nice. It's also security theater.

The real problem is unreviewed code getting executed by users,
or worse, automated build systems. Yet, we let users believe
that everything is secured on PyPI.

Taking an extreme position, it would probably be better to just
leave everything as it is and instead educate users about the
risk they are taking with a "pip install AngryBirds", signed
with keys issued by the PSF on the official PyPI server,
delivered straight to your drive via the latest in crypto
technology, only to wipe your notebook...

But then, I don't like extreme positions, so I would rather
incrementally improve the situation both from the
server and the client side, both addressing user and author
concerns, and keeping the Python ecosystem a friendly place
to be.

 Your V2 was much more inviting in this respect.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files

2013-03-13 Thread Daniel Holth
On Wed, Mar 13, 2013 at 3:33 PM, M.-A. Lemburg m...@egenix.com wrote:

 That's not nice. It's also security theater.

 The real problem is unreviewed code getting executed by users,
 or worse, automated build systems. Yet, we let users believe
 that everything is secured on PyPI.


Perhaps it would be better to decide whether it is reliability
theater and concentrate on consistency rather than whether the code
actually does what you want. It is nice to have a system that at least
prevents targeted third party bad-package attacks.


Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files

2013-03-13 Thread Donald Stufft

On Mar 13, 2013, at 3:33 PM, M.-A. Lemburg m...@egenix.com wrote:

 
 The proposal marks all external links as evil, and instead of
 making external links more secure, the user is left with the option
 to either not enable external links at all, or to let the
 devil in :-)

It doesn't mark them as evil; it marks them as requiring users to opt into 
them. Authors are free to not publish their packages directly to PyPI, and users 
are free to opt in to installing from the external URLs that the authors have 
chosen to publish. Furthermore, it gives package authors complete control over 
what URLs appear on their simple index page.

ISTM that this is even friendlier than before, because now both sides have 
explicitly decided to use those URLs, instead of it being completely implicit 
on one side and partially implicit on the other.

 
 That's not nice. It's also security theater.

It's not security theater; it moves the defaults to be more secure. Further work 
can (and will) be done to ensure that, for those users and authors who opt into 
the external URLs, it's still secure, while again requiring both sides to 
explicitly opt into it.

 
 The real problem is unreviewed code getting executed by users,
 or worse, automated build systems. Yet, we let users believe
 that everything is secured on PyPI.

We? I don't think anyone's ever said that *everything is secured on PyPI*. 
The best the PyPI infrastructure and tooling can do (security-wise) is to try 
to make as sure as possible that when you ask for foo==X.Y, you get foo==X.Y. 
PyPI currently can't make that claim for external links.

On top of that, many users (and I'd wager most users) are not aware that when 
they install something it reaches out to other hosts. This proposal makes 
it so they *are* aware, so they opt into potentially lowering their uptime and 
they opt into exposing details to external hosts (which may or may not be SSL 
secured).

 
 Taking an extreme position, it would probably be better to just
 leave everything as it is and instead educate users about the
 risk they are taking with a "pip install AngryBirds", signed
 with keys issued by the PSF on the official PyPI server,
 delivered straight to your drive via the latest in crypto
 technology, only to wipe your notebook...
 
 But then, I don't like extreme positions, so I would rather
 incrementally improve the situation both from the
 server and the client side, both addressing user and author
 concerns, and keeping the Python ecosystem a friendly place
 to be.
 
 Your V2 was much more inviting in this respect.

This gives _all_ the abilities of the current system (besides spidering random 
URLs) with *more* control given to the authors as to what exists on their 
various index pages. This is a net win for everyone involved. The only loss 
is that people trying to install projects that choose to host externally to 
PyPI will be told to explicitly allow it (as mentioned by PJ Eby).

 

Re: [Catalog-sig] A modest proposal for securing PyPI with TUF

2013-03-13 Thread Trishank Karthik Kuppusamy

On 03/13/2013 02:15 PM, Daniel Holth wrote:


With all the different kinds of metadata, it's interesting to note
that currently TUF seems to only be concerned with the available file
names and their integrity. (Some of us will think of PEP 426
PKG-INFO first when we hear the word metadata.)


Yes, you are right that the many different kinds of metadata in this 
discussion (TUF metadata, PyPI metadata) make things a little confusing 
sometimes! :))


My understanding of PEP 426 is that the distribution metadata is 
specified by the developer with the setup.py script.


To take the running Django example, since the Django developers will 
sign everything under the Django role with their own keys that the D 
role will talk about, setup.py, as well as the generated PKG-INFO, 
will be signed by the Django developers. This means that pip + TUF will 
be able to verify these distribution metadata indirectly via the source 
distribution package.
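
As a rough illustration of the indirect verification described above, the sketch below checks a downloaded source distribution against the length and hash recorded in targets metadata that has already been verified. The entry layout (length, hashes), the function name, and the sample bytes are assumptions made for illustration, not the actual pip + TUF code; checking the signatures on the metadata itself is assumed to have happened earlier.

```python
import hashlib
import hmac


def matches_target_entry(data, entry):
    """Return True if the bytes match the length and SHA-256 in a targets entry."""
    if len(data) != entry["length"]:
        return False
    digest = hashlib.sha256(data).hexdigest()
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(digest, entry["hashes"]["sha256"])


# Hypothetical stand-in for a downloaded sdist and its signed metadata entry.
sdist = b"pretend this is Django-1.5.tar.gz"
entry = {
    "length": len(sdist),
    "hashes": {"sha256": hashlib.sha256(sdist).hexdigest()},
}
```

Because setup.py and PKG-INFO live inside the archive, a match on the archive as a whole would also cover them.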


Does this answer your question?


It looks like the D metadata lists all the filenames for Django, and
then Django lists them again with hashes and signatures. Why all the
lists? Does every Django release re-assert all the versions of Django
that are available on the index?


Good observation. For D, you are talking about the paths attribute here:

https://updateframework.com/pypi/repository/metadata/targets/packages/source/D.txt

For Django, you are talking about the targets attribute here:

https://updateframework.com/pypi/repository/metadata/targets/packages/source/D/Django.txt

Why is paths in D listing all the targets that Django already talks 
about? Presently, this is because our target delegation tool 
(signercli.py) is being paranoid and making sure that D is explicitly 
delegating only targets matching these paths.


However, the TUF specification allows for D to simply say, "I delegate 
any target whatsoever under Django," by setting paths to 
"packages/source/D/Django/**":


https://www.updateframework.com/browser/specs/tuf-spec.txt#L525
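
To make the difference between the two styles of delegation concrete, here is a small sketch that tests whether a target path is covered by a list of delegated paths. It uses Python's fnmatch as a stand-in for the pattern matching; the exact glob semantics in the TUF specification may differ, and the file names below are hypothetical.

```python
from fnmatch import fnmatchcase


def is_delegated(target_path, delegated_paths):
    """Return True if any delegated path or pattern covers the target."""
    return any(fnmatchcase(target_path, pattern) for pattern in delegated_paths)


# Explicit style: every delegated target is listed by name.
explicit_paths = [
    "packages/source/D/Django/Django-1.4.tar.gz",
    "packages/source/D/Django/Django-1.5.tar.gz",
]

# Wildcard style: anything under the Django directory is delegated.
wildcard_paths = ["packages/source/D/Django/**"]

new_release = "packages/source/D/Django/Django-1.6.tar.gz"
```

With the explicit list, a new release is not covered until D is re-signed with the new path; with the wildcard, only the Django role needs to change.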


How might I deal with producing the official source distribution
myself and having a friend produce the official Windows build of a
package?


There are a few solutions. You could have your friend produce the 
official Windows build for a package, and then you could sign it, 
implicitly trusting your friend but not publishing that trust.


A more secure solution would have you delegate that target to your friend.
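
A sketch of what that delegation might look like from the client's point of view: the owner's role hands off Windows installers to a separate role held by the friend, and the client works out whose signature to expect for each file. The role names, paths, and table layout are all hypothetical and only meant to illustrate the idea.

```python
from fnmatch import fnmatchcase

# Hypothetical delegations published by the project owner's role.
DELEGATIONS = [
    {"role": "example/windows-builds", "paths": ["packages/*/e/example/*.exe"]},
]


def signing_role(target_path, owner_role="example", delegations=DELEGATIONS):
    """Return the role whose signature a client should expect for the target."""
    for delegation in delegations:
        if any(fnmatchcase(target_path, p) for p in delegation["paths"]):
            return delegation["role"]
    # Anything not delegated stays with the owner.
    return owner_role
```

The trust in the friend is then published in the metadata rather than implied by the owner's signature.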


As an aside PyPI has been doubling in size every 1.5 - 2 years.


Exponential growth strikes again! We have anticipated this, and we have 
a few solutions to curb the growth of TUF metadata. Since TUF metadata 
is simply text, GZIP compression would go a long way. Alternatively, we 
could implement delta updates of TUF metadata.
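
For a feel of the compression option, the sketch below builds some synthetic targets metadata as JSON text and compresses it with gzip. The metadata layout and file names are made up for the example, and the savings on real TUF metadata would depend on its actual content.

```python
import gzip
import hashlib
import json

# Synthetic targets metadata: 200 hypothetical release files with hashes.
targets = {}
for minor in range(200):
    name = "packages/source/e/example/example-1.%d.tar.gz" % minor
    targets[name] = {
        "length": 1000 + minor,
        "hashes": {"sha256": hashlib.sha256(name.encode("utf-8")).hexdigest()},
    }

raw = json.dumps({"signed": {"targets": targets}}, sort_keys=True).encode("utf-8")
compressed = gzip.compress(raw)

print("raw bytes:", len(raw))
print("gzip bytes:", len(compressed))
```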


The more difficult problem is how to ensure that target delegation 
structure scales with PyPI growth. A good design will keep this in mind 
and plan accordingly.


Speaking of which, it may be the case that our design document for 
integrating PyPI with TUF may not be terribly easy to understand. (After 
all, you do need to understand TUF first, but TUF is fairly easy once 
you understand its main ideas.) I plan to publish a friendlier document 
which introduces TUF at a very high level and instead discusses more 
pragmatic issues (such as workflows).




Re: [Catalog-sig] A modest proposal for securing PyPI with TUF

2013-03-13 Thread Justin Cappos
 Speaking of which, it may be the case that our design document for
 integrating PyPI with TUF may not be terribly easy to understand. (After
 all, you do need to understand TUF first, but TUF is fairly easy once you
 understand its main ideas.) I plan to publish a friendlier document which
 introduces TUF at a very high level and instead discusses more pragmatic
 issues (such as workflows).


Feel free to chime in if you'd rather see something else or want us to
focus on clarifying a specific topic.

Thanks,
Justin


Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files

2013-03-13 Thread Carl Meyer
On 03/13/2013 01:33 PM, M.-A. Lemburg wrote:
 The proposal marks all external links as evil, 

I'm sorry the text of the PEP gave you that impression. I can see how
you'd have gotten it from some of the comments here on catalog-sig, but
we went to some lengths to avoid it in the PEP text, and plan to further
revise the text to try harder to avoid that implication.

In the proposed PEP, we are attempting to balance two things that I
believe to be true:

1) There are good and valid reasons for some package owners to prefer
external hosting, and it is good for automated installers to easily be
able to install such packages (on user request).

2) Installing non-PyPI-hosted packages should not be the *default*
behavior of installer tools, for many reasons, among them because that
is unusual and surprising behavior to many newcomers to the Python
ecosystem, and often leads to concerns on their part about the stability
of the ecosystem.

These are the axioms, if you will, of this proposal, and while I'd guess
many people in this discussion are at least slightly uncomfortable with
one or the other of them, I think accepting both is the most likely path
to a compromise everyone can live with.

I think we can find a solution that embraces both these axioms and
maintains good backwards-compatibility and usability. Holger and I had a
long talk this evening about that, and here are some of our thoughts:

A) You mentioned opt-in PyPI caching of externally-hosted files as a
means to improve reliability. We basically agree, but implementing this
on the PyPI side adds complexity to the PyPI implementation that we are
hesitant to propose. Rather, we propose that this is better handled by a
client-side tool that you point at a PyPI release with externally-hosted
files, and it simply copies those release files onto PyPI. This has
essentially the same effect. We envision this being a simple enough tool
that it could reasonably be run for every release of a project in an
ongoing way, not just as a one-time project-wide migration. We plan to
change the line in the PEP that says the existence of this tool is NOT
REQUIRED to begin the phase 2 transition to instead say that the
existence of this tool IS REQUIRED before the phase 2 transition begins.
(Holger already has a partial implementation of this tool.)

B) We also plan to change the PEP to say even more strongly that
installer tools should provide an easy option for installing
externally-hosted projects, and that our definition of easy includes
the ability for an installer to automatically tell a user what options
they can use to install a specific externally-hosted package that the
tool is refusing to install by default.

C) To make that latter part of (B) easier, we also propose that the
basic simple index include a link with a distinct rel attribute that
points to the -with-externals index page for that project, only for a
package that has external links. This way even tools using the
no-externals index by default can notify users of the existence of
external links for a project when they try to install it.

There's also another possible change, a bit more significant, that we
discussed that I'd be curious to hear your thoughts on. The initial
motivation for separating external links from the main simple/ index was
twofold: 1) Allow future tools to distinguish between internal and
external links without every tool needing to implement host-comparison
algorithms (which may break indexes that host internal files on a
CDN), and 2) Allow today's installers, without upgrade, to automatically
migrate eventually to no-external-installs-by-default.
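
To illustrate the host-comparison problem mentioned in point 1, here is a minimal sketch of the kind of check a tool would otherwise need. The function name and the CDN hostname are hypothetical.

```python
from urllib.parse import urlparse


def is_internal_by_host(index_url, file_url):
    """Classify a link as internal when it shares the index's hostname."""
    return urlparse(index_url).hostname == urlparse(file_url).hostname


INDEX = "https://pypi.python.org/simple/example/"
same_host = "https://pypi.python.org/packages/source/e/example/example-1.0.tar.gz"
# A file the index itself serves, but from a CDN hostname.
cdn_host = "https://cdn.example.net/packages/source/e/example/example-1.0.tar.gz"
```

A file served by the index from a CDN would be misclassified as external by this check, which is the breakage described above.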

Some things have caused us to re-evaluate these points:

- PyPI can automatically tag internal/external links in the simple index
with rel=internal and rel=external, which gives future tools a more
reliable marker than host-comparison. So this takes care of #1.

- It may be that giving up #2 is acceptable in the interest of better
backward-compatibility. Old tools will still gain most of the benefits
of this PEP due to the eventual elimination of automatic link-scraping
(both from metadata and external pages) and the move to explicit
submission of external links, only for those projects that want them.
And old tools will not be able to provide a useful error message to
users trying to install an externally-hosted package that is no longer
listed in the main simple/ index, which is a bad usability breakage.

Given that, we are thinking of perhaps simplifying the PEP to eliminate
the separate -with-externals index, and list external links in the main
simple/ index, clearly marked with rel=external. The PEP would still
recommend that future installer tools not follow rel=external links
without specific user authorization. Old tools still get many of the
benefits, without the breakage.
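
As a sketch of how a future installer might honor such markers, the code below parses a simple index page and only returns rel=external links when the user has explicitly allowed them. The page content, class name, and allow_external flag are assumptions for illustration; this is not how any existing tool is implemented.

```python
from html.parser import HTMLParser


class SimpleIndexLinks(HTMLParser):
    """Collect links from a simple index page, split by their rel marker."""

    def __init__(self):
        super().__init__()
        self.internal = []
        self.external = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        rel = (attributes.get("rel") or "").split()
        if href is None:
            return
        if "external" in rel:
            self.external.append(href)
        elif "internal" in rel:
            self.internal.append(href)
        # Links without either marker are ignored in this sketch.


def installable_links(page_html, allow_external=False):
    """Return the links an installer may follow under the given policy."""
    parser = SimpleIndexLinks()
    parser.feed(page_html)
    parser.close()
    return parser.internal + (parser.external if allow_external else [])


# Hypothetical simple index page for a project with one external link.
PAGE = """<html><body>
<a rel="internal" href="../../packages/source/e/example/example-1.0.tar.gz">example-1.0.tar.gz</a>
<a rel="external" href="http://downloads.example.org/example-1.1.tar.gz">example-1.1.tar.gz</a>
</body></html>"""
```

Since the external links are still visible to the parser, a tool could also tell the user that they exist and which option enables them.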

 and instead of
 making external links more secure, the user is left with the option
 to either not enable external links at all, or to let the
 devil in 

Re: [Catalog-sig] A modest proposal for securing PyPI with TUF

2013-03-13 Thread Daniel Holth
On Wed, Mar 13, 2013 at 8:11 PM, Trishank Karthik Kuppusamy
t...@students.poly.edu wrote:
 On 03/13/2013 02:15 PM, Daniel Holth wrote:


 With all the different kinds of metadata, it's interesting to note
 that currently TUF seems to only be concerned with the available file
 names and their integrity. (Some of us will think of PEP 426
 PKG-INFO first when we hear the word metadata.)


 Yes, you are right that the many different kinds of metadata in this
 discussion (TUF metadata, PyPI metadata) make things a little confusing
 sometimes! :))

 My understanding of PEP 426 is that the distribution metadata is specified
 by the developer with the setup.py script.

 To take the running Django example, since the Django developers will sign
 everything under the Django role with their own keys that the D role will
 talk about, setup.py, as well as the generated PKG-INFO, will be signed by
 the Django developers. This means that pip + TUF will be able to verify
 these distribution metadata indirectly via the source distribution package.

 Does this answer your question?

Thanks, yes. The individual .tar.gz distributions do contain PKG-INFO
but we would eventually like to expose it in a more efficient way.
Then to be suitably paranoid you would also have to check that it
matched the package you downloaded! :(
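
A sketch of that paranoid check, assuming PKG-INFO were exposed separately from the archive: read PKG-INFO out of the downloaded .tar.gz and compare it with the separately served copy. The function names and the demo archive are hypothetical.

```python
import io
import tarfile


def read_pkg_info(sdist_bytes):
    """Return the text of the top-level PKG-INFO inside a .tar.gz sdist."""
    with tarfile.open(fileobj=io.BytesIO(sdist_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():
            parts = member.name.split("/")
            # Expect a layout like name-version/PKG-INFO.
            if len(parts) == 2 and parts[1] == "PKG-INFO":
                return archive.extractfile(member).read().decode("utf-8")
    raise ValueError("no PKG-INFO in archive")


def metadata_matches(sdist_bytes, exposed_pkg_info):
    """Compare the archive's PKG-INFO with the separately exposed copy."""
    return read_pkg_info(sdist_bytes) == exposed_pkg_info


def make_demo_sdist(pkg_info_text):
    """Build a tiny in-memory sdist, only for trying out the check."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        data = pkg_info_text.encode("utf-8")
        info = tarfile.TarInfo("demo-1.0/PKG-INFO")
        info.size = len(data)
        archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()
```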

Also note that on http://crate.io the simple index works the same way
as on pypi, except that the actual packages are on a different (CDN)
host.

Thanks,

Daniel




[Catalog-sig] ResponseNotReady error while trying to do fresh sync

2013-03-13 Thread Qijiang Fan
Hello,
I'm maintaining e.pypi.python.org (with Aron Xu).
We ran into some issues with our network-attached storage, so we decided to
do a fresh sync of PyPI.
While doing that, we got an httplib.ResponseNotReady exception,
similar to the one reported in this mail:
http://mail.python.org/pipermail/catalog-sig/2013-February/005224.html

For now, we have skipped all packages that hit this issue and finished the sync,
so some files will be missing.

The three packages which cause that exception are listed below:
https://pypi.python.org/simple/iterator/
https://pypi.python.org/simple/nester_test_ling/
https://pypi.python.org/simple/nesterswe/
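
The skip-and-continue workaround might be sketched roughly as follows. This uses the Python 3 name http.client.ResponseNotReady for the httplib exception, and the fetch callable is a hypothetical stand-in for whatever the mirroring client actually does per package.

```python
import http.client


def sync_packages(names, fetch):
    """Fetch every package, collecting those that raise ResponseNotReady."""
    skipped = []
    for name in names:
        try:
            fetch(name)
        except http.client.ResponseNotReady:
            # Remember the package so it can be retried once the issue is fixed.
            skipped.append(name)
    return skipped


def fake_fetch(name):
    """Stand-in download that fails for the three packages listed above."""
    if name in {"iterator", "nester_test_ling", "nesterswe"}:
        raise http.client.ResponseNotReady()
```

Keeping the list of skipped names makes it straightforward to re-run the sync for just those packages later.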

Please notify us when this is fixed, so that we can update the mirror and
complete the sync.

Best Regards,
Qijiang Fan


Re: [Catalog-sig] A modest proposal for securing PyPI with TUF

2013-03-13 Thread Trishank Karthik Kuppusamy

On 3/13/13 9:19 PM, Daniel Holth wrote:


Thanks, yes. The individual .tar.gz distributions do contain PKG-INFO
but we would eventually like to expose it in a more efficient way.
Then to be suitably paranoid you would also have to check that it
matched the package you downloaded! :(


Great, glad we could help. Well, at least the paranoid would just need 
an extra download :))



Also note that on http://crate.io the simple index works the same way
as on pypi, except that the actual packages are on a different (CDN)
host.


Got it. I'll take a look at crate.io to see how it works. Conceivably, 
the TUF metadata and the PyPI files could live in separate locations 
altogether and we would just have to check that the TUF metadata matches 
the PyPI files.

