Re: [Catalog-sig] Deprecate External Links

2013-03-05 Thread Donald Stufft
On Tuesday, March 5, 2013 at 4:01 AM, Donald Stufft wrote:
 On Thursday, February 28, 2013 at 8:35 AM, Donald Stufft wrote:
   
  
  https://crate.io/externally-hosted/ A list of things that have no files
  hosted on PyPI but have a release. This doesn't include things that upload
  sometimes but not every time (argparse, for example: its latest releases
  have not been uploaded to PyPI).
  
 
 Sorted out a better way of seeing what would be affected by this change.
 
 Here is a list of all versions that are currently installable via pip that
 are not hosted on PyPI (and thus would be uninstallable if all external
 links would be removed). This filters out projects that never existed
 or are no longer installable due to issues with the external hosting.
 
 I've also included the script I used to generate it.
 
 https://gist.github.com/dstufft/5088915 
Here are some numbers derived from that data.

928 projects w/ 2750 total versions have versions not installable
directly from PyPI.

721 projects w/ 2543 total versions have versions not installable
directly from PyPI if we don't consider the `dev` version.

This change would affect 2-3% of the projects on PyPI, and
just from scanning down the list, some of these appear to be
merely a forgotten upload rather than a conscious choice not
to host their packages on PyPI (for example, Django has only
1 version not installable directly from PyPI).
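
As a rough cross-check of the figures above (a sketch only, not part of the script in the gist): the project and version counts drop by the same amount once `dev` is ignored, and the stated 2-3% share implies a range for the total number of projects. The total is not given in this thread, so the range is only what the stated percentages imply, assuming the 2-3% refers to the 928 figure.

```python
# Figures quoted above.
projects_all, versions_all = 928, 2750        # counting `dev` versions
projects_no_dev, versions_no_dev = 721, 2543  # ignoring `dev` versions

# Projects and versions that drop out once `dev` is ignored.
dev_only_projects = projects_all - projects_no_dev
dev_versions = versions_all - versions_no_dev

# Assumption: "2-3% of the projects" refers to the 928 figure, so the
# implied size of the whole index is 928 / 0.03 up to 928 / 0.02.
implied_total_low = round(projects_all / 0.03)
implied_total_high = round(projects_all / 0.02)

print(dev_only_projects, dev_versions)
print(implied_total_low, implied_total_high)
```

Both differences come out equal, which is consistent with those projects having `dev` as their only version not hosted on PyPI.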
___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-03-01 Thread holger krekel
On Fri, Mar 01, 2013 at 10:02 +0100, Reinout van Rees wrote:
 On 28-02-13 21:08, holger krekel wrote:
 I have seen that position in this discussion (I have to upload 120
 files per release, so I won't do that, for instance).
 
 haven't seen that.
 
 Marc-Andre Lemburg said this, which I took to mean 120 uploads per release:
 
 
 However, taking our egenix-mx-base package as example, we have
 120 distribution files for every single release. Uploading those
 to PyPI would not only take long, but also ...
 

Ah ok, thanks.  Didn't interpret Marc-Andre's post as claiming that
downloads/homepage crawling is a good idea, though.  Just that there
have been reasons not to upload things which need to be addressed,
especially the need for enough storage space.

best,
holger

 
 
 Reinout
 
 -- 
 Reinout van Rees http://reinout.vanrees.org/
 rein...@vanrees.org http://www.nelen-schuurmans.nl/
 If you're not sure what to do, make something. -- Paul Graham
 


Re: [Catalog-sig] Deprecate External Links

2013-03-01 Thread M.-A. Lemburg
On 01.03.2013 10:02, Reinout van Rees wrote:
 On 28-02-13 21:08, holger krekel wrote:
 I have seen that position in this discussion (I have to upload 120
 files per release, so I won't do that, for instance).
 
 haven't seen that.
 
 Marc-Andre Lemburg said this, which I took to mean 120 uploads per release:
 
 
 However, taking our egenix-mx-base package as example, we have
 120 distribution files for every single release. Uploading those
 to PyPI would not only take long, but also ...
 

Correct, with a total of over 100MB per release. However, the above
quote is slightly incorrect: I did not say I won't do that, just
that there are issues with doing this:

* It currently takes too long uploading that many files to
  PyPI. This causes a problem, since in order to start the upload,
  we have to register the release on PyPI, which tools will then
  immediately find. However, during the upload time, they won't
  necessarily find the right files to download and then fail.

  The proposed pull mechanism (see
  http://wiki.python.org/moin/PyPI/DownloadMetaDataProposal)
  would work around this problem: tools would simply go to
  our servers in case they can't find the files on PyPI.

* PyPI doesn't allow us to upload two egg files with the same
  name: we have to provide egg files for UCS2 Python builds and
  UCS4 Python builds, since easy_install/setuptools/pip don't
  differentiate between the two variants. This is the main
  reason why we're hosting our own PyPI-style indexes, one for
  UCS2 and the other for UCS4 builds:
  https://downloads.egenix.com/python/index/ucs2/
  https://downloads.egenix.com/python/index/ucs4/

* I'm not sure whether we want to import our crypto packages
  to the US, so for a subset of the files, we'd probably
  continue to use our servers in Germany.

  Again, with the above proposal, this shouldn't be a problem.

* The PyPI terms are a bummer for us, but this can be fixed,
  I guess.

If we can resolve the issues, we'd have no problem having the
files mirrored on PyPI.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Mar 01 2013)
 Python Projects, Consulting and Support ...   http://www.egenix.com/
 mxODBC.Zope/Plone.Database.Adapter ...   http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...   http://python.egenix.com/


: Try our mxODBC.Connect Python Database Interface for free ! ::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/


Re: [Catalog-sig] Deprecate External Links

2013-03-01 Thread holger krekel
On Fri, Mar 01, 2013 at 10:24 +0100, M.-A. Lemburg wrote:
 On 01.03.2013 10:02, Reinout van Rees wrote:
  On 28-02-13 21:08, holger krekel wrote:
  I have seen that position in this discussion (I have to upload 120
  files per release, so I won't do that, for instance).
  
  haven't seen that.
  
  Marc-Andre Lemburg said this, which I took to mean 120 uploads per release:
  
  
  However, taking our egenix-mx-base package as example, we have
  120 distribution files for every single release. Uploading those
  to PyPI would not only take long, but also ...
  
 
 Correct, with a total of over 100MB per release. However, the above
 quote is slightly incorrect: I did not say I won't do that, just
 that there are issues with doing this:
 
 * It currently takes too long uploading that many files to
   PyPI. This causes a problem, since in order to start the upload,
   we have to register the release on PyPI, which tools will then
   immediately find. However, during the upload time, they won't
   necessarily find the right files to download and then fail.

You can actually skip the register and directly upload, it will
create release metadata on the fly.  Not sure if it's complete
but you can then do a register to update it if needed.

best,
holger

   The proposed pull mechanism (see
   http://wiki.python.org/moin/PyPI/DownloadMetaDataProposal)
   would work around this problem: tools would simply go to
   our servers in case they can't find the files on PyPI.
 
 * PyPI doesn't allow us to upload two egg files with the same
   name: we have to provide egg files for UCS2 Python builds and
   UCS4 Python builds, since easy_install/setuptools/pip don't
   differentiate between the two variants. This is the main
   reason why we're hosting our own PyPI-style indexes, one for
   UCS2 and the other for UCS4 builds:
   https://downloads.egenix.com/python/index/ucs2/
   https://downloads.egenix.com/python/index/ucs4/
 
 * I'm not sure whether we want to import our crypto packages
   to the US, so for a subset of the files, we'd probably
   continue to use our servers in Germany.
 
   Again, with the above proposal, this shouldn't be a problem.
 
 * The PyPI terms are a bummer for us, but this can be fixed,
   I guess.
 
 If we can resolve the issues, we'd have no problem having the
 files mirrored on PyPI.
 


Re: [Catalog-sig] Deprecate External Links

2013-03-01 Thread Jesse Noller
Marc Andre: I'm cc'ing Van: can you explain why the pypi terms are a bummer so 
we can see if there is actually an issue to be resolved or a matter of taste?

We need to protect the foundation while preserving author rights - but I don't 
want one user / subset dictating how we evolve the technology.

Jesse 

On Mar 1, 2013, at 4:24 AM, M.-A. Lemburg m...@egenix.com wrote:

 On 01.03.2013 10:02, Reinout van Rees wrote:
 On 28-02-13 21:08, holger krekel wrote:
 I have seen that position in this discussion (I have to upload 120
 files per release, so I won't do that, for instance).
 
 haven't seen that.
 
 Marc-Andre Lemburg said this, which I took to mean 120 uploads per release:
 
 
 However, taking our egenix-mx-base package as example, we have
 120 distribution files for every single release. Uploading those
 to PyPI would not only take long, but also ...
 
 
 Correct, with a total of over 100MB per release. However, the above
 quote is slightly incorrect: I did not say I won't do that, just
 that there are issues with doing this:
 
 * It currently takes too long uploading that many files to
  PyPI. This causes a problem, since in order to start the upload,
  we have to register the release on PyPI, which tools will then
  immediately find. However, during the upload time, they won't
  necessarily find the right files to download and then fail.
 
  The proposed pull mechanism (see
  http://wiki.python.org/moin/PyPI/DownloadMetaDataProposal)
  would work around this problem: tools would simply go to
  our servers in case they can't find the files on PyPI.
 
 * PyPI doesn't allow us to upload two egg files with the same
  name: we have to provide egg files for UCS2 Python builds and
  UCS4 Python builds, since easy_install/setuptools/pip don't
  differentiate between the two variants. This is the main
  reason why we're hosting our own PyPI-style indexes, one for
  UCS2 and the other for UCS4 builds:
  https://downloads.egenix.com/python/index/ucs2/
  https://downloads.egenix.com/python/index/ucs4/
 
 * I'm not sure whether we want to import our crypto packages
  to the US, so for a subset of the files, we'd probably
  continue to use our servers in Germany.
 
  Again, with the above proposal, this shouldn't be a problem.
 
 * The PyPI terms are a bummer for us, but this can be fixed,
  I guess.
 
 If we can resolve the issues, we'd have no problem having the
 files mirrored on PyPI.
 


Re: [Catalog-sig] Deprecate External Links

2013-03-01 Thread PJ Eby
On Fri, Mar 1, 2013 at 4:24 AM, M.-A. Lemburg m...@egenix.com wrote:
 On 01.03.2013 10:02, Reinout van Rees wrote:
 On 28-02-13 21:08, holger krekel wrote:
 I have seen that position in this discussion (I have to upload 120
 files per release, so I won't do that, for instance).

 haven't seen that.

 Marc-Andre Lemburg said this, which I took to mean 120 uploads per release:

 
 However, taking our egenix-mx-base package as example, we have
 120 distribution files for every single release. Uploading those
 to PyPI would not only take long, but also ...
 

 Correct, with a total of over 100MB per release. However, the above
 quote is slightly incorrect: I did not say I won't do that, just
 that there are issues with doing this:

 * It currently takes too long uploading that many files to
   PyPI. This causes a problem, since in order to start the upload,
   we have to register the release on PyPI, which tools will then
   immediately find. However, during the upload time, they won't
   necessarily find the right files to download and then fail.

Actually, easy_install doesn't pay any attention to what releases are
registered.  It just looks for primary and secondary links.  If there
are links for a version that it can use, it uses it.  If it does not
find links for a version, then that version does not exist, as far as
it is concerned.  So registering without files is not a problem.


   The proposed pull mechanism (see
   http://wiki.python.org/moin/PyPI/DownloadMetaDataProposal)
   would work around this problem: tools would simply go to
   our servers in case they can't find the files on PyPI.

That proposal is unnecessary, actually.  You could *right now* simply
place binary download links (with optional #md5= verification)
in your package's description field, and the moment you registered the
package, existing tools would find those links and download them from
your site.  You could then remove your home page and download URLs
from the relevant fields, and place them also in the description.
(easy_install does not follow non-download links within the
description -- i.e., links that don't end in .egg, .tgz, etc. and
don't have an #egg tag.)
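
The link rule described above can be sketched roughly as follows. This is an illustrative simplification, not easy_install's actual code: the function name `classify_link`, the suffix list, and the example URLs in the usage note are assumptions.

```python
from urllib.parse import urlparse, parse_qs

# Assumption: a short, illustrative list; the real tools know more suffixes.
ARCHIVE_SUFFIXES = (".egg", ".tgz", ".tar.gz", ".zip")


def classify_link(url):
    """Return (is_download_link, md5_or_None) for a link found in a description."""
    parts = urlparse(url)
    # A fragment such as "md5=abc" parses to {"md5": ["abc"]}.
    fragment = parse_qs(parts.fragment)
    md5 = fragment.get("md5", [None])[0]
    # A link counts as a download if it ends in an archive suffix
    # or carries an #egg=... tag.
    is_download = (
        parts.path.lower().endswith(ARCHIVE_SUFFIXES) or "egg" in fragment
    )
    return is_download, md5
```

Under these assumptions, a link like `https://downloads.example.com/foo-1.0.tar.gz#md5=0123abcd` would be treated as a download with a checksum to verify, while a plain home page link such as `https://example.com/project/` would not be followed.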


 * PyPI doesn't allow us to upload two egg files with the same
   name: we have to provide egg files for UCS2 Python builds and
   UCS4 Python builds, since easy_install/setuptools/pip don't
   differentiate between the two variants.

They can if it's part of the platform string; the catch is that right
now it's not.  We'd have to go through an upgrade cycle of the tools
to support that.  I need to take a look at what PEP 427 is doing (and
you should too, if you haven't already) to get this part sorted out.
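
For illustration, here is a minimal sketch of how a build could tell the two variants apart if the Unicode width were made part of the platform string. The helper names and the `-ucs2`/`-ucs4` suffix are hypothetical; no existing tool is claimed to use this scheme.

```python
import sys


def unicode_width_tag():
    """Return "ucs2" for narrow Unicode builds and "ucs4" otherwise."""
    # Narrow (UCS2) builds of Python before 3.3 report sys.maxunicode == 0xFFFF;
    # wide builds, and every build since Python 3.3, report 0x10FFFF.
    return "ucs2" if sys.maxunicode == 0xFFFF else "ucs4"


def egg_platform(base_platform):
    # Hypothetical naming scheme: append the width to the platform string.
    return "{}-{}".format(base_platform, unicode_width_tag())


print(egg_platform("linux-x86_64"))
```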


Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread Nick Coghlan
On Thu, Feb 28, 2013 at 5:01 PM, Donald Stufft donald.stu...@gmail.com wrote:
 I'm glad the next set of Metadata won't have external links, however
 even if it showed up tomorrow it's going to be a long time until
 people are completely migrated to it. Furthermore you estimate
 months but the first phase will have positive benefits right away, namely
 that it will prompt people to start uploading their packages, thereby
 increasing the security and reliability of the current system. And finally,
 while I'm glad to see forward movement, it's been said before not to bother
 making a fix to the existing system because X was going to happen
 soon; in the past it was distutils2/packaging, now it's PEP426/packaging.
 While I have every hope and I believe it will happen this time, the
 past has made me worry about holding off on good incremental
 improvements to the current infra.

Pissing off the maintainers of packages that currently rely on
external hosting by telling them they have to change their release
processes if they want to keep releasing software on PyPI and have
their users actually be able to download it is *not* a good idea,
especially when we're about to ask them to upgrade their build chains
for other reasons (including both security and reliability).

Working on the installation tools and getting them to complain about
external links is a *fine* idea. It doesn't break anything, but
maintainers will start fielding questions from their users asking
"Hey, why am I getting this warning when I install your project?".
Working on the upload tools and having them *warn* distributors that
self-hosting is problematic is also a good idea, as is exploring PJE's
suggestions about refining the set of URLs that PyPI currently
publishes.

However, getting PyPI to effectively *break uploads* of projects that
rely on external links at this point in time is *not* a good idea: we
should NOT mess with people's existing build and upload processes
lightly, as any such changes burn up a *lot* of community good will,
and that's not something in great supply when it comes to Python's
software distribution infrastructure. All current generation
infrastructure should continue to work without modification on both
the upload side *and* the download side (although, as noted above,
it's highly desirable for both the upload side and the download side
to be updated to warn users about any reliance on insecure legacy
behaviour).

I expect a similar rollout in the transition to the next generation
metadata format and distribution infrastructure - initially download
tools will support both formats, emitting a warning when falling back
to the legacy distribution infrastructure, then they will start
requiring an option to enable fallback to legacy mode, and eventually
there will be released installation tools that don't support the
legacy distribution infrastructure at all (such as any default
installer included in the standard library).
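
The staged rollout described above can be sketched as a small policy function. The phase names and messages are hypothetical and only mirror the three steps in the paragraph; this is not taken from any installer.

```python
from enum import Enum


class Phase(Enum):
    WARN = 1      # fall back to legacy infrastructure, but warn
    OPT_IN = 2    # fall back only when the user asks for it
    REMOVED = 3   # no legacy support at all


def use_legacy_fallback(phase, user_opted_in=False):
    """Return (allowed, message) for a fallback to the legacy infrastructure."""
    if phase is Phase.WARN:
        return True, "warning: falling back to legacy distribution infrastructure"
    if phase is Phase.OPT_IN:
        if user_opted_in:
            return True, "warning: legacy fallback enabled by option"
        return False, "error: legacy fallback requires an explicit option"
    return False, "error: legacy distribution infrastructure is not supported"
```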

For *next* generation infrastructure, it's our job as system designers
to sell it to potential users (primarily everyone writing software
which they publish on PyPI, or at least the authors of the toolchains
used for that publication, but also to consumers of that software).
The distutils2 team failed, in large part because their proposal
required radical changes to the way people published Python software.
I have deliberately moderated that in the way I have approached PEP
426 - if people can't generate the new metadata with only minor
changes to their current processes, it isn't going to fly, and any
trade-offs (such as the loss of external hosting support), need to be
bought with corresponding benefits (such as guaranteed correct
pre-release handling, solid Python version declarations, a clean
post-install hook design, and, hopefully, a vastly improved rich
metadata publication system for PyPI, probably based on TUF).

Regards,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread Nick Coghlan
On Thu, Feb 28, 2013 at 6:12 PM, M.-A. Lemburg m...@egenix.com wrote:
 On 28.02.2013 07:39, Nick Coghlan wrote:
 1. The next generation metadata infrastructure will NOT support
 external hosting of files indexed on PyPI - if you don't upload the
 archive files to PyPI, they won't be included in the next generation
 metadata. If you want external hosting, you will need to run a
 separate index (this is similar to the yum model - you can host files
 wherever you want, but you need to run yum createrepo yourself to
 generate the metadata, and instruct users on how to get their
 installers to retrieve your metadata. The big difference between PyPI
 and the yum model is that the default index still won't be curated at
 all, so there's no review process to get through if you want to use
 it, thus less need for external hosting).

 Could you elaborate on this ?

 AFAIK, the metadata only works on package names, regardless of where
 an installer finds them.

Caveat: this is NOT a final design, and people that aren't me will be
working out the exact details. It is, however, how I want it to work.

The next generation metadata publication infrastructure is likely to
be based on TUF, and thus will consist of pregenerated, signed
metadata served as static files. Installers will just download
metadata files, sdists and wheels (and probably eggs and tarballs),
and never need to contact an active web service. The only active web
service technically required will be one to regularly refresh the
signed timestamp file that prevents certain kinds of attacks based on
providing old, insecure versions of software (a cron job running on
the server hosting the metadata would suffice for this task). PyPI
itself will have another active service to automatically regenerate
the metadata when files are uploaded by maintainers. The delegation of
trust within the framework will be defined only for files hosted by
PyPI - it will not be extended to allow the declaration of external
URLs as a source for the target files.
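
To make the role of the signed timestamp file concrete, here is a minimal sketch of the kind of check a client could run. It is not TUF's API or file format: the field names are invented for illustration, and HMAC with a shared key stands in for the public-key signatures a real design would use, purely to stay within the standard library.

```python
import hashlib
import hmac
import json
import time

# Stand-in only: a real design would use public-key signatures; HMAC with a
# shared key is used here purely to keep the sketch within the standard library.
DEMO_KEY = b"demo-key"


def sign_timestamp(snapshot_version, expires_at, key=DEMO_KEY):
    """What the periodic job (e.g. cron) would regenerate."""
    payload = json.dumps(
        {"snapshot_version": snapshot_version, "expires": expires_at},
        sort_keys=True,
    )
    sig = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"signed": payload, "sig": sig}


def check_timestamp(doc, last_seen_version, now=None, key=DEMO_KEY):
    """Reject forged, expired, or rolled-back timestamp files."""
    now = time.time() if now is None else now
    expected = hmac.new(
        key, doc["signed"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, doc["sig"]):
        return "bad signature"
    signed = json.loads(doc["signed"])
    if signed["expires"] < now:
        return "expired"    # stale metadata is being replayed
    if signed["snapshot_version"] < last_seen_version:
        return "rollback"   # older metadata than the client already saw
    return "ok"
```

Because the file expires, someone serving an old copy of the metadata can only do so until the expiry passes, which is why the file has to be refreshed regularly.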

Publishers will still be able to publish on external sites, but they
will need to generate their own metadata, and distributions published
that way won't be indexed in the next generation metadata on PyPI.
This is the same way yum repos work - the metadata for each repo only
covers SRPMs and RPMs hosted in that repo. If you want to download
software from somewhere else, you have to add another repo definition
in the client so it knows where to look for the metadata. APT works in
a similar fashion.
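
A minimal sketch of that repo-style model, with made-up index URLs and file names: each index describes only the files it hosts, and the client finds a project only in the indexes it has been configured with.

```python
# Hypothetical index contents: each index lists only files it hosts itself.
INDEXES = {
    "https://index.example/": {"somepkg": ["somepkg-1.0.tar.gz"]},
    "https://downloads.vendor.example/": {"vendor-pkg": ["vendor_pkg-3.2.zip"]},
}


def find_files(project, configured_indexes):
    """Look a project up only in the indexes the client was told about."""
    found = []
    for index_url in configured_indexes:
        for filename in INDEXES.get(index_url, {}).get(project, []):
            found.append(index_url + filename)
    return found


# With only the default index configured, the vendor's package is invisible.
print(find_files("vendor-pkg", ["https://index.example/"]))
# After adding the vendor's index, it is found there.
print(find_files("vendor-pkg", list(INDEXES)))
```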

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread holger krekel
On Wed, Feb 27, 2013 at 22:04 +0100, Lennart Regebro wrote:
 On Wed, Feb 27, 2013 at 8:49 PM, Monty Taylor mord...@inaugust.com wrote:
  But wouldn't this only be a change in pip/easy_install, not PyPI
  itself? I suppose you could explicitly break the external links by
  having them point to nothing if you are worried about the security or
  if it's some performance issue (that would indeed be a bad
  compatibility break, in case people are using those for other
  purposes).  Otherwise, if it's a problem, then just use the old
  version of pip.
 
  If we don't remove the feature from pypi itself
 
 It isn't a feature of PyPI. PyPI doesn't require you to upload the
 files to PyPI. For that reason, easy_install and pip will scrape
 external sites to be able to download the files.
 
 What we should do is agree that this should stop, add a deprecation
 warning to pip and easy_install, and after some pre-determined time
 remove the feature from easy_install and pip.

I suggest to *change defaults* rather than to remove the feature for
the foreseeable future.  Changing defaults is a powerful way to communicate
and one that doesn't leave people totally stranded who are far removed from
discussions and rationales here.

  folks for whom its a problem, because there will be no incentive for the
  folks hosting their software that way to actually upload their stuff to
  PyPI
 
 Yes there will be: Everyone mailing them to tell them their software
 is broken and can't be installed with easy_install and pip. That's
 going to be very annoying very fast. ;-)

I've mailed several maintainers in the last half year of 1K downloaded
projects to inquire about status, and not received replies.  I wanted
to base work on their projects and of course i refrained from doing that
because of the lack of replies.  To me that means you can have users
mailing maintainers or screaming at maintainers or saying bad words
about maintainers or projects all you want but that doesn't mean it's
going to be fixed.   

To summarize, having pip/easy_install report red warnings and requiring
users to pass a --htmlscrape=PROJ1,PROJ2 option or so is a good way to
communicate; removing the ability is not, at this point.
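
A rough sketch of what such a changed default could look like. The `--htmlscrape` option exists only as the suggestion above; the program name and helper functions are hypothetical, and neither pip nor easy_install is claimed to implement this.

```python
import argparse


def parse_args(argv):
    parser = argparse.ArgumentParser(prog="installer-sketch")
    parser.add_argument("projects", nargs="+")
    # Hypothetical option, named after the suggestion above.
    parser.add_argument(
        "--htmlscrape",
        default="",
        help="comma-separated projects for which external pages may be scraped",
    )
    args = parser.parse_args(argv)
    allowed = {
        name.strip().lower()
        for name in args.htmlscrape.split(",")
        if name.strip()
    }
    return args.projects, allowed


def may_scrape(project, allowed):
    """External scraping is off by default and enabled per project."""
    return project.lower() in allowed
```

With `--htmlscrape=PROJ1,PROJ2 proj1 proj3`, only `proj1` would be allowed to fall back to scraped external links; `proj3` would get the warning and no scraping.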

best,
holger


Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread holger krekel
On Thu, Feb 28, 2013 at 09:48 +1100, Richard Jones wrote:
 On 28 February 2013 08:31, PJ Eby p...@telecommunity.com wrote:
  OTOH, I currently make development snapshots of setuptools and other
  projects available by dumping them in a directory that's used as an
  external download URL.  Replacing that would be a PITA because PyPI
  only lets you upload and register new releases from distutils' command
  line.  Basically, I'd need to use a download link that pointed to a
  latest URL that redirected to the final download.
 
 Yup, and the down-side of distutils as the tool for talking to PyPI
 is, of course, the horrendous turn-around time trying to add features
 or fix bugs.
 
 I've advocated us having the upload/register/whatever functionality in
 a separate tool for a while, but that doesn't seem to have gained any
 traction. Of course issues around the complexity introduced by
 setup.py make it much harder.

FWIW three days ago i presented at Pycon Russia a unifying cmdline 
workflow meta tool which configures and invokes setup.py
[...]/pip/easy_install commands.  I intend to publish it soon and 
will also send a link once the video becomes available.

IOW, i fully agree we need to move away from putting things into 
setup.py/distutils, start going for PEP426 etc. -- but WITHOUT breaking 
things for all the packaging upload/installation processes out there.
Therefore a meta tool approach to make it easier for people to
gradually move away from current practises.

cheers,
holger

 In the mean time I think Donald's suggestion for supporting
 development pre-releases is reasonable:
  instead of (please replace with easy_install lingo here)
  `pip install setuptools==setuptools-dev` please `pip install -e
  http://svn.python.org/projects/sandbox/trunk/setuptools/#egg=setuptools-dev`
   ?
 
 
 
 Richard


Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread holger krekel
On Thu, Feb 28, 2013 at 06:38 +0100, Andreas Jung wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 +1 for the proposal
 
 The complete discussion on this topic is once again absurd and bizarre.
 We are discussing the issue with externally hosted packages every year
 and the situation has not improved. Especially people using buildout
 regularly encounter issues with external sites being down - with the
 result that we cannot install or update our installation.
 
 I give a shit at the arguments pulled out every time by package
 maintainers using PyPI only for listing their packages. I am both
 annoyed and bothered by these people.

I didn't see such positions from package maintainers here.  In fact
i haven't seen anyone stepping up saying listing packages externally
is a great idea.  Could you point to those posts?

However, I have seen concerns about breaking many people's and
companies' processes and thus thoughts on how to do a good transition.
I guess you don't want to communicate to package-users the way 
you do above to package maintainers.

best,
holger


Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread Richard Jones
On 28 February 2013 20:09, holger krekel hol...@merlinux.eu wrote:
 On Thu, Feb 28, 2013 at 09:48 +1100, Richard Jones wrote:
 On 28 February 2013 08:31, PJ Eby p...@telecommunity.com wrote:
  OTOH, I currently make development snapshots of setuptools and other
  projects available by dumping them in a directory that's used as an
  external download URL.  Replacing that would be a PITA because PyPI
  only lets you upload and register new releases from distutils' command
  line.  Basically, I'd need to use a download link that pointed to a
  latest URL that redirected to the final download.

 Yup, and the down-side of distutils as the tool for talking to PyPI
 is, of course, the horrendous turn-around time trying to add features
 or fix bugs.

 I've advocated us having the upload/register/whatever functionality in
 a separate tool for a while, but that doesn't seem to have gained any
 traction. Of course issues around the complexity introduced by
 setup.py make it much harder.

 FWIW three days ago i presented at Pycon Russia a unifying cmdline
 workflow meta tool which configures and invokes setup.py
 [...]/pip/easy_install commands.  I intend to publish it soon and
 will also send a link once the video becomes available.

 IOW, i fully agree we need to move away from putting things into
 setup.py/distutils, start going for PEP426 etc. -- but WITHOUT breaking
 things for all the packaging upload/installation processes out there.
 Therefore a meta tool approach to make it easier for people to
 gradually move away from current practises.

Awesome!

For what it's worth I spent some time today trying to dig up some
actual stats on the number of packages with only download_url (roughly
10%) and how popular they are (roughly 90% of those packages were
looked up in the /simple index in the last day.) I'm still poking at
the numbers though.


Richard


Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread Martin v. Löwis

no support for UCS2/UCS4 binary distributions, unsupported
distribution file formats (e.g. our prebuilt format),


Not sure why PyPI would even care what charset the package files use,
but if true that's certainly a bug and we can get that fixed. What
file formats do pip/buildout support that PyPI doesn't support for
uploads?



Basically, this is all about spam/abuse prevention. I don't want people
to upload movie files (whether they be pirated movies or porn files, or
just home video) to abuse PyPI as a general file hosting service, and
I don't see a way to manually redact the content on PyPI.

Regards,
Martin


Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread Donald Stufft
On Thursday, February 28, 2013 at 5:29 AM, M.-A. Lemburg wrote:
 On 27.02.2013 19:21, Donald Stufft wrote:
  On Wednesday, February 27, 2013 at 1:11 PM, M.-A. Lemburg wrote:
   On 27.02.2013 18:37, Donald Stufft wrote:
On Wednesday, February 27, 2013 at 12:10 PM, M.-A. Lemburg wrote:
 Package installers only need access to the static files in
 the /simple/ index. Those can be put behind a CDN to increase
 uptime.
 
 PyPI itself doesn't have to be up and running if you just want
 to download the files (unfortunately, that's not true at the
 moment, because the /simple/ index is dynamically generated,
 but that can be changed).
 
 See http://wiki.python.org/moin/CloudPyPI for details.

I'm aware of that, but that doesn't change the statement. If /simple/
is down you cannot determine the external urls. There is no way
to increase uptime by adding more points of failure. 

   
   
   
   Please reread the proposal. The /simple/ index would
   get hosted on a separate domain which then points to the CDN.
   
  
  
  It. Does. Not. Matter. You are simply moving the SPOF which is
  /simple/, if /simple/ is how you discover the CDN and/or external
  urls then the things it points to can have 100% uptime and if
  /simple/ is down the entire system is down. 
  
 
 
 We appear to be talking about different things :-)
 
 The proposal suggests to put the /simple/ index itself
 on Amazon S3 and then have CloudFront distribute the files
 to the end users.
 
 The PyPI server would only manage pushing the file
 to the S3 buckets. PyPI could go down and Amazon would still
 be serving the files.
 
 See the "Moving static data to a CDN" section of
 http://wiki.python.org/moin/CloudPyPI/Proposal
 
I'm aware of what you're talking about, but Amazon doesn't have 100% uptime.
Moving that there is good for other reasons, but it doesn't magically make
adding multiple single points of failure defy the laws of nature.



Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread Donald Stufft
On Thursday, February 28, 2013 at 7:56 AM, Reinout van Rees wrote:
 On 28-02-13 10:43, holger krekel wrote:
  On Thu, Feb 28, 2013 at 06:38 +0100, Andreas Jung wrote:
   
   I don't give a shit about the arguments pulled out every time by package
   maintainers using PyPI only for listing their packages. I am both
   annoyed and bothered by these people.
   
  
  
  I didn't see such positions from package maintainers here. In fact
  I haven't seen anyone stepping up saying listing packages externally
  is a great idea. Could you point to those posts?
  
 
 
 The position Andreas probably means is projects that *do* advertise 
 themselves on pypi, but don't put their files there.
 
 I have seen that position in this discussion ("I have to upload 120
 files per release, so I won't do that", for instance).
 
 Some arguments might be valid, but these projects *are*, taken as one 
 group, actively breaking pip and buildout regularly.
 
 So I agree with Andreas. I don't really care about the arguments pulled 
 out every time. Effectively actively breaking pip and buildout is bad, 
 period.
 
 
 Reinout
 
 -- 
 Reinout van Rees http://reinout.vanrees.org/
 rein...@vanrees.org http://www.nelen-schuurmans.nl/
 If you're not sure what to do, make something. -- Paul Graham
 
 
 

https://crate.io/externally-hosted/ is a list of projects that have a release
but no files hosted on PyPI. It doesn't include projects that upload sometimes
but not every time (argparse, for example: its latest releases have not been
uploaded to PyPI).


Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread Daniel Holth
On Thu, Feb 28, 2013 at 7:43 AM, Reinout van Rees rein...@vanrees.org wrote:
 On 27-02-13 16:26, Donald Stufft wrote:

2. External links decrease the expected uptime for a particular set
of requirements. PyPI itself has become very stable, however
the same cannot be said for all of the hosts linked that the
toolchain processes. Each new host is an additional SPOF.


 A very good practical illustration: my colleague cannot pip install
 mercurial right now as the mercurial.selenic.com website has been down for
 hours.

 All the download links on http://pypi.python.org/simple/Mercurial/
 point at things like
 http://mercurial.selenic.com/release/mercurial-1.5.tar.gz

 I'm very happy to have a local buildout egg cache, otherwise the mercurial
 website's failure would bring a couple of my buildouts to a grinding halt.


 A couple of those projects that don't bother to put their packages on pypi
 can bring your pip or buildout *down* quite often.



 Reinout

I've been promoting a similar workflow with pip wheel (a proposed
command present in the wheel fork of pip):

pip wheel -w /wheel/directory dependency

pip install --no-index --find-links /wheel/directory dependency

You wind up with cached builds for every package you are using and its
dependencies and only consult the index when you are willing to be
surprised.
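
The two commands above can also be driven from a script. The following is
only a minimal sketch: it assumes a pip that provides the `wheel` subcommand
and that pip can be invoked as `python -m pip`; the helper names are made up
for illustration.

```python
import subprocess
import sys


def wheel_cache_commands(wheel_dir, requirement):
    """Return the two pip invocations from the workflow as argv lists."""
    pip = [sys.executable, "-m", "pip"]
    # Step 1: build wheels for the requirement and its dependencies.
    build = pip + ["wheel", "-w", wheel_dir, requirement]
    # Step 2: install from the local directory only, never from the index.
    install = pip + ["install", "--no-index", "--find-links", wheel_dir, requirement]
    return build, install


def run_workflow(wheel_dir, requirement):
    """Run both steps, stopping at the first command that fails."""
    for argv in wheel_cache_commands(wheel_dir, requirement):
        subprocess.check_call(argv)
```

Only the first step ever talks to an index; the second one keeps working
while an external host is down.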


Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread Ian Cordasco
On Thu, Feb 28, 2013 at 10:30 AM, Lennart Regebro rege...@gmail.com wrote:
 On Thu, Feb 28, 2013 at 10:43 AM, Lennart Regebro rege...@gmail.com wrote:
 On Thu, Feb 28, 2013 at 9:28 AM, Nick Coghlan ncogh...@gmail.com wrote:
 Pissing off the maintainers of packages that currently rely on
 external hosting by telling them they have to change their release
 processes if they want to keep releasing software on PyPI and have
 their users actually be able to download it is *not* a good idea,
 especially when we're about to ask them to upgrade their build chains
 for other reasons (including both security and reliability).

 Who are these people by the way?

 I can answer that question now. I have a list of 2651 emails of people
 listed as maintainers or authors of software that doesn't have
 releases on PyPI.
 This is a very inclusive list, so it lists *all* maintainers and
 authors of *all* versions of a package, if that package has no files
 on PyPI.
 And there are duplicate people, of course, although the emails are unique.

 I've suggested before that we start by sending out emails to these
 people, but I have to admit that the list is *much* longer than I
 thought, and that we might want to limit it to those who actually have
 packages that have been accessed during the last X months or so.

 //Lennart

Looking at some of the packages on Donald's link
(https://crate.io/externally-hosted/), some of the websites are just
plain broken. Those authors should potentially be contacted separately
about completely removing their package from PyPI (assuming they've
stopped development or no longer make the project available).


Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread Doug Hellmann

On Feb 28, 2013, at 3:43 AM, Nick Coghlan wrote:

 On Thu, Feb 28, 2013 at 6:12 PM, M.-A. Lemburg m...@egenix.com wrote:
 On 28.02.2013 07:39, Nick Coghlan wrote:
 1. The next generation metadata infrastructure will NOT support
 external hosting of files indexed on PyPI - if you don't upload the
 archive files to PyPI, they won't be included in the next generation
 metadata. If you want external hosting, you will need to run a
 separate index (this is similar to the yum model - you can host files
 wherever you want, but you need to run yum createrepo yourself to
 generate the metadata, and instruct users on how to get their
 installers to retrieve your metadata. The big difference between PyPI
 and the yum model is that the default index still won't be curated at
 all, so there's no review process to get through if you want to use
 it, thus less need for external hosting).
 
 Could you elaborate on this ?
 
 AFAIK, the metadata only works on package names, regardless of where
 an installer finds them.
 
 Caveat: this is NOT a final design, and people that aren't me will be
 working out the exact details. It is, however, how I want it to work.
 
 The next generation metadata publication infrastructure is likely to
 be based on TUF, and thus will consist of pregenerated, signed
 metadata served as static files. Installers will just download
 metadata files, sdists and wheels (and probably eggs and tar balls),

It sounds like that move will also be a good opportunity to create a reusable 
PyPI client library that the installer front-ends (easy_install, pip, whatever) 
could use, to provide some consistent behavior between the tools. I would like 
to see it *only* work with PyPI to upload, search, and download distributions 
but not create distributions, not find them anywhere else, and not upload them 
anywhere else.

Doug

 and never need to contact an active web service. The only active web
 service technically required will be one to regularly refresh the
 signed timestamp file that prevents certain kinds of attacks based on
 providing old, insecure versions of software (a cron job running on
 the server hosting the metadata would suffice for this task). PyPI
 itself will have another active service to automatically regenerate
 the metadata when files are uploaded by maintainers. The delegation of
 trust within the framework will be defined only for files hosted by
 PyPI - it will not be extended to allow the declaration of external
 URLs as a source for the target files.
 
 Publishers will still be able to publish on external sites, but they
 will need to generate their own metadata, and distributions published
 that way won't be indexed in the next generation metadata on PyPI.
 This is the same way yum repos work - the metadata for each repo only
 covers SRPMs and RPMs hosted in that repo. If you want to download
 software from somewhere else, you have to add another repo definition
 in the client so it knows where to look for the metadata. APT works in
 a similar fashion.
 
 Cheers,
 Nick.
 
 -- 
 Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread holger krekel
On Thu, Feb 28, 2013 at 16:30 +0100, Lennart Regebro wrote:
 On Thu, Feb 28, 2013 at 10:43 AM, Lennart Regebro rege...@gmail.com wrote:
  On Thu, Feb 28, 2013 at 9:28 AM, Nick Coghlan ncogh...@gmail.com wrote:
  Pissing off the maintainers of packages that currently rely on
  external hosting by telling them they have to change their release
  processes if they want to keep releasing software on PyPI and have
  their users actually be able to download it is *not* a good idea,
  especially when we're about to ask them to upgrade their build chains
  for other reasons (including both security and reliability).
 
  Who are these people by the way?
 
 I can answer that question now. I have a list of 2651 emails of people
 listed as maintainers or authors of software that doesn't have
 releases on PyPI.
 This is a very inclusive list, so it lists *all* maintainers and
 authors of *all* versions of a package, if that package has no files
 on PyPI.
 And there are duplicate people, of course, although the emails are unique.

There are also packages which have some (older) release files on pypi
and newer ones outside (e.g. lockfile with 78256 downloads from 
code.google.com).  You didn't include such in your 2651 emails, or did you?

holger


Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread holger krekel
On Thu, Feb 28, 2013 at 13:56 +0100, Reinout van Rees wrote:
 On 28-02-13 10:43, holger krekel wrote:
 On Thu, Feb 28, 2013 at 06:38 +0100, Andreas Jung wrote:
 
 I don't give a shit about the arguments pulled out every time by package
 maintainers using PyPI only for listing their packages. I am both
 annoyed and bothered by these people.
 
 I didn't see such positions from package maintainers here.  In fact
 I haven't seen anyone stepping up saying listing packages externally
 is a great idea.  Could you point to those posts?
 
 The position Andreas probably means is projects that *do* advertise
 themselves on pypi, but don't put their files there.

It has been an accepted practise for 10 years.

 I have seen that position in this discussion ("I have to upload 120
 files per release, so I won't do that", for instance).

haven't seen that.

 Some arguments might be valid, but these projects *are*, taken as
 one group, actively breaking pip and buildout regularly.

yes, and it's annoying, fully agreed.

 So I agree with Andreas. I don't really care about the arguments
 pulled out every time. Effectively actively breaking pip and
 buildout is bad, period.

I consider it a valid concern that taking homepage/download urls away
from pypi's server index is likely to break things for users.  I don't
see the point of doing that if we can have a better migration path by
working on the installers (as is currently ongoing).  Let's please
not have a black-and-white discussion here, and instead try to improve the overall
situation, not just a particular aspect in a particular way.

holger


Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread Donald Stufft
On Thursday, February 28, 2013 at 1:23 PM, PJ Eby wrote:
 On Thu, Feb 28, 2013 at 4:08 AM, Nick Coghlan ncogh...@gmail.com wrote:
  On Thu, Feb 28, 2013 at 7:00 PM, holger krekel hol...@merlinux.eu wrote:
   To summarize, having pip/easy_install report red warnings and requiring
   to pass a --htmlscrape=PROJ1,PROJ2 option or so is a good way to
   communicate, removing the ability is not, at this point.
   
  
  
  +1
  
  I'm a fan of updating the client side tools (both upload and download)
  to complain if files are not hosted on PyPI, and perhaps even
  requiring switches or configuration settings to say "yes, external
  downloads are OK for projects X, Y, and Z".
  
  I'm *not* a fan of changing the way PyPI handles external links,
  except perhaps for some of the suggestions PJE made about cleaning up
  some aspects of what PyPI chooses to publish for old releases.
  
  I'd prefer to leave the "you can't do it any more" step for the next
  generation secure metadata distribution infrastructure (so the
  installation tools will be able to fall back to the legacy
  infrastructure for projects that haven't updated yet).
  
 
 
 Indeed. I'm hoping that the new tools will make the old ones (e.g.
 setuptools) entirely irrelevant, which is why I'm hammering so hard in
 the PEP discussions on some use cases that eggs do well that wheels
 don't. I don't want people to have to keep using setuptools for those
 use cases. (e.g. simple plugin deployment ala Trac) If the new tools
 handle all of the use cases, then setuptools can die a natural death
 sometime in the next decade or so, so I don't have to be responsible
 for it when I turn old and senile. (It's already turned me grey as it
 is.) ;-)
 
 For the short run, I anticipate the following steps in the next
 release of setuptools, which I'm aiming to release before PyCon:
 
 * Default to SSL URL for PyPI
 * Support SSL certificate verification for downloads if the 'requests'
 library is available on sys.path
 * Update docs for easy_install to more clearly and prominently state
 that packages are downloaded from other sources than PyPI unless
 --allow-hosts is used
 * Add an immediate warning to each easy_install invocation (whether
 programmatic or command line) if --allow-hosts is not explicitly set
 to some value in the configuration or command line.
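
The idea behind an --allow-hosts style restriction can be illustrated with a
short sketch. This is not the setuptools implementation; it only assumes the
allow-list is a set of glob patterns matched against the host part of each
URL, and `host_allowed` is a hypothetical helper name.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse


def host_allowed(url, allowed_patterns):
    """Return True if the URL's host matches any glob pattern in the allow-list."""
    # hostname is None for strings that are not absolute URLs.
    host = urlparse(url).hostname or ""
    return any(fnmatch(host, pattern) for pattern in allowed_patterns)


allowed = ["pypi.python.org", "*.python.org"]
print(host_allowed("https://pypi.python.org/simple/Foo/", allowed))
print(host_allowed("http://mercurial.selenic.com/release/mercurial-1.5.tar.gz", allowed))
```

With a restriction like this in place, a download link on an external host
is skipped instead of fetched.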
 
 I'm also considering adding a warning for scraping home page links,
 but at this point in the discussion haven't nailed down how that
 should work. Likewise, I'd like to provide some sort of monkeypatch
 to make register/upload work properly with SSL in older Pythons, but
 I'm not sure I can integrate cert checking there... but at least the
 security will be no worse than using plain distutils. (i.e., it'll
 still be subject to credential theft if someone MITMs PyPI)
 
 

SSL checking on upload should be possible, do you want
a patch? 
 
 Of course, this release will initially be available as a development
 snapshot, i.e., made available through external links. ;-)
 
 Future releases I'm undecided about as yet, but certainly if PyPI
 becomes able to pull and cache externally published releases (upon a
 developer's request), that addresses all of my concerns on the
 developer-burden side, and all of the availability/security concerns
 on the other. Setuptools could move to a default --allow-hosts of
 just PyPI, as soon as that feature is available and being used. (And
 if the licensing issues can be worked out, old packages with external
 links could be pulled to PyPI anyway, and the external links removed.)


Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread Donald Stufft
On Thursday, February 28, 2013 at 6:31 PM, PJ Eby wrote:
 On Thu, Feb 28, 2013 at 5:00 PM, Donald Stufft donald.stu...@gmail.com wrote:
  SSL checking on upload should be possible, do you want
  a patch?
  
 
 
 If it uses the 'requests' library, yes, I'll accept one. But I don't
 want to do any direct implementation of SSL cert checking in
 setuptools, at least in the short run (next few weeks), because:
 
 

Does setuptools support Python3? (or do you want it to?) 
 
 1. I don't consider myself qualified as yet to write a correct patch
 or even verify that a contributed patch is correct/safe, and
 
 

There are existing implementations out there that add cert checking
to urllib, and the code is fairly short.
 
 2. There is a licensing issue with including the Mozilla root
 certificate set in setuptools under its current license, and I'm not
 100% certain I can *change* the license. (I *could* potentially use a
 platform-provided cert set, but that's not really an option on Windows
 unless you have Windows expertise above my paygrade for pulling that
 stuff out of the registry.)
 
 

Shouldn't be any issue, the PSF license is very liberal and the MPL
works on a per file (as opposed to a per project) basis. So if you
include the cert bundle that particular file is MPL licensed while
setuptools itself remains PSF.
 
 So, by delegating to the requests library, I can bypass both of those
 issues in the short term. In the longer term (1 month from now),
 more integrated solutions may be more feasible. Using requests is
 the best I think I can reasonably achieve by PyCon, but I *will* be
 publicizing a set of instructions for how to safely download
 setuptools and requests (via https in a browser to prevent MITM
 attacks), as well as how to configure easy_install for more secure
 default settings. (And easy_install will always use requests if
 present, unless specifically asked not to with a --no-ssl-verify
 option.)
 
 




Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread PJ Eby
On Thu, Feb 28, 2013 at 5:00 PM, Donald Stufft donald.stu...@gmail.com wrote:
 SSL checking on upload should be possible, do you want
 a patch?

If it uses the 'requests' library, yes, I'll accept one.  But I don't
want to do any direct implementation of SSL cert checking in
setuptools, at least in the short run (next few weeks), because:

1. I don't consider myself qualified as yet to write a correct patch
or even verify that a contributed patch is correct/safe, and

2. There is a licensing issue with including the Mozilla root
certificate set in setuptools under its current license, and I'm not
100% certain I can *change* the license.  (I *could* potentially use a
platform-provided cert set, but that's not really an option on Windows
unless you have Windows expertise above my paygrade for pulling that
stuff out of the registry.)

So, by delegating to the requests library, I can bypass both of those
issues in the short term.  In the longer term (1 month from now),
more integrated solutions may be more feasible.  Using requests is
the best I think I can reasonably achieve by PyCon, but I *will* be
publicizing a set of instructions for how to safely download
setuptools and requests (via https in a browser to prevent MITM
attacks), as well as how to configure easy_install for more secure
default settings.  (And easy_install will always use requests if
present, unless specifically asked not to with a --no-ssl-verify
option.)
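
As a rough sketch of the "use requests if present" idea, written against a
current Python rather than the setuptools code of the time: the function
names are hypothetical, and the standard-library fallback relies on
`ssl.create_default_context()`, which loads the platform's trusted CA
certificates and did not exist in the Python versions discussed here.

```python
import importlib.util
import ssl
import urllib.request


def have_requests():
    """Report whether the third-party 'requests' library is importable."""
    return importlib.util.find_spec("requests") is not None


def stdlib_context():
    """Client-side context that verifies certificates and hostnames."""
    return ssl.create_default_context()


def fetch(url, timeout=30):
    """Download url with certificate verification enabled."""
    if have_requests():
        import requests
        # requests verifies certificates by default (verify=True).
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
        return response.content
    # Fallback: the standard library with an explicitly verifying context.
    with urllib.request.urlopen(url, timeout=timeout, context=stdlib_context()) as resp:
        return resp.read()
```

Either path refuses a connection whose certificate does not validate, which
is the property that matters against a MITM.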


Re: [Catalog-sig] Deprecate External Links

2013-02-28 Thread Lennart Regebro
On Thu, Feb 28, 2013 at 8:52 PM, holger krekel hol...@merlinux.eu wrote:
 There are also packages which have some (older) release files on pypi
 and newer ones outside (e.g. lockfile with 78256 downloads from
 code.google.com).  You didn't include such in your 2651 emails, or did you?

No, I didn't, I assumed they would be quite few.
Possibly a better algorithm is to check if the last release has files on PyPI.
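
A minimal sketch of that check, assuming the release data is already
available as a mapping from version string to the list of files uploaded to
PyPI. The sample data is invented, and the version ordering is deliberately
crude (numeric components only).

```python
import re


def version_key(version):
    """Crude sort key: the numeric components of a version string."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def latest_release_hosted(releases):
    """Return True if the newest release has at least one file on PyPI."""
    if not releases:
        return False
    latest = max(releases, key=version_key)
    return bool(releases[latest])


# Invented example: older release uploaded, newest one hosted elsewhere.
sample = {"0.8": ["example-pkg-0.8.tar.gz"], "0.9.1": []}
print(latest_release_hosted(sample))
```

A project like the sample one would be caught by this check even though it
has some files on PyPI.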

//Lennart


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Jesse Noller


On Wednesday, February 27, 2013 at 10:26 AM, Donald Stufft wrote:

 PyPI is now being served with a valid SSL certificate, and the
 tooling has begun to incorporate SSL verification of PyPI into
 the process. This is _excellent_ and the parties involved should
 all be thanked. However there is still another massive area of
 insecurity within the packaging tool chain.
 
 For those who don't know, when you attempt to install a particular
 package a number of urls are visited. The steps look roughly
 something like this:
 
 1. Visit http://pypi.python.org/simple/Package/ and attempt to
 collect any links that look installable (tarballs,
 #egg=, etc).
 Note: /simple/Package/ contains download_url, home_page,
 and any link that is contained in the long_description.
 2. Visit any link referenced as home_page and attempt to
 collect any links that look installable.
 3. Visit any link referenced in dependency_links and attempt
 to collect any links that look installable.
 4. Take all of the collected links and determine which one
 best matches the requirement spec given and download it.
 5. Rinse and repeat for every dependency in the requirement
 set. 
 
 I propose we deprecate the external links that PyPI has published
 on the /simple/ indexes which exist because of the history of PyPI.
 Ideally in some number of months (1? 2?) we would turn off adding
 these links from new releases, leaving the existing ones intact and
 then a few months later the existing links be removed completely.
 
 Reasoning:
 1. It is difficult to secure the process of spidering external links
 for download.
 1a. The only way I can think of offhand is by requiring uploading
 a hash of the expected files to PyPI along with the download
 link and removing all urls except for the download_url. This
 has the effect that only 1 file can be associated with a particular
 release.
 2. External links decrease the expected uptime for a particular set
 of requirements. PyPI itself has become very stable, however
 the same cannot be said for all of the hosts linked that the toolchain
 processes. Each new host is an additional SPOF.
 
 Ex: I depend on PyPI and 10 other external packages, each
 service has a 99% uptime so my expected uptime to
 be able to install all my requirements would be ~89% (0.99 ** 11).
 3. Breaks the ability for a CDN and/or mirroring infrastructure to provide
 increased uptime and better latency/throughput across the globe.
 4. Privacy implications, as a user it is not particularly obvious when
 I run `pip install Foo` what hosts I will be issuing requests against.
 It is obvious that I will be contacting PyPI and I will have made the
 decision to trust PyPI however it is not obvious what other hosts will
 be able to gather information about me, including what packages I am
 installing. This becomes even more difficult to determine the deeper
 my dependency tree goes.
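
The arithmetic in point 2 can be checked with a few lines. This sketch
assumes, as the example does, that every host fails independently and has
the same uptime; the function name is made up for illustration.

```python
def combined_uptime(per_host_uptime, host_count):
    """Probability that all hosts are up at once, assuming independent failures."""
    return per_host_uptime ** host_count


# PyPI plus 10 external hosts, each at 99% uptime.
pypi_plus_ten = combined_uptime(0.99, 11)
print(f"{pypi_plus_ten:.1%}")  # prints 89.5%
```

Every additional host multiplies in another factor below 1, so the combined
figure can only go down as external links are added.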


I fully support this. 

As an aside, if CDN/storage concerns are an issue, I have an outstanding offer 
from a large hosting company to take care of the CDN aspects for us.  

Jesse




Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread M.-A. Lemburg
On 27.02.2013 16:26, Donald Stufft wrote:
 PyPI is now being served with a valid SSL certificate, and the
 tooling has begun to incorporate SSL verification of PyPI into
 the process. This is _excellent_ and the parties involved should
 all be thanked. However there is still another massive area of
 insecurity within the packaging tool chain.
 
 For those who don't know, when you attempt to install a particular
 package a number of urls are visited. The steps look roughly
 something like this:
 
 1. Visit http://pypi.python.org/simple/Package/ and attempt to
 collect any links that look installable (tarballs,
 #egg=, etc).
 Note: /simple/Package/ contains download_url, home_page,
 and any link that is contained in the long_description.
 2. Visit any link referenced as home_page and attempt to
 collect any links that look installable.
 3. Visit any link referenced in dependency_links and attempt
 to collect any links that look installable.
 4. Take all of the collected links and determine which one
 best matches the requirement spec given and download it.
 5. Rinse and repeat for every dependency in the requirement
 set.
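
Step 1 of the list above can be sketched roughly as follows. This is not how
pip or easy_install implement it; the suffix list and the helper names are
assumptions made for illustration, and real installers apply many more
rules.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed set of suffixes that count as "looks installable".
INSTALLABLE_SUFFIXES = (".tar.gz", ".tgz", ".zip", ".egg")


class LinkCollector(HTMLParser):
    """Collect the absolute target of every <a href=...> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def looks_installable(url):
    """Archive-like path, or an explicit #egg= fragment."""
    path = url.split("#", 1)[0]
    return path.endswith(INSTALLABLE_SUFFIXES) or "#egg=" in url


def candidate_links(base_url, html):
    """Return the links on a /simple/ style page that look installable."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [link for link in collector.links if looks_installable(link)]
```

Note that nothing in this filter cares which host a link points at, which is
why a single index page can fan out to arbitrary external sites.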
 
 I propose we deprecate the external links that PyPI has published
 on the /simple/ indexes which exist because of the history of PyPI.
 Ideally in some number of months (1? 2?) we would turn off adding
 these links from new releases, leaving the existing ones intact and
 then a few months later the existing links be removed completely.

-1.

There are many reasons for not hosting packages and distributions
on PyPI itself.

If you use and trust a package, you also have to know and trust its
dependencies, no matter where they are hosted, so you're not gaining
any security by disabling links to other download locations: if
you don't trust the way a package is hosted, you don't use it; if
you do, then removing the link breaks the package and all its
dependencies.

Instead of suggesting the removal of support for externally hosted packages,
why not propose a mechanism to provide a more direct/secure way to
reference them ?

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Donald Stufft
On Wednesday, February 27, 2013 at 10:39 AM, M.-A. Lemburg wrote:
 -1.
 
 There are many reasons for not hosting packages and distributions
 on PyPI itself.
 
 If you use and trust a package, you also have to know and trust its
 dependencies, no matter where they are hosted, so you're not gaining
 any security by disabling links to other download locations: if
 you don't trust the way a package is hosted, you don't use it; if
 you do, then removing the link breaks the package and all its
 dependencies.
 
 

You also have to know and trust the hosting locations for all of them, and
if they are not available via SSL you have to know and trust that there is
not a MITM available. 
 
 Instead of suggesting the removal of support for externally hosted packages,
 why not propose a mechanism to provide a more direct/secure way to
 reference them ?
 
 

I did mention a method for doing that in my email. However there are reasons
beyond the security ones to require packages being hosted on PyPI. Namely
uptime, privacy, and performance.


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Aaron Meurer
On Wed, Feb 27, 2013 at 8:26 AM, Donald Stufft donald.stu...@gmail.com wrote:
 PyPI is now being served with a valid SSL certificate, and the
 tooling has begun to incorporate SSL verification of PyPI into
 the process. This is _excellent_ and the parties involved should
 all be thanked. However there is still another massive area of
 insecurity within the packaging tool chain.

 For those who don't know, when you attempt to install a particular
 package a number of urls are visited. The steps look roughly
 something like this:

 1. Visit http://pypi.python.org/simple/Package/ and attempt to
 collect any links that look installable (tarballs,
 #egg=, etc).
 Note: /simple/Package/ contains download_url, home_page,
 and any link that is contained in the long_description.
 2. Visit any link referenced as home_page and attempt to
 collect any links that look installable.
 3. Visit any link referenced in dependency_links and attempt
 to collect any links that look installable.
 4. Take all of the collected links and determine which one
 best matches the requirement spec given and download it.
 5. Rinse and repeat for every dependency in the requirement
 set.

 I propose we deprecate the external links that PyPI has published
 on the /simple/ indexes which exist because of the history of PyPI.
 Ideally in some number of months (1? 2?) we would turn off adding
 these links from new releases, leaving the existing ones intact and
 then a few months later the existing links be removed completely.

 Reasoning:
   1. It is difficult to secure the process of spidering external links
 for download.
 1a. The only way I can think of offhand is by requiring uploading
   a hash of the expected files to PyPI along with the download
   link and removing all urls except for the download_url. This
   has the effect that only 1 file can be associated with a
   particular release.
   2. External links decrease the expected uptime for a particular set
   of requirements. PyPI itself has become very stable, however
   the same cannot be said for all of the hosts linked that the toolchain
   processes. Each new host is an additional SPOF.

   Ex: I depend on PyPI and 10 other external packages, each
 service has a 99% uptime so my expected uptime to
 be able to install all my requirements would be ~89%
 (0.99 ** 11).
   3. Breaks the ability for a CDN and/or mirroring infrastructure to provide
   increased uptime and better latency/throughput across the globe.
   4. Privacy implications, as a user it is not particularly obvious when
   I run `pip install Foo` what hosts I will be issuing requests against.
   It is obvious that I will be contacting PyPI and I will have made the
   decision to trust PyPI however it is not obvious what other hosts will
   be able to gather information about me, including what packages I am
   installing. This becomes even more difficult to determine the deeper
   my dependency tree goes.

5. This is a serious PITA for package maintainers. If you accidentally
upload a file somewhere else that looks like a newer version pip will
install that.

6. It's a huge security hole.  For someone to upload a malicious
package, they just have to access some site that is crawled by pip,
which includes all old download sites.  If someone used to use some
download domain but no longer owns it, it is very easy for
someone else to upload an arbitrary malicious file with a slightly newer
version number, and pip will happily install that for everyone.
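
The hash idea from point 1a addresses exactly this case: if the expected
digest is published on PyPI, a file swapped in on a lapsed domain no longer
matches. A minimal sketch, assuming SHA-256 (the thread does not name an
algorithm) and a hypothetical helper name:

```python
import hashlib
import hmac


def matches_expected_hash(data, expected_sha256):
    """Return True if the downloaded bytes hash to the published hex digest."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual, expected_sha256.lower())


payload = b"example release archive"
published = hashlib.sha256(payload).hexdigest()
print(matches_expected_hash(payload, published))
print(matches_expected_hash(payload + b"tampered", published))
```

This only helps for files whose digest was recorded up front, which is why
the proposal notes that just one file could be tied to a release this way.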

This was discussed at
http://mail.python.org/pipermail/catalog-sig/2012-June/004518.html.
My suggestion was to only download from the explicit external download
link for the latest listed version, and to do so only if an upload
didn't exist.

At the very least, let package maintainers manually enable this
behavior, so that we don't have to worry about tricking
pip/easy_install into installing the right thing by version number
naming (which is completely broken btw. It's impossible to name
separate Python 2 and Python 3 packages so that both pip and
easy_install will do the right thing in every case. See
https://code.google.com/p/sympy/issues/detail?id=3511).

Aaron Meurer


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Ronald Oussoren

On 27 Feb, 2013, at 16:42, Donald Stufft donald.stu...@gmail.com wrote:

 On Wednesday, February 27, 2013 at 10:39 AM, M.-A. Lemburg wrote:
 -1.
 
 There are many reasons for not hosting packages and distributions
 on PyPI itself.
 
 If you use and trust a package, you also have to know and trust its
 dependencies, no matter where they are hosted, so you're not gaining
 any security by disabling links to other download locations: if
 you don't trust the way a package is hosted, you don't use it; if
 you do, then removing the link breaks the package and all its
 dependencies.
 You also have to know and trust the hosting locations for all of them, and
 if they are not available via SSL you have to know and trust that there is
 not a MITM available. 

The security bits are still in flux; AFAIK neither proposal will require SSL
for the actual download to be secure.

 
 Instead of suggesting to removing support for externally hosted packages,
 why not propose a mechanism to provide a more direct/secure way to
 reference them ?
 I did mention a method for doing that in my email. However there are reasons
 beyond the security ones to require packages being hosted on PyPI. Namely
 uptime, privacy, and performance.

You only mentioned restricting downloads to the 'Download-URL' property in the
package metadata. Another alternative would be to add a PyPI API for registering
specific downloads with the same restrictions on filenames as for files hosted
by PyPI itself.  With that PyPI could be queried for the exact downloads
associated with a release instead of having to perform screen scraping.

At this time I don't know if requiring that files are hosted on PyPI is a good
idea, as Marc-Andre said there are reasons for hosting them elsewhere.  That
might change when the package signing infrastructure is further specified.

Ronald

P.S. And only using downloads hosted on PyPI doesn't require changes to PyPI
anyway, just patches to pip and setuptools :-)



Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Marcus Smith
 pip/easy_install into installing the right thing by version number
 naming (which is completely broken btw. It's impossible to name
 separate Python 2 and Python 3 packages so that both pip and
 easy_install will do the right thing in every case. See
 https://code.google.com/p/sympy/issues/detail?id=3511).


To be clear, in this issue it is easy_install that is broken, but I understand
you want something that works consistently across both tools.

Marcus


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Donald Stufft
On Wednesday, February 27, 2013 at 11:34 AM, M.-A. Lemburg wrote:
 On 27.02.2013 16:42, Donald Stufft wrote:
  On Wednesday, February 27, 2013 at 10:39 AM, M.-A. Lemburg wrote:
   -1.
   
   There are many reasons for not hosting packages and distributions
   on PyPI itself.
   
   If you use and trust a package, you also have to know and trust its
   dependencies, no matter where they are hosted, so you're not gaining
   any security by disabling links to other download locations: if
   you don't trust the way a package is hosted, you don't use it; if
   you do, then removing the link breaks the package and all its
   dependencies.
   
  
  
  You also have to know and trust the hosting locations for all of them, and
  if they are not available via SSL you have to know and trust that there is
  not a MITM available. 
  
 
 
 Right.
 
 I'm not saying that it's not a good idea to host packages on PyPI,
 but forcing the community into doing this is not a good idea.
 
   Instead of suggesting to removing support for externally hosted packages,
   why not propose a mechanism to provide a more direct/secure way to
   reference them ?
   
  
  
  I did mention a method for doing that in my email. However there are reasons
  beyond the security ones to require packages being hosted on PyPI. Namely
  uptime, privacy, and performance.
  
 
 
 Your proposed uploading of hash values would require listing all
 distribution files for each release somehow. I don't see how you'd
 get that to work with older Python versions.
 
 
 1. It is difficult to secure the process of spidering external links
 for download.
 1a. The only way I can think offhand is by requiring uploading
 a hash of the expected files to PyPI along with the download
 link and removing all urls except for the download_url. This
 has the effect that only 1 file can be associated with a particular
 release.
 
 
 Uptime and performance have in the past been one of the reasons why
 people chose not to upload files to PyPI. This could be changed,
 of course.
 
 

I don't see how. If PyPI goes down then the packaging tools cannot
query /simple/foo/ to see the external links. Adding additional SPOFs
only harms uptime; there is no possible way for it to increase it.
 
 Another reason for not uploading files to PyPI are the license
 terms you have to agree to on PyPI and the fact that you can no
 longer control where your distribution files are made available
 by agreeing to them. This could be changed as well, but we'd need
 to add more legalese to the PyPI mirror setup for this to work...
 not sure whether people providing the mirrors would like this.
 
 

The legalese doesn't particularly give any more rights than any
free/OSS license does. There's not a requirement currently that
packages on PyPI be free/OSS but this change would only actually
affect people who want to upload non free code to PyPI.
 
 Security can be had by having installers check the GPG signatures
 of distribution file. You don't need to trust the download
 site for that.
 
 

GPG signatures are good, we don't have them yet. And when we do
it's only 1 layer of defense, not the final solution.
 
 I'm not sure what you meant with privacy in this context.
If I download something from a server there is a certain amount
of information that by nature of HTTP and networking gets
leaked to that host. Additionally if it's done via non-TLS connections
it also gets leaked to anyone who has a MITM on my connection.

This is especially important in countries where the government
actively surveils or modifies the traffic of their citizens.
 
 Something that would work even with older Python versions is
 letting the download URL point to a meta-file which contains
 the links to the other distribution files. That way you
 avoid having the installers trying to parse arbitrary
 websites and you can add more security to the downloads
 by providing hash values, etc. in those meta-files.
 
 Since installers already know how to parse the /simple/
 (HTML) index files, we might use that same format
 for those meta-files.
 
 -- 
 Marc-Andre Lemburg
 eGenix.com (http://eGenix.com)
 
 
 




Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread M.-A. Lemburg
On 27.02.2013 17:43, Donald Stufft wrote:
 On Wednesday, February 27, 2013 at 11:34 AM, M.-A. Lemburg wrote:
 On 27.02.2013 16:42, Donald Stufft wrote:
 On Wednesday, February 27, 2013 at 10:39 AM, M.-A. Lemburg wrote:
 -1.

 There are many reasons for not hosting packages and distributions
 on PyPI itself.

 If you use and trust a package, you also have to know and trust its
 dependencies, no matter where they are hosted, so you're not gaining
 any security by disabling links to other download locations: if
 you don't trust the way a package is hosted, you don't use it; if
 you do, then removing the link breaks the package and all its
 dependencies.



 You also have to know and trust the hosting locations for all of them, and
 if they are not available via SSL you have to know and trust that there is
 not a MITM available. 



 Right.

 I'm not saying that it's not a good idea to host packages on PyPI,
 but forcing the community into doing this is not a good idea.

 Instead of suggesting to removing support for externally hosted packages,
 why not propose a mechanism to provide a more direct/secure way to
 reference them ?



 I did mention a method for doing that in my email. However there are reasons
 beyond the security ones to require packages being hosted on PyPI. Namely
 uptime, privacy, and performance.



 Your proposed uploading of hash values would require listing all
 distribution files for each release somehow. I don't see how you'd
 get that to work with older Python versions.

 
 1. It is difficult to secure the process of spidering external links
 for download.
 1a. The only way I can think offhand is by requiring uploading
 a hash of the expected files to PyPI along with the download
 link and removing all urls except for the download_url. This
 has the effect that only 1 file can be associated with a particular
 release.
 

 Uptime and performance have in the past been one of the reasons why
 people chose not to upload files to PyPI. This could be changed,
 of course.


 
 I don't see how. If PyPI goes down then the packaging tools cannot
 query /simple/foo/ to see the external links. Adding in additional SPOF's
 only harms uptime, there is no possible way for it to increase it. 

Package installers only need access to the static files in
the /simple/ index. Those can be put behind a CDN to increase
uptime.

PyPI itself doesn't have to be up and running if you just want
to download the files (unfortunately, that's not true at the
moment, because the /simple/ index is dynamically generated,
but that can be changed).

See http://wiki.python.org/moin/CloudPyPI for details.

 Another reason for not uploading files to PyPI are the license
 terms you have to agree to on PyPI and the fact that you can no
 longer control where your distribution files are made available
 by agreeing to them. This could be changed as well, but we'd need
 to add more legalese to the PyPI mirror setup for this to work...
 not sure whether people providing the mirrors would like this.


 
 The legalese doesn't particularly give any more rights than any
 free/OSS license does. There's not a requirement currently that
 packages on PyPI be free/OSS but this change would only actually
 affect people who want to upload non free code to PyPI.

It does affect any package author, regardless of the license.
Some examples:

* you may be forced to remove a distribution from the net (think DMCA,
  patents, trademarks, etc)

* the distribution may contain a serious bug that you don't want to
  spread

* you may want to keep more accurate statistics of the reach of
  your project

 Security can be had by having installers check the GPG signatures
 of distribution file. You don't need to trust the download
 site for that.
 
 GPG signatures are good, we don't have them yet. And when we do
 it's only 1 layer of defense, not the final solution.

Sure, you still have to trust the author :-)

 I'm not sure what you meant with privacy in this context.

 If I download something from server there is a certain amount
 of information that by nature of HTTP and networking gets
 leaked to that host. Additionally if it's done via non TLS connections
 it also gets leaked to anyone who has a MITM on my connection.

 This is especially important in countries where the government
 actively surveils or modifies the traffic of their citizens.

I can see an issue with e.g. trying to download code that
is illegal to use in a country (e.g. crypto code, exploits,
hacks, etc.), but the country officials would probably just
block the complete PyPI site rather than bother with filtering single
requests.

IMO, that's beyond the scope of what we're discussing
here, though.

 Something that would work even with older Python versions is
 letting the download URL point to a meta-file which contains
 the links to the other distribution files. That way you
 avoid having the installers trying to parse arbitrary
 websites and you can add more security to the downloads
 by 

Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Donald Stufft
On Wednesday, February 27, 2013 at 12:10 PM, M.-A. Lemburg wrote:
 Package installers only need access to the static files in
 the /simple/ index. Those can be put behind a CDN to increase
 uptime.
 
 PyPI itself doesn't have to be up and running if you just want
 to download the files (unfortunately, that's not true at the
 moment, because the /simple/ index is dynamically generated,
 but that can be changed).
 
 See http://wiki.python.org/moin/CloudPyPI for details.
I'm aware of that, but that doesn't change the statement. If /simple/
is down you cannot determine the external urls. There is no way
to increase uptime by adding more points of failure. 
 
   Another reason for not uploading files to PyPI are the license
   terms you have to agree to on PyPI and the fact that you can no
   longer control where your distribution files are made available
   by agreeing to them. This could be changed as well, but we'd need
   to add more legalese to the PyPI mirror setup for this to work...
   not sure whether people providing the mirrors would like this.
   
  
  
  The legalese doesn't particularly give any more rights than any
  free/OSS license does. There's not a requirement currently that
  packages on PyPI be free/OSS but this change would only actually
  affect people who want to upload non free code to PyPI.
  
 
 
 It does affect any package author, regardless of the license.
 Some examples:
 
 * you may be forced remove a distribution from the net (think DMCA,
 patents, trademarks, etc)
 
 

IANAL but I'm pretty sure if any of those things occur you didn't have
the legal right to grant that license to the PSF and the PSF would be
required to take them down anyways. 
 
 * the distribution may contain a serious bug that you don't want to
 spread
 
 

This is a completely separate issue. PyPI supports (and always will)
a method of saying delete and/or don't install this. This is really just
a strawman.
 
 * you may want to keep more accurate statistics of the reach of
 your project
 
 

What statistics do you want? Let's have PyPI produce them and
properly anonymize them instead of leaking data. 
 
   Security can be had by having installers check the GPG signatures
   of distribution file. You don't need to trust the download
   site for that.
   
  
  
  GPG signatures are good, we don't have them yet. And when we do
  it's only 1 layer of defense, not the final solution.
  
 
 
 Sure, you still have to trust the author :-)
But do I need to trust his host? Do I need to trust that his laptop didn't
get swiped and with it his GPG key? Ideally I don't *need* to trust the
author either. I download his package from PyPI and I can review it,
then I know it's fine and I can download that version and use it. PyPI
isn't at the point where you can make that assumption, but it should get there.
 
   I'm not sure what you meant with privacy in this context.
  
  If I download something from server there is a certain amount
  of information that by nature of HTTP and networking gets
  leaked to that host. Additionally if it's done via non TLS connections
  it also gets leaked to anyone who has a MITM on my connection.
  
  This is especially important in countries where the government
  actively surveils or modifies the traffic of their citizens.
  
 
 
 I can see an issue with e.g. trying to download code that
 is illegal to use in a country (e.g. crypto code, exploits,
 hacks, etc.), but the country officials would probably just
 block the complete PyPI site than bother with filtering single
 requests.
 
 IMO, that's beyond the scope of what we're discussing
 here, though.
 
 

It's not just crypto code, exploits, and hacks; it's also things like
https://ooni.torproject.org/ and Tor itself which are *good* projects
that certain governments might not particularly like.
 
   Something that would work even with older Python versions is
   letting the download URL point to a meta-file which contains
   the links to the other distribution files. That way you
   avoid having the installers trying to parse arbitrary
   websites and you can add more security to the downloads
   by providing hash values, etc. in those meta-files.
   
   Since installers already know how to parse the /simple/
   (HTML) index files, we might use that same format
   for those meta-files.
   
  
 
 
 So what do you think of the above idea ?
If the hashes are on the external system then they are as good as
useless. If I'm able to do something nefarious with the packages
that are hosted I can do something nefarious with the metadata file.

Putting the hashes on PyPI fixes the security issue (because we
have a real SSL cert, and tools are starting to validate it) but doesn't
fix the other issues.
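
As a rough sketch of what keeping the hashes on PyPI buys you: the
installer gets the expected digest over the trusted channel and checks
whatever bytes the external host hands back. The verify_download helper
and the sample payloads are hypothetical, and SHA-256 is only an
assumption for the example; nothing here is an existing PyPI or pip API.

```python
import hashlib
import hmac


def verify_download(payload, expected_sha256):
    """Return True if the downloaded bytes match the trusted digest."""
    actual = hashlib.sha256(payload).hexdigest()
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual, expected_sha256.lower())


# Stand-in for the file the maintainer published; the digest would be
# registered on the trusted index at release time.
original = b"contents of Foo-1.3.tar.gz"
trusted_digest = hashlib.sha256(original).hexdigest()

# Stand-in for what a compromised external host might serve instead.
tampered = original + b" plus something nefarious"

ok = verify_download(original, trusted_digest)
bad = verify_download(tampered, trusted_digest)
print(ok, bad)
```

The check only helps because the digest comes from somewhere the
attacker does not control. It also does nothing for the uptime and
privacy points.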
 
 -- 
 Marc-Andre Lemburg
 eGenix.com (http://eGenix.com)
 

Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Donald Stufft
On Wednesday, February 27, 2013 at 12:22 PM, holger krekel wrote:
 The main means of securing against tampering is author-signatures
 and verification by installers. If we have that, the download location
 does not matter (pypi/CDN/google/...).

Again, we don't have that yet, it's only 1 layer, and it doesn't solve
all of the issues with external packages.
 
 
  2. External links decrease the expected uptime for a particular set
  of requirements. PyPI itself has become very stable, however
  the same cannot be said for all of the hosts linked that the toolchain
  processes. Each new host is an additional SPOF.
  
  Ex: I depend on PyPI and 10 other external packages, each
  service has a 99% uptime so my expected uptime to
  be able to install all my requirements would be ~89% (0.99 ** 11).
  
 
 
 There are many links which go to google, bitbucket or github -
 i doubt those services have worse availability than pypi.python.org,
 rather better.

Doesn't matter if they have worse or better, you cannot increase
availability by adding more points of failure, at best you keep it
the same, typically you decrease it. 
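
To put numbers on it, under the simplifying assumption that every host
fails independently of the others (the combined_uptime helper and the
99.99% figure are only for illustration):

```python
def combined_uptime(uptimes):
    """Probability that every required host is up at the same time,
    assuming the hosts fail independently of each other."""
    result = 1.0
    for uptime in uptimes:
        result *= uptime
    return result


# PyPI alone at 99%.
pypi_only = combined_uptime([0.99])

# PyPI plus 10 external hosts, each at 99%: about 0.895.
pypi_plus_ten = combined_uptime([0.99] * 11)

# PyPI plus a single external host that is far more reliable than PyPI.
pypi_plus_reliable = combined_uptime([0.99, 0.9999])

print(round(pypi_plus_ten, 3))
print(pypi_plus_reliable < pypi_only)
```

Even the very reliable extra host pulls the combined figure slightly
below what PyPI gives on its own.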
 
 Also we would be losing a lot of packages because i expect there to
 be a non-trivial amount of packages which will not be transferred to
 pypi.python.org no matter how much people here think it's cool.
 
 Why not first have a good infrastructure and capacity with
 pypi.python.org so that people *want* to move their files there?

PyPI has had very good uptime since the move to OSL. I don't have
numbers handy but I believe I can get them. 
 
 best,
 holger
 
 
  3. Breaks the ability for a CDN and/or mirroring infrastructure to provide
  increased uptime and better latency/throughput across the globe.
  4. Privacy implications, as a user it is not particularly obvious when
  I run `pip install Foo` what hosts I will be issuing requests against.
  It is obvious that I will be contacting PyPI and I will have made the
  decision to trust PyPI however it is not obvious what other hosts will
  be able to gather information about me, including what packages I am
  installing. This becomes even more difficult to determine the deeper
  my dependency tree goes.
  
 
 
 
  
 




Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Aaron Meurer
On Feb 27, 2013, at 10:22 AM, holger krekel hol...@merlinux.eu wrote:

 On Wed, Feb 27, 2013 at 10:26 -0500, Donald Stufft wrote:
 PyPI is now being served with a valid SSL certificate, and the
 tooling has begun to incorporate SSL verification of PyPI into
 the process. This is _excellent_ and the parties involved should
 all be thanked. However there is still another massive area of
 insecurity within the packaging tool chain.

 For those who don't know, when you attempt to install a particular
 package a number of urls are visited. The steps look roughly
 something like this:

1. Visit http://pypi.python.org/simple/Package/ and attempt to
collect any links that look like it's installable (tarballs,
#egg=, etc).
Note: /simple/Package/ contains download_url, home_page,
and any link that is contained in the long_description).
2. Visit any link referenced as home_page and attempt to
collect any links that look like it's installable.
3. Visit any link referenced in a dependency_links and attempt
to collect any links that look like it's installable.
4. Take all of the collected links and determine which one
best matches the requirement spec given and download it.
5. Rinse and repeat for every dependency in the requirement
set.
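
The first step above can be sketched roughly as follows: parse a
/simple/ page, resolve every link against the index URL, and see which
hosts other than PyPI an installer would end up talking to. The sample
page, the example hosts, and the LinkCollector and external_hosts names
are invented for this sketch; real /simple/ pages and real installers
differ in the details.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

INDEX_URL = "https://pypi.python.org/simple/Foo/"

# Made-up page content in the spirit of a /simple/ index page.
SAMPLE_PAGE = """
<html><body>
<a href="../../packages/source/F/Foo/Foo-1.2.tar.gz#md5=abc">Foo-1.2.tar.gz</a>
<a href="http://foo.example/" rel="homepage">1.2 home_page</a>
<a href="http://downloads.example/Foo-1.3.tar.gz" rel="download">1.3 download_url</a>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href=...> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Relative links are resolved against the index page URL.
            self.links.append(urljoin(self.base_url, href))


def external_hosts(links, index_host):
    """Hosts referenced by the page other than the index itself."""
    return sorted({urlparse(link).netloc for link in links} - {index_host})


collector = LinkCollector(INDEX_URL)
collector.feed(SAMPLE_PAGE)
hosts = external_hosts(collector.links, urlparse(INDEX_URL).netloc)
print(hosts)
```

Every host in that list is one more machine that gets a request as a
side effect of the install, and the same happens again for each
dependency.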

 I propose we deprecate the external links that PyPI has published
 on the /simple/ indexes which exist because of the history of PyPI.
 Ideally in some number of months (1? 2?) we would turn off adding
 these links from new releases, leaving the existing ones intact and
 then a few months later the existing links be removed completely.

 Reasoning:
  1. It is difficult to secure the process of spidering external links
for download.
1a. The only way I can think offhand is by requiring uploading
  a hash of the expected files to PyPI along with the download
  link and removing all urls except for the download_url. This
  has the effect that only 1 file can be associated with a particular
  release.

 The main means of securing against tampering is author-signatures
 and verification by installers.  If we have that, the download location
 does not matter (pypi/CDN/google/...).

  2. External links decrease the expected uptime for a particular set
  of requirements. PyPI itself has become very stable, however
  the same cannot be said for all of the hosts linked that the toolchain
  processes. Each new host is an additional SPOF.

  Ex: I depend on PyPI and 10 other external packages, each
service has a 99% uptime so my expected uptime to
be able to install all my requirements would be ~89% (0.99 ** 11).

 There are many links which go to google, bitbucket or github -
 i doubt those services have worse availability than pypi.python.org,
 rather better.

 Also we would be losing a lot of packages because i expect there to
 be a non-trivial amount of packages which will not be transferred to
 pypi.python.org no matter how much people here think it's cool.

 Why not first have a good infrastructure and capacity with
 pypi.python.org so that people *want* to move their files there?

If you change the policy to also download from links, but only official
links actually manually put there by the package maintainer, with no
crawling, isn't it fair to say that if you want pip to install your
package, you need to tell PyPI where it is, explicitly? And if you
release a new version, you need to tell PyPI about that new version,
or else it will continue to install the old version.  I suppose they
could also just have a link to the latest tarball if they really want to
be lazy.

PyPI/pip are not like Linux package systems. They should have no
prerogative to always try to get the latest version without any work
by the package maintainer, especially since there's not a team of
people who do it: the whole thing happens automatically by some
heuristics.

Aaron Meurer


 best,
 holger


  3. Breaks the ability for a CDN and/or mirroring infrastructure to provide
  increased uptime and better latency/throughput across the globe.
  4. Privacy implications, as a user it is not particularly obvious when
  I run `pip install Foo` what hosts I will be issuing requests against.
  It is obvious that I will be contacting PyPI and I will have made the
  decision to trust PyPI however it is not obvious what other hosts will
  be able to gather information about me, including what packages I am
  installing. This becomes even more difficult to determine the deeper
  my dependency tree goes.



Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Jesse Noller
  2. External links decrease the expected uptime for a particular set
  of requirements. PyPI itself has become very stable, however
  the same cannot be said for all of the hosts linked that the toolchain
  processes. Each new host is an additional SPOF.
  
  Ex: I depend on PyPI and 10 other external packages, each
  service has a 99% uptime so my expected uptime to
  be able to install all my requirements would be ~89% (0.99 ** 11).
 
 
 
 There are many links which go to google, bitbucket or github -
 i doubt those services have worse availability than pypi.python.org,
 rather better.
 
 Also we would be losing a lot of packages because i expect there to
 be a non-trivial amount of packages which will not be transferred to
 pypi.python.org no matter how much people here think it's cool.
 
 Why not first have a good infrastructure and capacity with
 pypi.python.org so that people *want* to move their files there?
 
 best,
 holger
 
Ok, so we have that. What now? 


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Justin Cappos
Having different sources for package metadata does pose security concerns,
for example version mismatch attacks by a MITM.   Unless we co-locate all
package metadata at a single source that is trusted for protecting against
these issues, this will be an issue. (However, possibly not the biggest
threat right now.)

I do believe that if you do centralize metadata, you could outsource
mirroring the data if desired without losing the other security goals you
have.

Thanks,
Justin



On Wed, Feb 27, 2013 at 10:39 AM, M.-A. Lemburg m...@egenix.com wrote:

 On 27.02.2013 16:26, Donald Stufft wrote:
  PyPI is now being served with a valid SSL certificate, and the
  tooling has begun to incorporate SSL verification of PyPI into
  the process. This is _excellent_ and the parties involved should
  all be thanked. However there is still another massive area of
  insecurity within the packaging tool chain.
 
  For those who don't know, when you attempt to install a particular
  package a number of urls are visited. The steps look roughly
  something like this:
 
  1. Visit http://pypi.python.org/simple/Package/ and attempt to
  collect any links that look like it's installable (tarballs,
  #egg=, etc).
  Note: /simple/Package/ contains download_url, home_page,
  and any link that is contained in the long_description).
  2. Visit any link referenced as home_page and attempt to
  collect any links that look like it's installable.
  3. Visit any link referenced in a dependency_links and attempt
  to collect any links that look like it's installable.
  4. Take all of the collected links and determine which one
  best matches the requirement spec given and download it.
  5. Rinse and repeat for every dependency in the requirement
  set.
 
  I propose we deprecate the external links that PyPI has published
  on the /simple/ indexes which exist because of the history of PyPI.
  Ideally in some number of months (1? 2?) we would turn off adding
  these links from new releases, leaving the existing ones intact and
  then a few months later the existing links be removed completely.

 -1.

 There are many reasons for not hosting packages and distributions
 on PyPI itself.

 If you use and trust a package, you also have to know and trust its
 dependencies, no matter where they are hosted, so you're not gaining
 any security by disabling links to other download locations: if
 you don't trust the way a package is hosted, you don't use it; if
 you do, then removing the link breaks the package and all its
 dependencies.

 Instead of suggesting to removing support for externally hosted packages,
 why not propose a mechanism to provide a more direct/secure way to
 reference them ?

 --
 Marc-Andre Lemburg
 eGenix.com




Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Aaron Meurer
Which in particular means that metadata needs to come from PyPI itself, not
from the tarball file name.

Aaron Meurer

On Feb 27, 2013, at 11:06 AM, Justin Cappos jcap...@poly.edu wrote:

Having different sources for package metadata does pose security concerns,
for example version mismatch attacks by a MITM.   Unless we co-locate all
package metadata at a single source that is trusted for protecting against
these issues, this will be an issue. (However, possibly not the biggest
threat right now.)

I do believe that if you do centralize metadata, you could outsource
mirroring the data if desired without losing the other security goals you
have.

Thanks,
 Justin



On Wed, Feb 27, 2013 at 10:39 AM, M.-A. Lemburg m...@egenix.com wrote:

 On 27.02.2013 16:26, Donald Stufft wrote:
  PyPI is now being served with a valid SSL certificate, and the
  tooling has begun to incorporate SSL verification of PyPI into
  the process. This is _excellent_ and the parties involved should
  all be thanked. However there is still another massive area of
  insecurity within the packaging tool chain.
 
  For those who don't know, when you attempt to install a particular
  package a number of urls are visited. The steps look roughly
  something like this:
 
  1. Visit http://pypi.python.org/simple/Package/ and attempt to
     collect any links that look installable (tarballs,
     #egg=, etc).
     Note: /simple/Package/ contains download_url, home_page,
     and any link that is contained in the long_description.
  2. Visit any link referenced as home_page and attempt to
     collect any links that look installable.
  3. Visit any link listed in dependency_links and attempt
     to collect any links that look installable.
  4. Take all of the collected links, determine which one
     best matches the requirement spec given, and download it.
  5. Rinse and repeat for every dependency in the requirement
     set.
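
Steps 1 to 3 above can be sketched roughly as follows. This is a simplified illustration, not pip's or easy_install's actual code: the class and function names are made up, the suffix list is incomplete, and it assumes the /simple/ page marks the home_page and download_url links with rel="homepage" and rel="download".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Incomplete, illustrative list of archive suffixes.
ARCHIVE_SUFFIXES = (".tar.gz", ".tar.bz2", ".tgz", ".zip", ".egg")


class LinkCollector(HTMLParser):
    """Collect (absolute_url, rel) pairs from every <a href=...> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href:
            url = urljoin(self.base_url, href)
            self.links.append((url, attrs.get("rel") or ""))


def looks_installable(url):
    """Guess from the URL alone whether it points at something installable."""
    parsed = urlparse(url)
    return parsed.path.endswith(ARCHIVE_SUFFIXES) or "egg=" in parsed.fragment


def split_links(base_url, html):
    """Return (installable links, pages the installer would go on to crawl)."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    installable = [url for url, _ in collector.links if looks_installable(url)]
    to_crawl = [
        url
        for url, rel in collector.links
        if rel in ("homepage", "download") and not looks_installable(url)
    ]
    return installable, to_crawl
```

Every entry in the second list is another host that gets fetched and scraped during an install, which is where the extra trust and uptime dependencies come from.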
 
  I propose we deprecate the external links that PyPI has published
  on the /simple/ indexes which exist because of the history of PyPI.
  Ideally in some number of months (1? 2?) we would turn off adding
  these links from new releases, leaving the existing ones intact and
  then a few months later the existing links be removed completely.

 -1.

 There are many reasons for not hosting packages and distributions
 on PyPI itself.

 If you use and trust a package, you also have to know and trust its
 dependencies, no matter where they are hosted, so you're not gaining
 any security by disabling links to other download locations: if
 you don't trust the way a package is hosted, you don't use it; if
 you do, then removing the link breaks the package and all its
 dependencies.

 Instead of suggesting the removal of support for externally hosted packages,
 why not propose a mechanism to provide a more direct/secure way to
 reference them?

 --
 Marc-Andre Lemburg
 eGenix.com

___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Noah Kantrowitz

On Feb 27, 2013, at 9:28 AM, M.-A. Lemburg wrote:

 On 27.02.2013 18:05, Noah Kantrowitz wrote:
 
 
 M.-A. Lemburg m...@egenix.com wrote:
 I propose we deprecate the external links that PyPI has published
 on the /simple/ indexes which exist because of the history of PyPI.
 Ideally in some number of months (1? 2?) we would turn off adding
 these links from new releases, leaving the existing ones intact and
 then a few months later the existing links be removed completely.
 
 -1.
 
 There are many reasons for not hosting packages and distributions
 on PyPI itself.
 
 
 [citation needed]
 
 We've been through this discussion a couple of times in the past.
 I'm sure the reasons will get listed again in this discussion :-)
 
 Too many distribution files for PyPI to handle,

Again, please point at a specific package. I wasn't aware that PyPI limited 
uploads at all, but if it does we can certainly increase the number if there is 
a good reason.

 no support for
 UCS2/UCS4 binary distributions, unsupported distribution file
 formats (e.g. our prebuilt format),

Not sure why PyPI would even care what charset the package files use, but if 
true that's certainly a bug and we can get that fixed. What file formats do 
pip/buildout support that PyPI doesn't support for uploads?

 giving up control
 are some of them.

This is the point of running a package server: the author gives up control over 
distribution in order to reap the benefits of solid infrastructure and 
discoverability. This is a feature.

 
 The legal restrictions on code on pypi itself are nothing more than what is needed to 
 let people actually install things, which is kind of the point of listing on 
 pypi. If someone really wants their own universe, run a package server 
 yourself. What other reasons are there? Agreeing to an extra license would 
 block pip anyway, so no loss there. Huge package files maybe? 
 
 That's not quite true:
 
 http://www.python.org/about/legal/
 
 
 ... third party content providers grant the PSF and all other users of the 
 web site an irrevocable,
 worldwide, royalty-free, nonexclusive license to reproduce, distribute, 
 transmit, display, perform,
 and publish such content, including in digital form.
 
 
 Once you upload the files to PyPI, you completely give up control,
 because that license is irrevocable. This goes far beyond what the
 Python license does:
 
 http://docs.python.org/2/license.html
 
 Changing the PyPI terms to be more author-friendly is on my agenda,
 but I haven't found the time for that particular discussion yet ;-)

You are comparing an artifact distribution requirement with a source code 
license. PyPI's terms don't say a thing about source code or anything else, 
just that if you want a package file to be installable, we need to be able to 
send it to people. There is nothing even remotely author-unfriendly here; it is 
just common sense. Again, PyPI is _not_ the only way to publish packages, and we 
are allowed to expect interoperability from people that choose to participate 
in our community.

--Noah



___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Giovanni Bajo
On 27 Feb 2013, at 19:23, Donald Stufft donald.stu...@gmail.com 
wrote:

 On Wednesday, February 27, 2013 at 12:44 PM, Donald Stufft wrote:
 
 Why not first have a good infrastructure and capacity with
 pypi.python.org so that people *want* to move their files there?
 PyPI has had very good uptime since the move to OSL. I don't have
 numbers handy but I believe I can get them.
 I got the numbers! Since almost a year ago (this was set up at the last
 US PyCon):
 
 Uptime: 99.99%
 Downtime: 6h 58m
 Number of Downtimes: 126
 
 I want to stress again that even if that were a poor number, adding
 more points of failure can only decrease the expected uptime, or at best
 do nothing.

In fact, adding a caching CDN in front of PyPI (instead of the current mirror 
protocol) would probably bring the uptime close to 100% for people downloading 
packages via pip.

I'm +1 on dropping the current (complicated) mirror system and external links, 
and in favor of centralizing everything into PyPI, plus a third-party CDN / 
hosting service. In fact, Python is a big-enough brand name that we could even 
get a CDN service almost for free in exchange for an acknowledgement of the CDN 
company being used.
-- 
Giovanni Bajo   ::  ra...@develer.com
Develer S.r.l.  ::  http://www.develer.com

My Blog: http://giovanni.bajo.it







___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Donald Stufft
On Wednesday, February 27, 2013 at 1:32 PM, Giovanni Bajo wrote: 
 In fact, Python is a big-enough brand name that we could even get a CDN 
 service almost for free in exchange of an acknowledge of the CDN company 
 being used.
 
 

As far as I know this has already been offered in some form to Python. 

___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Donald Stufft
On Wednesday, February 27, 2013 at 1:34 PM, holger krekel wrote:
 On Wed, Feb 27, 2013 at 13:00 -0500, Jesse Noller wrote:
2. External links decrease the expected uptime for a particular set
of requirements. PyPI itself has become very stable, however
the same cannot be said for all of the hosts linked that the toolchain
processes. Each new host is an additional SPOF.

Ex: I depend on PyPI and 10 other external packages, each
service has a 99% uptime so my expected uptime to
be able to install all my requirements would be ~89% (0.99 ** 11).
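
Spelled out in Python, assuming every host fails independently and really does manage 99% (the function name is just for illustration):

```python
def combined_uptime(per_host_uptime: float, hosts: int) -> float:
    """Chance that all hosts are reachable at once, assuming independent failures."""
    return per_host_uptime ** hosts


# PyPI plus 10 external hosts, each assumed to be up 99% of the time.
all_up = combined_uptime(0.99, 11)
print(f"{all_up:.1%}")  # prints 89.5%
```

Dropping the 10 external hosts leaves a single factor of 0.99, so under these assumptions the install can only get more reliable, never less.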

   
   
   
   
   There are many links which go to google, bitbucket or github -
   i doubt those services have worse availability than pypi.python.org,
   rather better.
   
   Also we would be losing a lot of packages because i expect there to
   be a non-trivial amount of packages which will not be transferred to
   pypi.python.org no matter how much people here think it's cool.
   
   Why not first have a good infrastructure and capacity with
   pypi.python.org so that people *want* to move their files there?
   
   best,
   holger
   
  
  Ok, so we have that. What now? 
  
 
 
 I am not sure i understand. Just last week there were many installs
 going wrong - installs failing due to the http/https redirecting.
 
 

This same problem would have affected external urls as well because
you cannot install something without having first contacted PyPI.

 I've got at least 3 occasions myself in the last months where i couldn't
 use pypi.python.org and i've heard similar things from other people.
 
 

"Couldn't use pypi.python.org" is very vague. I hit PyPI every 15 seconds
or so and rarely have issues. Lately when there have been installation
problems it's been due to external services being down. For example
Mercurial has recently been having problems because they don't host
their packages on PyPI and their website has had downtime issues lately.

 There is also the issue that it's not clear we could just put all packages
 from download locations to pypi.python.org due to sizing constraints -
 at least that is what i got from discussions here earlier.
 
 

If a package is too large for PyPI that is a solvable problem; the current
limit exists as a sanity check, not for any hard technical reason. 
 
 holger 

___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Jesse Noller


On Wednesday, February 27, 2013 at 1:33 PM, Donald Stufft wrote:

 On Wednesday, February 27, 2013 at 1:32 PM, Giovanni Bajo wrote: 
  In fact, Python is a big-enough brand name that we could even get a CDN 
  service almost for free in exchange of an acknowledge of the CDN company 
  being used.
 
 
 As far as I know this has already been offered in some form to Python.

Yup, by like, 2 or 3 hosting companies. 


___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Aaron Meurer
On Wed, Feb 27, 2013 at 11:37 AM, holger krekel hol...@merlinux.eu wrote:
 On Wed, Feb 27, 2013 at 19:34 +0100, Lennart Regebro wrote:
 On Wed, Feb 27, 2013 at 5:34 PM, M.-A. Lemburg m...@egenix.com wrote:
  I'm not saying that it's not a good idea to host packages on PyPI,
  but forcing the community into doing this is not a good idea.

 I still don't understand why not. The only reasons I've seen are
 Because they don't want to or because they don't trust PyPI. And
 in the latter case I'm assuming they wouldn't use PyPI at all.

 And of course, nobody is forcing anyone, just like nobody is forcing
 you to use PyPI. :-)

 I understood there is the idea to disable external links within a couple
 of months.  That does break backward compatibility in a considerable way.

 holger

But wouldn't this only be a change in pip/easy_install, not PyPI
itself? I suppose you could explicitly break the external links by
having them point to nothing if you are worried about the security or
if it's some performance issue (that would indeed be a bad
compatibility break, in case people are using those for other
purposes).  Otherwise, if it's a problem, then just use the old
version of pip.

Aaron Meurer
___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Lennart Regebro
On a general note: It really warms my heart to see that people are
warming up to the idea of using CDN's and getting rid of external
downloads. I'm all for that.

//Lennart
___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Jesse Noller


On Wednesday, February 27, 2013 at 2:47 PM, Lennart Regebro wrote:

 On a general note: It really warms my heart to see that people are
 warming up to the idea of using CDN's and getting rid of external
 downloads. I'm all for that.

Excellent. So it's a date! 


___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Noah Kantrowitz

On Feb 27, 2013, at 11:47 AM, Lennart Regebro wrote:

 On a general note: It really warms my heart to see that people are
 warming up to the idea of using CDN's and getting rid of external
 downloads. I'm all for that.

Just to be clear on this point:
1) Moving PyPI and other PSF properties behind a caching CDN will be happening; 
we just haven't had the cycles, but the foundation has been laid.
2) Moving PyPI to use cloud storage as its primary backing store (S3, Swift, 
etc) is not really determined; we might opt to move it to a local Gluster 
or Ceph cluster instead and do the origin serving ourselves, since it matters 
much less in light of #1.
3) Most importantly, this has absolutely nothing to do with the current 
discussion.

--Noah

___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Daniel Holth
Would it be wrong to ask for a /complex API at the same time? The
simple api, with 28k package names on one page, is getting a little
silly.
___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Donald Stufft
On Wednesday, February 27, 2013 at 2:56 PM, Aaron Meurer wrote:
 On Wed, Feb 27, 2013 at 12:49 PM, Monty Taylor mord...@inaugust.com wrote:
  
  
  On 02/27/2013 02:47 PM, Aaron Meurer wrote:
   On Wed, Feb 27, 2013 at 11:37 AM, holger krekel hol...@merlinux.eu wrote:
On Wed, Feb 27, 2013 at 19:34 +0100, Lennart Regebro wrote:
 On Wed, Feb 27, 2013 at 5:34 PM, M.-A. Lemburg m...@egenix.com wrote:
  I'm not saying that it's not a good idea to host packages on PyPI,
  but forcing the community into doing this is not a good idea.
  
 
 
 I still don't understand why not. The only reasons I've seen are
 Because they don't want to or because they don't trust PyPI. And
 in the latter case I'm assuming they wouldn't use PyPI at all.
 
 And of course, nobody is forcing anyone, just like nobody is forcing
 you to use PyPI. :-)
 


I understood there is the idea to disable external links within a couple
of months. That does break backward compatibility in a considerable way.

holger
   
   But wouldn't this only be a change in pip/easy_install, not PyPI
   itself? I suppose you could explicitly break the external links by
   having them point to nothing if you are worried about the security or
   if it's some performance issue (that would indeed be a bad
   compatibility break, in case people are using those for other
   purposes). Otherwise, if it's a problem, then just use the old
   version of pip.
   
  
  
  If we don't remove the feature from pypi itself, then it won't help the
  folks for whom its a problem, because there will be no incentive for the
  folks hosting their software that way to actually upload their stuff to
  PyPI - which means that client-side disabling of external_links is
  fairly likely to never be usable.
  
 
 
 How would you remove it from PyPI itself? Would that just require
 changing some urls, so that pip doesn't know where to find stuff any
 more?
 
 

Modify the PyPI software to no longer link to those urls. 
 
 Sorry if this is obvious. I'm not a pip/PyPI developer. Just a
 package maintainer who has been irked several times by
 pip's/PyPI's/easy_install's idiotic external links policy.
 
 


___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Aaron Meurer
On Feb 27, 2013, at 1:01 PM, Donald Stufft donald.stu...@gmail.com wrote:

On Wednesday, February 27, 2013 at 2:56 PM, Aaron Meurer wrote:

On Wed, Feb 27, 2013 at 12:49 PM, Monty Taylor mord...@inaugust.com wrote:



On 02/27/2013 02:47 PM, Aaron Meurer wrote:

On Wed, Feb 27, 2013 at 11:37 AM, holger krekel hol...@merlinux.eu wrote:

On Wed, Feb 27, 2013 at 19:34 +0100, Lennart Regebro wrote:

On Wed, Feb 27, 2013 at 5:34 PM, M.-A. Lemburg m...@egenix.com wrote:

I'm not saying that it's not a good idea to host packages on PyPI,
but forcing the community into doing this is not a good idea.


I still don't understand why not. The only reasons I've seen are
Because they don't want to or because they don't trust PyPI. And
in the latter case I'm assuming they wouldn't use PyPI at all.

And of course, nobody is forcing anyone, just like nobody is forcing
you to use PyPI. :-)


I understood there is the idea to disable external links within a couple
of months. That does break backward compatibility in a considerable way.

holger


But wouldn't this only be a change in pip/easy_install, not PyPI
itself? I suppose you could explicitly break the external links by
having them point to nothing if you are worried about the security or
if it's some performance issue (that would indeed be a bad
compatibility break, in case people are using those for other
purposes). Otherwise, if it's a problem, then just use the old
version of pip.


If we don't remove the feature from pypi itself, then it won't help the
folks for whom its a problem, because there will be no incentive for the
folks hosting their software that way to actually upload their stuff to
PyPI - which means that client-side disabling of external_links is
fairly likely to never be usable.


How would you remove it from PyPI itself? Would that just require
changing some urls, so that pip doesn't know where to find stuff any
more?

Modify the PyPI software to no longer link to those urls.


Right. As I was saying, this would break any other tools that might use
those urls, perhaps for less nefarious purposes. But then again, that's
somewhat speculative. If someone can point out something that uses them,
that will be something to consider, but for now, the main thing we know
uses it is pip (and easy_install), and the whole point is to break them.

Aaron Meurer


Sorry if this is obvious. I'm not a pip/PyPI developer. Just a
package maintainer who has been irked several times by
pip's/PyPI's/easy_install's idiotic external links policy.
___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Daniel Holth
On Wed, Feb 27, 2013 at 3:08 PM, Aaron Meurer asmeu...@gmail.com wrote:
 On Feb 27, 2013, at 1:01 PM, Donald Stufft donald.stu...@gmail.com wrote:

 On Wednesday, February 27, 2013 at 2:56 PM, Aaron Meurer wrote:

 On Wed, Feb 27, 2013 at 12:49 PM, Monty Taylor mord...@inaugust.com wrote:



 On 02/27/2013 02:47 PM, Aaron Meurer wrote:

 On Wed, Feb 27, 2013 at 11:37 AM, holger krekel hol...@merlinux.eu wrote:

 On Wed, Feb 27, 2013 at 19:34 +0100, Lennart Regebro wrote:

 On Wed, Feb 27, 2013 at 5:34 PM, M.-A. Lemburg m...@egenix.com wrote:

 I'm not saying that it's not a good idea to host packages on PyPI,
 but forcing the community into doing this is not a good idea.


 I still don't understand why not. The only reasons I've seen are
 Because they don't want to or because they don't trust PyPI. And
 in the latter case I'm assuming they wouldn't use PyPI at all.

 And of course, nobody is forcing anyone, just like nobody is forcing
 you to use PyPI. :-)


 I understood there is the idea to disable external links within a couple
 of months. That does break backward compatibility in a considerable way.

 holger


 But wouldn't this only be a change in pip/easy_install, not PyPI
 itself? I suppose you could explicitly break the external links by
 having them point to nothing if you are worried about the security or
 if it's some performance issue (that would indeed be a bad
 compatibility break, in case people are using those for other
 purposes). Otherwise, if it's a problem, then just use the old
 version of pip.


 If we don't remove the feature from pypi itself, then it won't help the
 folks for whom its a problem, because there will be no incentive for the
 folks hosting their software that way to actually upload their stuff to
 PyPI - which means that client-side disabling of external_links is
 fairly likely to never be usable.


 How would you remove it from PyPI itself? Would that just require
 changing some urls, so that pip doesn't know where to find stuff any
 more?

 Modify the PyPI software to no longer link to those urls.


 Right. As I was saying, this would break any other tools that might use
 those urls, perhaps for less nefarious purposes. But then again, that's
 somewhat speculative. If someone can point out something that uses them,
 that will be something to consider, but for now, the main thing we know uses
 it is pip (and easy_install), and the whole point is to break them.

 Aaron Meurer


 Sorry if this is obvious. I'm not a pip/PyPI developer. Just a
 package maintainer who has been irked several times by
 pip's/PyPI's/easy_install's idiotic external links policy.

Or just expose a new no external links API the same as the simple
API (pretty sure crate offers this) that will be the default in the
next release of pip, giving people a little more control over when
their packaging tool breaks.
___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread holger krekel
On Wed, Feb 27, 2013 at 14:49 -0500, Monty Taylor wrote:
 On 02/27/2013 02:47 PM, Aaron Meurer wrote:
  On Wed, Feb 27, 2013 at 11:37 AM, holger krekel hol...@merlinux.eu wrote:
  On Wed, Feb 27, 2013 at 19:34 +0100, Lennart Regebro wrote:
  On Wed, Feb 27, 2013 at 5:34 PM, M.-A. Lemburg m...@egenix.com wrote:
  I'm not saying that it's not a good idea to host packages on PyPI,
  but forcing the community into doing this is not a good idea.
 
  I still don't understand why not. The only reasons I've seen are
  Because they don't want to or because they don't trust PyPI. And
  in the latter case I'm assuming they wouldn't use PyPI at all.
 
  And of course, nobody is forcing anyone, just like nobody is forcing
  you to use PyPI. :-)
 
  I understood there is the idea to disable external links within a couple
  of months.  That does break backward compatibility in a considerable way.
 
  holger
  
  But wouldn't this only be a change in pip/easy_install, not PyPI
  itself? I suppose you could explicitly break the external links by
  having them point to nothing if you are worried about the security or
  if it's some performance issue (that would indeed be a bad
  compatibility break, in case people are using those for other
  purposes).  Otherwise, if it's a problem, then just use the old
  version of pip.
 
 If we don't remove the feature from pypi itself, then it won't help the
 folks for whom its a problem, because there will be no incentive for the
 folks hosting their software that way to actually upload their stuff to
 PyPI - which means that client-side disabling of external_links is
 fairly likely to never be usable.

I can see it's tempting to just try to force everyone to upload
their stuff to pypi.python.org.  I am very skeptical about this approach.

There already is a high frustration with the packaging ecology
in Python.  When we remove external links on the server side, installs
for many people and companies are going to break, no matter what.  And
they would have no client-side switch anymore to make things work.
Requiring the use of older setuptools/pip versions would not help because
the server information is gone.  That'd just increase frustration.

So at the very least using external links needs to be a client-side
installer choice for a long while and the server needs to offer
the corresponding information.

I'd generally prefer to think hard about ways to improve the situation
without breaking things.  Putting simple/ and packaging serving on a CDN
is one such step and a good idea i think.  Establishing a
signing/verification mechanism is another.  Refining py2/py3 dependency
discovery yet another good one.

best,
holger
___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Noah Kantrowitz

On Feb 27, 2013, at 12:16 PM, holger krekel wrote:

 On Wed, Feb 27, 2013 at 14:49 -0500, Monty Taylor wrote:
 On 02/27/2013 02:47 PM, Aaron Meurer wrote:
 On Wed, Feb 27, 2013 at 11:37 AM, holger krekel hol...@merlinux.eu wrote:
 On Wed, Feb 27, 2013 at 19:34 +0100, Lennart Regebro wrote:
 On Wed, Feb 27, 2013 at 5:34 PM, M.-A. Lemburg m...@egenix.com wrote:
 I'm not saying that it's not a good idea to host packages on PyPI,
 but forcing the community into doing this is not a good idea.
 
 I still don't understand why not. The only reasons I've seen are
 Because they don't want to or because they don't trust PyPI. And
 in the latter case I'm assuming they wouldn't use PyPI at all.
 
 And of course, nobody is forcing anyone, just like nobody is forcing
 you to use PyPI. :-)
 
 I understood there is the idea to disable external links within a couple
 of months.  That does break backward compatibility in a considerable way.
 
 holger
 
 But wouldn't this only be a change in pip/easy_install, not PyPI
 itself? I suppose you could explicitly break the external links by
 having them point to nothing if you are worried about the security or
 if it's some performance issue (that would indeed be a bad
 compatibility break, in case people are using those for other
 purposes).  Otherwise, if it's a problem, then just use the old
 version of pip.
 
 If we don't remove the feature from pypi itself, then it won't help the
 folks for whom its a problem, because there will be no incentive for the
 folks hosting their software that way to actually upload their stuff to
 PyPI - which means that client-side disabling of external_links is
 fairly likely to never be usable.
 
 I can see it's tempting to just try to force everyone to upload
 their stuff to pypi.python.org.  I am very skeptical about this approach.
 
 There already is a high frustration with the packaging ecology
 in Python.  When we remove external links on the server side, installs
 for many people and companies are going to break, no matter what.  And
 they would have no client-side switch anymore to make things working.
 Requiring to use older setuptools/pip versions would not help because
 the server information is gone.  That'd just increase frustration.
 
 So at the very least using external links needs to be a client-side
 installer choice for a long while and the server needs to offer
 the according information.
 
 I'd generally prefer to think hard about ways to improve the situation
 without breaking things.  Putting simple/ and packaging serving on a CDN
 is one such step and a good idea i think.  Establishing a
 signing/verification mechanism is another.  Refining py2/py3 dependency
 discovery yet another good one.

None of these things have anything to do with improving _this issue_, though 
they would make PyPI better and will be tackled at some point. This is a 
feature that must be removed if we are going to operate a trustable packaging 
network. Today, tomorrow, or six months from now, this feature will be 
going away; the only question is how we get there. Yes, things will break. We 
also broke old users of pypissh a few weeks ago as part of the SSL lockdown; 
this is an acceptable loss since deprecation schedules were made and followed. We 
will not randomly disable these links today; as you said, the right first move 
will be to show a warning (and then an error) in pip/buildout when using these 
links. Donald has already begun that conversation with each of the tool 
developers. We will need a global plan though, an overarching schedule to work 
with pip and buildout (and easy_install if someone does the legwork there) 
about how to announce this removal and how to ensure we break as few people as 
possible over the course of the plan. That is what this discussion is about.
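
The warning-then-error sequence could look something like this on the client side. It is only a sketch under assumptions: the phase names, `check_link`, and `ExternalLinkError` are invented for illustration and are not real pip or buildout options.

```python
import warnings
from urllib.parse import urlparse

INDEX_HOST = "pypi.python.org"


class ExternalLinkError(Exception):
    """Raised once externally hosted links are no longer accepted."""


def check_link(url, phase="warn"):
    """Return True if `url` may be used under the given (hypothetical) phase.

    phase "allow": today's behaviour, anything goes.
    phase "warn":  external links still work but emit a DeprecationWarning.
    phase "error": external links are refused.
    """
    if urlparse(url).hostname == INDEX_HOST:
        return True
    message = f"{url} is hosted outside {INDEX_HOST}"
    if phase == "allow":
        return True
    if phase == "warn":
        warnings.warn(message, DeprecationWarning, stacklevel=2)
        return True
    if phase == "error":
        raise ExternalLinkError(message)
    raise ValueError(f"unknown phase: {phase!r}")


# Links on the index itself are accepted in every phase.
print(check_link("https://pypi.python.org/simple/Package/", phase="error"))
```

Moving the default from "allow" to "warn" to "error" across tool releases is one way to map the schedule onto code, with the dates themselves still to be agreed.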

--Noah



___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Marcus Smith
 As far as I'm concerned, pip is broke too, in the sense that the method we
 use to make pip work in Python 3 is a bit of an annoying hack (namely,
 upload a separate tarball for each minor Python 3 version).


I agree it's a hack.
but only >=1.2 package metadata supports requires-python and nothing is
writing that now (except for wheel).
if newer metadata were pervasive and available on pypi, pip could respond
to it.
I think it would probably automatically start showing up in the json and
xml interfaces?
but would require some changes to expose an html attribute for the simple
interface, which pip currently uses.
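
For illustration, here is roughly what an installer could do with a requires-python value once it is exposed. This is a deliberately simplified sketch: `python_allowed` is a hypothetical helper, it only understands comma-separated clauses with the operators >=, <=, ==, !=, > and <, and it does not implement the full version specifier rules.

```python
import sys

# Two-character operators come first so ">=" is not mistaken for ">".
_OPERATORS = {
    ">=": lambda have, want: have >= want,
    "<=": lambda have, want: have <= want,
    "==": lambda have, want: have == want,
    "!=": lambda have, want: have != want,
    ">": lambda have, want: have > want,
    "<": lambda have, want: have < want,
}


def _parse_version(text):
    """Turn "2.7" into (2, 7). Only plain dotted integers are handled."""
    return tuple(int(part) for part in text.strip().split("."))


def _pad(version, length):
    """Pad with zeros so (3,) and (3, 3, 0) compare component by component."""
    return version + (0,) * (length - len(version))


def python_allowed(requires_python, running=None):
    """Return True if `running` satisfies every clause in `requires_python`."""
    if running is None:
        running = tuple(sys.version_info[:3])
    for clause in requires_python.split(","):
        clause = clause.strip()
        if not clause:
            continue
        for symbol, compare in _OPERATORS.items():
            if clause.startswith(symbol):
                wanted = _parse_version(clause[len(symbol):])
                width = max(len(wanted), len(running))
                if not compare(_pad(running, width), _pad(wanted, width)):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


print(python_allowed(">=2.6,<3", running=(2, 7, 3)))
print(python_allowed(">=2.6,<3", running=(3, 3, 0)))
```

With something like this, a single index entry could say which interpreters a release supports, instead of relying on one upload per Python 3 minor version.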

Marcus
___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Richard Jones
On Feb 28, 2013 2:26 AM, Donald Stufft donald.stu...@gmail.com wrote:
 I propose we deprecate the external links that PyPI has published
 on the /simple/ indexes which exist because of the history of PyPI.

+1
___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Chris Lambacher
On Wed, Feb 27, 2013 at 3:27 PM, Donald Stufft donald.stu...@gmail.com wrote:

 I'm not asking for this to be shutoff immediately, it will be phased,
 particularly so project maintainers can be made aware that it's
 going away and can upload versions to PyPI to prevent this kind of
 wide spread breakage. Particularly the first phase I outlined for
 PyPI was to disable _new_ links from being added to the /simple/
 pages and keeping the old around. So that _old_ releases still work
 for now, but _new_ ones do not.


+1

Here is the critical bit: *new releases*. There is no extra work for
package maintainers until a new release is made. I think most package
maintainers would rather adjust their processes to ensure that users of the
package can access it securely and reliably. It is much easier to concentrate
work on the reliability of PyPI than on that of 100s of individual sites
hosting packages that at this point likely don't even have SSL.

I think most users would rather get the packages from PyPI infrastructure
and as was already posted, new users probably don't realize that
pip/easy_install hits external dependencies.

-Chris
-- 
Christopher Lambacher
ch...@kateandchris.net


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Lennart Regebro
On Wed, Feb 27, 2013 at 8:49 PM, Monty Taylor mord...@inaugust.com wrote:
 But wouldn't this only be a change in pip/easy_install, not PyPI
 itself? I suppose you could explicitly break the external links by
 having them point to nothing if you are worried about the security or
 if it's some performance issue (that would indeed be a bad
 compatibility break, in case people are using those for other
 purposes).  Otherwise, if it's a problem, then just use the old
 version of pip.

 If we don't remove the feature from pypi itself

It isn't a feature of PyPI. PyPI doesn't require you to upload the
files to PyPI. For that reason, easy_install and PIP will scrape
external sites to be able to download the files.

What we should do is agree that this should stop, add a deprecation
warning to pip and easy_install, and after some pre-determined time
remove the feature from easy_install and pip.

 folks for whom it's a problem, because there will be no incentive for the
 folks hosting their software that way to actually upload their stuff to
 PyPI

Yes there will be: Everyone mailing them to tell them their software
is broken and can't be installed with easy_install and pip. That's
going to be very annoying very fast. ;-)

//Lennart


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Monty Taylor


On 02/27/2013 04:04 PM, Lennart Regebro wrote:
 On Wed, Feb 27, 2013 at 8:49 PM, Monty Taylor mord...@inaugust.com wrote:
 But wouldn't this only be a change in pip/easy_install, not PyPI
 itself? I suppose you could explicitly break the external links by
 having them point to nothing if you are worried about the security or
 if it's some performance issue (that would indeed be a bad
 compatibility break, in case people are using those for other
 purposes).  Otherwise, if it's a problem, then just use the old
 version of pip.

 If we don't remove the feature from pypi itself
 
 It isn't a feature of PyPI. PyPI doesn't require you to upload the
 files to PyPI. For that reason, easy_install and PIP will scrape
 external sites to be able to download the files.
 
 What we should do is agree that this should stop, add a deprecation
 warning to pip and easy_install, and after some pre-determined time
 remove the feature from easy_install and pip.

Good point.

 folks for whom it's a problem, because there will be no incentive for the
 folks hosting their software that way to actually upload their stuff to
 PyPI
 
 Yes there will be: Everyone mailing them to tell them their software
 is broken and can't be installed with easy_install and pip. That's
 going to be very annoying very fast. ;-)

++



We could also write an easy utility that a maintainer could run on their
project like:

suck_in my_package

Which would query current pypi for a list of available releases of
my_package, then post them as a direct upload to pypi and finally remove
the external link. That way, once someone annoys them, there's an easy
answer of how to migrate.
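
A minimal sketch of the planning half of such a utility, assuming the list of releases has already been fetched. The `Release` record, its field names, and `plan_migration` are hypothetical names for illustration only; they do not correspond to any real PyPI API, and the actual download and upload steps are left out.

```python
from collections import namedtuple

# Hypothetical record describing one release of a project.
# pypi_files: names of files already hosted on PyPI for this release.
# external_url: where the release file lives off-site, or None if unknown.
Release = namedtuple("Release", "version pypi_files external_url")


def plan_migration(releases):
    """Return (version, external_url) pairs that still need a direct upload."""
    todo = []
    for rel in releases:
        # Already hosted on PyPI: nothing to migrate.
        if rel.pypi_files:
            continue
        # No known external file either: cannot be migrated automatically.
        if not rel.external_url:
            continue
        todo.append((rel.version, rel.external_url))
    return todo


# Example data (made up): one hosted release, one external, one with no files.
releases = [
    Release("1.0", ["my_package-1.0.tar.gz"], None),
    Release("1.1", [], "http://example.com/my_package-1.1.tar.gz"),
    Release("1.2", [], None),
]

for version, url in plan_migration(releases):
    print("would fetch and upload", version, "from", url)
```

Keeping the planning step separate from any network access means a maintainer could review the list before anything is uploaded.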


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread PJ Eby
On Wed, Feb 27, 2013 at 1:34 PM, Lennart Regebro rege...@gmail.com wrote:
 On Wed, Feb 27, 2013 at 5:34 PM, M.-A. Lemburg m...@egenix.com wrote:
 I'm not saying that it's not a good idea to host packages on PyPI,
 but forcing the community into doing this is not a good idea.

 I still don't understand why not. The only reasons I've seen are
 Because they don't want to or because they don't trust PyPI. And
 in the latter case I'm assuming they wouldn't use PyPI at all.

I haven't seen anybody mention it yet, but checkouts of development
versions are a use case that can't currently be addressed without
support for multiple external links.  For example, setuptools itself
offers SVN checkout URLs for two different branches.  I've also seen
in-development packages offered via github or bitbucket checkouts as
well.


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Lennart Regebro
On Wed, Feb 27, 2013 at 9:01 PM, Donald Stufft donald.stu...@gmail.com wrote:
 Modify the PyPI software to no longer link to those urls.

Well, I guess we can remove the software home page and the download
URLs from the simple index.

For example, in PIL's case the simple index looks like this:

1.1.5a1 home_page
1.1.5a1 download_url
1.1.4 home_page
1.1.5 home_page
1.1.5 download_url
1.1.5a2 home_page
1.1.5a2 download_url
1.1.3 home_page
1.1.3 download_url
1.1.6 home_page
1.1.6 download_url

(Each of those is a link)

That results in the following actions from easy_install, where "Process
url:" means it looks at the URL to see if it is a distribution
package or, if it is HTML, whether that page possibly contains links that
could be a distribution package, and "Found link:" means that it found
a distribution package.

Process url: http://pypi.python.org/simple/PIL/
Process url: http://www.pythonware.com/products/pil
Process url: http://effbot.org/zone/pil-changes-115.htm
Process url: http://www.pythonware.com/products/pil/
Process url: http://www.pythonware.com/products/pil
Process url: http://effbot.org/zone/pil-changes-115.htm
Process url: http://www.pythonware.com/products/pil
Process url: http://effbot.org/zone/pil-changes-115.htm
Process url: http://www.pythonware.com/products/pil/
Process url: http://www.pythonware.com/downloads/Imaging-1.1.3.tar.gz
Found link: http://www.pythonware.com/downloads/Imaging-1.1.3.tar.gz
Process url: http://www.pythonware.com/products/pil
Process url: http://effbot.org/downloads/#Imaging
Process url: http://www.pythonware.com/products/pil
Reading http://www.pythonware.com/products/pil
Process url: http://www.pythonware.com/media/css/pythonware.css
Process url: http://www.pythonware.com/index.htm
Process url: http://www.pythonware.com/products/index.htm
Process url: http://www.pythonware.com/library/index.htm
Process url: http://www.pythonware.com/search.htm
Process url: http://www.pythonware.com/daily/index.htm
Process url: http://www.pythonware.com/products/
Process url: http://www.pythonware.com/products/pil/support.htm
Process url: http://www.pythonware.com/products/pil/old.htm
Process url: http://www.pythonware.com/products/pil/license.htm
Process url: http://www.pythonware.com/products/pil/faq.htm
Process url: http://www.djangoproject.com/
Process url: http://www.pythonware.com/products/pil/license.htm
Process url: http://www.pythonware.com/products/pil/#pil117
Process url: mailto:image-...@python.org
Process url: http://mail.python.org/mailman/listinfo/image-sig
Process url: mailto:image-sig-requ...@python.org
Process url: http://effbot.org/downloads/Imaging-1.1.7.tar.gz
Found link: http://effbot.org/downloads/Imaging-1.1.7.tar.gz
Process url: http://effbot.org/downloads/PIL-1.1.7.win32-py2.4.exe
Found link: http://effbot.org/downloads/PIL-1.1.7.win32-py2.4.exe
Process url: http://effbot.org/downloads/PIL-1.1.7.win32-py2.5.exe
Found link: http://effbot.org/downloads/PIL-1.1.7.win32-py2.5.exe
Process url: http://effbot.org/downloads/PIL-1.1.7.win32-py2.6.exe
Found link: http://effbot.org/downloads/PIL-1.1.7.win32-py2.6.exe
Process url: http://effbot.org/downloads/PIL-1.1.7.win32-py2.7.exe
Found link: http://effbot.org/downloads/PIL-1.1.7.win32-py2.7.exe
Process url: http://effbot.org/downloads#pil
Process url: http://effbot.org/downloads/Imaging-1.1.6.tar.gz
Found link: http://effbot.org/downloads/Imaging-1.1.6.tar.gz
Process url: http://effbot.org/downloads/PIL-1.1.6.win32-py2.2.exe
Found link: http://effbot.org/downloads/PIL-1.1.6.win32-py2.2.exe
Process url: http://effbot.org/downloads/PIL-1.1.6.win32-py2.3.exe
Found link: http://effbot.org/downloads/PIL-1.1.6.win32-py2.3.exe
Process url: http://effbot.org/downloads/PIL-1.1.6.win32-py2.4.exe
Found link: http://effbot.org/downloads/PIL-1.1.6.win32-py2.4.exe
Process url: http://effbot.org/downloads/PIL-1.1.6.win32-py2.5.exe
Found link: http://effbot.org/downloads/PIL-1.1.6.win32-py2.5.exe
Process url: http://effbot.org/downloads/PIL-1.1.6.win32-py2.6.exe
Found link: http://effbot.org/downloads/PIL-1.1.6.win32-py2.6.exe
Process url: http://effbot.org/zone/pil-changes-116.htm
Process url: http://effbot.org/zone/python-register.htm
Process url: http://effbot.org/downloads/Imaging-1.1.5.tar.gz
Found link: http://effbot.org/downloads/Imaging-1.1.5.tar.gz
Process url: http://effbot.org/downloads/PIL-1.1.5.win32-py2.1.exe
Found link: http://effbot.org/downloads/PIL-1.1.5.win32-py2.1.exe
Process url: http://effbot.org/downloads/PIL-1.1.5.win32-py2.2.exe
Found link: http://effbot.org/downloads/PIL-1.1.5.win32-py2.2.exe
Process url: http://effbot.org/downloads/PIL-1.1.5.win32-py2.3.exe
Found link: http://effbot.org/downloads/PIL-1.1.5.win32-py2.3.exe
Process url: http://effbot.org/downloads/PIL-1.1.5.win32-py2.4.exe
Found link: http://effbot.org/downloads/PIL-1.1.5.win32-py2.4.exe
Process url: http://effbot.org/downloads/PIL-1.1.5.win32-py2.5.exe
Found link: http://effbot.org/downloads/PIL-1.1.5.win32-py2.5.exe
Process url: 
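
The distinction in the log above, between URLs that are merely processed and URLs that count as a found distribution, can be sketched roughly as follows. This is an illustration only: the extension list and the function names are assumptions, and the real tools apply their own, more involved rules.

```python
from urllib.parse import urlparse

# Assumed set of file extensions that mark a distribution package.
DIST_EXTENSIONS = (".tar.gz", ".tar.bz2", ".tgz", ".zip", ".egg", ".exe")


def looks_like_distribution(url):
    """True if the URL path ends in one of the assumed distribution extensions."""
    path = urlparse(url).path.lower()
    return path.endswith(DIST_EXTENSIONS)


def classify(urls):
    """Split URLs into (found distributions, pages that would be scanned)."""
    found, to_scan = [], []
    for url in urls:
        # Only http(s) links can be fetched; mailto: and similar are skipped.
        if urlparse(url).scheme not in ("http", "https"):
            continue
        if looks_like_distribution(url):
            found.append(url)
        else:
            to_scan.append(url)
    return found, to_scan


found, to_scan = classify([
    "http://effbot.org/downloads/Imaging-1.1.7.tar.gz",
    "http://www.pythonware.com/products/pil/faq.htm",
    "mailto:someone@example.org",
])
print("Found link:", found)
print("Would scan:", to_scan)
```

Every URL in the second list is a further page fetch from a third-party site, which is where the long log comes from.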

Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread PJ Eby
On Wed, Feb 27, 2013 at 4:04 PM, Lennart Regebro rege...@gmail.com wrote:
 On Wed, Feb 27, 2013 at 8:49 PM, Monty Taylor mord...@inaugust.com wrote:
 But wouldn't this only be a change in pip/easy_install, not PyPI
 itself? I suppose you could explicitly break the external links by
 having them point to nothing if you are worried about the security or
 if it's some performance issue (that would indeed be a bad
 compatibility break, in case people are using those for other
 purposes).  Otherwise, if it's a problem, then just use the old
 version of pip.

 If we don't remove the feature from pypi itself

 It isn't a feature of PyPI. PyPI doesn't require you to upload the
 files to PyPI. For that reason, easy_install and PIP will scrape
 external sites to be able to download the files.

 What we should do is agree that this should stop,

So far, I don't think anybody's talking to the right "we" for stopping
it.  It's the tools that control this, not PyPI.  (PyPI can't actually
stop the tools from using this information without also making itself
a lot less useful to *humans* at the same time.)

As far as my personal position on the matter, I think that it's
reasonable to deprecate the scraping of home page and download links.
As somebody pointed out, expired domains are a potentially nasty
problem there.

OTOH, I currently make development snapshots of setuptools and other
projects available by dumping them in a directory that's used as an
external download URL.  Replacing that would be a PITA because PyPI
only lets you upload and register new releases from distutils' command
line.  Basically, I'd need to use a download link that pointed to a
"latest" URL that redirected to the final download.

Anyway, I'm not seeing much discussion here about how to help authors
make changes to their release processes.  Note that many popular and
long-lived projects (pywin32, PIL, etc.) have similar issues.  (Not to
mention the newer projects that host directly from revision control.)

Given that easy_install was deliberately designed so that those guys
would *not* need to change their hosting strategies to get automated
downloads, I'd like to see more talk about how we're going to help
people change their releasing and hosting strategies.


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Lennart Regebro
On Wed, Feb 27, 2013 at 10:17 PM, PJ Eby p...@telecommunity.com wrote:
 I haven't seen anybody mention it yet, but checkouts of development
 versions are a use case that can't currently be addressed without
 support for multiple external links.  For example, setuptools itself
 offers SVN checkout URLs for two different branches.  I've also seen
 in-development packages offered via github or bitbucket checkouts as
 well.

These versions should not be installed unless the installer is
explicitly told to install just those versions, so that is really not
connected to this issue.
You should of course be able to install files both locally and from a
specific URL.

But the development tgz created and hosted on github should IMO never
be installed by just saying easy_install frobnitz or even
easy_install frobnitz==1.3.4dev5

//Lennart


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Noah Kantrowitz

On Feb 27, 2013, at 1:31 PM, PJ Eby wrote:

 On Wed, Feb 27, 2013 at 4:04 PM, Lennart Regebro rege...@gmail.com wrote:
 On Wed, Feb 27, 2013 at 8:49 PM, Monty Taylor mord...@inaugust.com wrote:
 But wouldn't this only be a change in pip/easy_install, not PyPI
 itself? I suppose you could explicitly break the external links by
 having them point to nothing if you are worried about the security or
 if it's some performance issue (that would indeed be a bad
 compatibility break, in case people are using those for other
 purposes).  Otherwise, if it's a problem, then just use the old
 version of pip.
 
 If we don't remove the feature from pypi itself
 
 It isn't a feature of PyPI. PyPI doesn't require you to upload the
 files to PyPI. For that reason, easy_install and PIP will scrape
 external sites to be able to download the files.
 
 What we should do is agree that this should stop,
 
 So far, I don't think anybody's talking to the right we for stopping
 it.  It's the tools that control this, not PyPI.  (PyPI can't actually
 stop the tools from using this information without also making itself
 a lot less useful to *humans* at the same time.)
 
 As far as my personal position on the matter, I think that it's
 reasonable to deprecate the scraping of home page and download links.
 As somebody pointed out, expired domains are a potentially nasty
 problem there.
 
 OTOH, I currently make development snapshots of setuptools and other
 projects available by dumping them in a directory that's used as an
 external download URL.  Replacing that would be a PITA because PyPI
 only lets you upload and register new releases from distutils' command
 line.  Basically, I'd need to use a download link that pointed to a
 latest URL that redirected to the final download.
 
 Anyway, I'm not seeing much discussion here about how to help authors
 make changes to their release processes.  Note that many popular and
 long-lived projects (pywin32, PIL, etc.) have similar issues.  (Not to
 mention the newer projects that host directly from revision control.)
 
 Given that easy_install was deliberately designed so that those guys
 would *not* need to change their hosting strategies to get automated
 downloads, I'd like to see more talk about how we're going to help
 people change their releasing and hosting strategies.

To be honest, either they will adapt or replacements will arise (see also: 
Pillow). PIL is a great example of something that can and _should_ be 
completely broken since it is already 90% broken anyway.

--Noah





Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Donald Stufft
On Wednesday, February 27, 2013 at 4:17 PM, PJ Eby wrote:
 On Wed, Feb 27, 2013 at 1:34 PM, Lennart Regebro rege...@gmail.com wrote:
  On Wed, Feb 27, 2013 at 5:34 PM, M.-A. Lemburg m...@egenix.com wrote:
   I'm not saying that it's not a good idea to host packages on PyPI,
   but forcing the community into doing this is not a good idea.
   
  
  
  I still don't understand why not. The only reasons I've seen are
  Because they don't want to or because they don't trust PyPI. And
  in the latter case I'm assuming they wouldn't use PyPI at all.
  
 
 
 I haven't seen anybody mention it yet, but checkouts of development
 versions are a use case that can't currently be addressed without
 support for multiple external links. For example, setuptools itself
 offers SVN checkout URLs for two different branches. I've also seen
 in-development packages offered via github or bitbucket checkouts as
 well.
 
 

Is this 
http://svn.python.org/projects/sandbox/trunk/setuptools/#egg=setuptools-dev and
http://svn.python.org/projects/sandbox/branches/setuptools-0.6/#egg=setuptools-dev06
 ?

I don't think they belong on the main repo page. Not every project supports
this, and the ones that do use varying names. Is there anything wrong with just
updating your instructions to say, instead of (please replace with easy_install
lingo here) `pip install setuptools==setuptools-dev`, please
`pip install -e http://svn.python.org/projects/sandbox/trunk/setuptools/#egg=setuptools-dev`?

Alternatively, if the extra typing is really not desired, then I'd say let's add
a separate method (/dev/setuptools/ for example?) that only links these external
development URLs, and update the tooling to check there via a --dev flag or
something. I still don't think needing to specify the full URL is a terrible
burden though.


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Łukasz Langa
On 27 Feb 2013, at 21:16, holger krekel hol...@merlinux.eu wrote:

 On Wed, Feb 27, 2013 at 14:49 -0500, Monty Taylor wrote:
 On 02/27/2013 02:47 PM, Aaron Meurer wrote:
 
 If we don't remove the feature from pypi itself, then it won't help the
 folks for whom its a problem, because there will be no incentive for the
 folks hosting their software that way to actually upload their stuff to
 PyPI - which means that client-side disabling of external_links is
 fairly likely to never be usable.
 
 I can see it's tempting to just try to force everyone to upload
 their stuff to pypi.python.org.  I am very skeptical about this approach.

I can totally understand why users would want to force maintainers to upload 
stuff to pypi.python.org after another failed build caused by a dependency on 
third-party infrastructure. While our package index is not perfect, lately it 
seems the main problem is with external packages.

 There already is a high frustration with the packaging ecology
 in Python.  When we remove external links on the server side, installs
 for many people and companies are going to break, no matter what.

As Donald points out, we would only do this for new releases. This would break 
no existing releases for users. Speaking of frustration and breakage though, 
let's say Mercurial or python-memcached isn't available because their website 
is down. Where can you go? Unless you have a pip-cached copy, the answer is too 
often nowhere.

-- 
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev



Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Lennart Regebro
On Wed, Feb 27, 2013 at 10:31 PM, PJ Eby p...@telecommunity.com wrote:
 Replacing that would be a PITA because PyPI
 only lets you upload and register new releases from distutils' command
 line.

You can upload files, but not create new releases. But that seems like
a pretty minor addition, or?

 Anyway, I'm not seeing much discussion here about how to help authors
 make changes to their release processes.  Note that many popular and
 long-lived projects (pywin32, PIL, etc.) have similar issues.

I know I probably have tunnel vision here, but I'm not sure what the
issues are. :-)

//Lennart


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Donald Stufft
On Wednesday, February 27, 2013 at 4:31 PM, PJ Eby wrote:
 So far, I don't think anybody's talking to the right we for stopping
 it. It's the tools that control this, not PyPI. (PyPI can't actually
 stop the tools from using this information without also making itself
 a lot less useful to *humans* at the same time.)

I have issues out for pip and buildout. I didn't have time to find
and make issues for setuptools and distribute, but I plan on doing that
as well. However, PyPI _can_ stop publishing that info on the simple index.
If tooling wants to go out of its way to scrape the human pages, that's
its problem and would be unsupported. By not publishing that
content we draw a clear line between what is and isn't supported for the
tooling to use.
 
 As far as my personal position on the matter, I think that it's
 reasonable to deprecate the scraping of home page and download links.
 As somebody pointed out, expired domains are a potentially nasty
 problem there.
 
 OTOH, I currently make development snapshots of setuptools and other
 projects available by dumping them in a directory that's used as an
 external download URL. Replacing that would be a PITA because PyPI
 only lets you upload and register new releases from distutils' command
 line. Basically, I'd need to use a download link that pointed to a
 latest URL that redirected to the final download.

Development snapshots are a use case that I'm not sure makes sense
for PyPI, but if they do, they should require specific opt-in to install them.
Does easy_install have a command line flag that adds extra links? pip has
--find-links; can your instructions simply state to do the equivalent of
`pip install --find-links=http://setuptools.com/dev-snapshopts/`?

Alternatively I would like to get the tooling smarter about not installing
pre-release versions unless asked as well. So with that the answer
could simply be to make dev releases to PyPI (PyPI will probably
need some sort of "prefer stable" option for its web UI), and have
the tooling prefer stable releases.
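
A rough sketch of what "prefer stable unless asked" could look like on the tooling side. The suffix rule below is a crude assumption for illustration, not the rule any installer actually uses, and `pick_version` and `allow_prereleases` are hypothetical names. The input is assumed to be already ordered oldest to newest, so no version sorting is attempted here.

```python
import re

# Assumption: a version counts as a pre-release if it ends in
# a, b, c, rc or dev, optionally followed by digits (1.2a1, 1.3.4dev5, ...).
_PRERELEASE_SUFFIX = re.compile(r"(?:a|b|c|rc|dev)\d*$")


def is_prerelease(version):
    return bool(_PRERELEASE_SUFFIX.search(version.lower()))


def pick_version(versions_oldest_first, allow_prereleases=False):
    """Return the newest acceptable version, or None if there is none."""
    candidates = [
        v for v in versions_oldest_first
        if allow_prereleases or not is_prerelease(v)
    ]
    return candidates[-1] if candidates else None


versions = ["1.1.5", "1.1.6", "1.2a1"]
print(pick_version(versions))                          # stable only
print(pick_version(versions, allow_prereleases=True))  # explicit opt-in
```

With a rule like this in the installer, a maintainer could upload dev releases to PyPI without them being picked up by default.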
 
 Anyway, I'm not seeing much discussion here about how to help authors
 make changes to their release processes. Note that many popular and
 long-lived projects (pywin32, PIL, etc.) have similar issues. (Not to
 mention the newer projects that host directly from revision control.)

Most of these projects are already running python setup.py register,
so for the vast bulk of them they'll just need to add an sdist upload to that.
 
 Given that easy_install was deliberately designed so that those guys
 would *not* need to change their hosting strategies to get automated
 downloads, I'd like to see more talk about how we're going to help
 people change their releasing and hosting strategies.

Someone has made a comment about making a script to make it easy
to make old versions available on PyPI for authors. I believe for
most people the change should be fairly easy since they are already
registering their releases. However if someone has an odd release
process I'd be willing to try and help them fit the new requirement
into it. 



Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Richard Jones
On 28 February 2013 08:31, PJ Eby p...@telecommunity.com wrote:
 OTOH, I currently make development snapshots of setuptools and other
 projects available by dumping them in a directory that's used as an
 external download URL.  Replacing that would be a PITA because PyPI
 only lets you upload and register new releases from distutils' command
 line.  Basically, I'd need to use a download link that pointed to a
 latest URL that redirected to the final download.

Yup, and the down-side of distutils as the tool for talking to PyPI
is, of course, the horrendous turn-around time trying to add features
or fix bugs.

I've advocated us having the upload/register/whatever functionality in
a separate tool for a while, but that doesn't seem to have gained any
traction. Of course issues around the complexity introduced by
setup.py make it much harder.


In the mean time I think Donald's suggestion for supporting
development pre-releases is reasonable:
 instead of (please replace with easy_install lingo here)
 `pip install setuptools==setuptools-dev` please `pip install -e
 http://svn.python.org/projects/sandbox/trunk/setuptools/#egg=setuptools-dev` ?



Richard


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Lennart Regebro
On Wed, Feb 27, 2013 at 11:48 PM, Richard Jones rich...@python.org wrote:
 I've advocated us having the upload/register/whatever functionality in
 a separate tool for a while, but that doesn't seem to have gained any
 traction. Of course issues around the complexity introduced by
 setup.py make it much harder.

Well, if we break distutils, we would have to make a separate tool.

Is it a problem or is it an opportunity? :-)

//Lennart


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Aaron Meurer
On Wed, Feb 27, 2013 at 2:31 PM, PJ Eby p...@telecommunity.com wrote:
 On Wed, Feb 27, 2013 at 4:04 PM, Lennart Regebro rege...@gmail.com wrote:
 On Wed, Feb 27, 2013 at 8:49 PM, Monty Taylor mord...@inaugust.com wrote:
 But wouldn't this only be a change in pip/easy_install, not PyPI
 itself? I suppose you could explicitly break the external links by
 having them point to nothing if you are worried about the security or
 if it's some performance issue (that would indeed be a bad
 compatibility break, in case people are using those for other
 purposes).  Otherwise, if it's a problem, then just use the old
 version of pip.

 If we don't remove the feature from pypi itself

 It isn't a feature of PyPI. PyPI doesn't require you to upload the
 files to PyPI. For that reason, easy_install and PIP will scrape
 external sites to be able to download the files.

 What we should do is agree that this should stop,

 So far, I don't think anybody's talking to the right we for stopping
 it.  It's the tools that control this, not PyPI.  (PyPI can't actually
 stop the tools from using this information without also making itself
 a lot less useful to *humans* at the same time.)

 As far as my personal position on the matter, I think that it's
 reasonable to deprecate the scraping of home page and download links.
 As somebody pointed out, expired domains are a potentially nasty
 problem there.

 OTOH, I currently make development snapshots of setuptools and other
 projects available by dumping them in a directory that's used as an
 external download URL.  Replacing that would be a PITA because PyPI
 only lets you upload and register new releases from distutils' command
 line.  Basically, I'd need to use a download link that pointed to a
 latest URL that redirected to the final download.

 Anyway, I'm not seeing much discussion here about how to help authors
 make changes to their release processes.  Note that many popular and
 long-lived projects (pywin32, PIL, etc.) have similar issues.  (Not to
 mention the newer projects that host directly from revision control.)

As far as I'm concerned, this is all about helping package
maintainers.  The way pip works now, every time I do a release
candidate, pip automatically installs it, even though I only upload it
to Google Code.  I don't want it to do this, but the only way around
it would be either 1. give it some weird name so that pip doesn't
think it is newer 2. upload it somewhere else or 3. go in to PyPI and
remove all mentions of Google Code from the index.

And by the way, this hasn't been mentioned, but I really mean *all*
mentions of Google Code on PyPI.  pip crawls Google Code not just
because Google Code is listed as an official site for my package or
because the latest release is there, but because a single old release
points there.  So to get pip to not crawl there, I would have to go
through and remove all old mentions of Google Code, even from releases
that were made in 2006.  So you can see why the expired domain
scenario is a very real issue. And combined with the fact that
everyone uses pip with sudo that was discussed on this list a while
back, you have a hacker's dream for installing malicious code on
everyone's computers.

I also had the issue where pip was trying to install our
documentation, because I named it sympy-0.7.1-doc, which it thought
was newer than sympy-0.7.1.  Again, I only uploaded that file to
Google Code, not PyPI.

And currently we have the issue where it tries to install the Python 2
tarball in Python 3, which is partially related to all this (it's all
part of the gathering metadata from the filename instead of the PyPI
classifiers).  If we require that people upload files, we can
additionally only gather metadata from classifiers.  If pip installs
Python 2 code in Python 3, the solution isn't to try to trick it by
some filename mangling (which won't work in easy_install, but oh
well), but rather, just set the classifier for the download like you
were supposed to in the first place, and it will just work.  With this
change if I (the package maintainer) do the right thing, pip does the
right thing.  The way it is now, if I do the right thing, pip does the
wrong thing, and to make pip do the right thing, I have to trick it
into doing so.  So for me at least, the change to the release process
is to stop wasting my time figuring out how to trick pip, and just do
things according to the PyPI classifier API (which I'm already doing
anyway, just pip ignores it), and everything will work.
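
To illustrate the idea of trusting declared classifiers rather than filenames, here is a small sketch. The classifier strings are ordinary trove classifiers, but `supports_major` and the rule "nothing declared means no restriction" are assumptions made for this example, not behaviour of pip or PyPI.

```python
PREFIX = "Programming Language :: Python :: "


def supports_major(classifiers, major):
    """True if the declared classifiers allow the given Python major version."""
    majors = set()
    for classifier in classifiers:
        if not classifier.startswith(PREFIX):
            continue
        rest = classifier[len(PREFIX):]
        # Skip entries such as "Implementation :: CPython".
        if not rest[:1].isdigit():
            continue
        # "2.7" -> "2", "3 :: Only" -> "3"
        majors.add(rest.split()[0].split(".")[0])
    if not majors:
        # Assumption: no Python version declared means no restriction.
        return True
    return str(major) in majors


py2_only = [
    "Programming Language :: Python :: 2",
    "Programming Language :: Python :: 2.7",
]
print(supports_major(py2_only, 3))
print(supports_major(py2_only, 2))
```

An installer running under Python 3 could then skip a download whose classifiers only mention Python 2, without any filename tricks.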

Aaron Meurer


 Given that easy_install was deliberately designed so that those guys
 would *not* need to change their hosting strategies to get automated
 downloads, I'd like to see more talk about how we're going to help
 people change their releasing and hosting strategies.

Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread PJ Eby
On Wed, Feb 27, 2013 at 4:50 PM, Donald Stufft donald.stu...@gmail.com wrote:
 Development snapshots are a use case that i'm not sure makes sense
 for PyPI, but if they do should require specific opt-in to install them.
 Does easy_install have a command line flag that adds extra links?

*chuckle*.  Yes, it's the original source of the --find-links option,
emulated in pip to ease migration.

 can your instructions simply state to do the equivalent of
 `pip install --find-links=http://setuptools.com/dev-snapshopts/`?

The problem with find-links is that if you push them off of PyPI, they
have to go somewhere else, which is setuptools' dependency-links
feature.  Now you have an even *harder* problem to update or remove
those links, because they're not under the control of the author nor
visible on PyPI.


 Alternatively I would like to get the tooling smarter about not installing
 pre-release versions unless asked as well.

Yes, and that discussion doesn't have much to do with PyPI per se,
because again, it's up to the tools.


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Donald Stufft
On Wednesday, February 27, 2013 at 7:08 PM, PJ Eby wrote:
 On Wed, Feb 27, 2013 at 6:16 PM, Aaron Meurer asmeu...@gmail.com wrote:
  As far as I'm concerned, this is all about helping package
  maintainers. The way pip works now, every time I do a release
  candidate, pip automatically installs it, even though I only upload it
  to Google Code. I don't want it to do this, but the only way around
  it would be either 1. give it some weird name so that pip doesn't
  think it is newer 2. upload it somewhere else or 3. go in to PyPI and
  remove all mentions of Google Code from the index.
  
 
 
 There's also a *fourth* way, which I asked the PyPI developers many
 years ago to do, which is to stop including download links on the
 /simple index for hidden (i.e., non-current) releases.
 
 (Something I am still in favor of, btw. Jim Fulton argued against it,
 IIRC, and it ended in a stalemate. However, I don't think we
 discussed distinguishing PyPI downloads from other downloads, just
 getting rid of old links in general)
 
 Frankly, just dropping /simple links for hidden releases would also
 fix a good chunk of expired domain, stale releases, too many
 downloads, etc. In addition, if a project migrates to using PyPI
 uploads, they will not still be subject to external downloads for
 older versions being crawled.
 
 So, if we must do away with the links, I would suggest that the phases be:
 
 1. Remove homepage/download URLs for hidden versions from the
 /simple index altogether (leaving PyPI download links available)
 2. Remove the rel=... attributes from the remaining download and
 home page links (this will stop off-site crawling, but not off-site
 downloading)
 3. Re-evaluate whether anything else actually needs to be removed.
 
 

This seems a bit complicated; people in general don't even know
that the external link spidering exists, much less understand the intricacies
of what types of links get spidered when. A simple "after X date no new
URLs will be added, and after Y date all existing URLs will be removed" removes
ambiguity from the process. Having "this kind of link will get removed on Y,
and that matters in Z conditions" leads to a lot of confusion about
what does and doesn't work.
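
For readers who have not looked at the spidering being discussed, here is a
rough sketch of how the link kinds on a /simple/ page can be told apart. It
is only an illustration: the host set PYPI_HOSTS, the class name, and the
three labels are made up for this example and are not taken from pip or
setuptools.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical: hosts treated as "the index itself" for this illustration.
PYPI_HOSTS = {"pypi.python.org"}


class SimpleIndexLinks(HTMLParser):
    """Collect (href, kind) pairs from a /simple/<project>/ style page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        rel = (attrs.get("rel") or "").split()
        host = urlparse(href).netloc
        if "homepage" in rel or "download" in rel:
            kind = "spidered"   # a page the installer fetches and scans for files
        elif host and host not in PYPI_HOSTS:
            kind = "external"   # a direct link to a file hosted elsewhere
        else:
            kind = "hosted"     # a file served by the index itself
        self.links.append((href, kind))


def classify(html):
    """Return every anchor on the page with its assumed link kind."""
    parser = SimpleIndexLinks()
    parser.feed(html)
    parser.close()
    return parser.links
```

Under this labelling, stopping off-site crawling would affect only the
"spidered" entries, while removing external links entirely would affect the
"external" ones as well.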
 
 Basically, 99% of the complaints here are lumping together all of
 these different kinds of links -- stale links, spidered links, and
 plain external download links -- even though they don't create the
 same sorts of problems. Taking it in stages will give authors time to
 change processes, while still getting rid of the biggest problem
 sources right away (stale homepage/download URLs).
 
 

My complaint is with external URLs at all, for a myriad of reasons, some
specific to particular cases of them, some not.
 
 The first of these changes could be done now, though I'd check with
 Jim about the buildout use case; IIRC it was to allow pinned
 versions. But if the main use cases also had eggs on PyPI rather than
 downloading them from elsewhere, then removing *just* the
 homepage/download links would clean things up nicely, including your
 runaway Google Code downloads, without needing to change any installer
 code that's out in the field right now.


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Aaron Meurer
On Wed, Feb 27, 2013 at 6:24 PM, Donald Stufft donald.stu...@gmail.com wrote:
 On Wednesday, February 27, 2013 at 8:13 PM, PJ Eby wrote:

 On Wed, Feb 27, 2013 at 7:36 PM, Donald Stufft donald.stu...@gmail.com
 wrote:

 This seems a bit complicated, people in general don't even know
 the external link spidering exists, much less understand the intricacies
 of what types of links get spidered when. A simple "after X date no new
 urls will be added and after Y date all existing urls will be removed"
 removes ambiguity from the process. Having "this kind of link will get
 removed Y and that matters in Z conditions" leads to a lot of confusion
 about what does and doesn't work.


 AFAICT, that's an argument in *favor* of phased removals, not against.
 (Also, you have the order backwards from my proposal, which is to
 *first* remove broken old junk in two phases. This is actually *less*
 problematic than doing it for new releases first. And of course the
 simplest thing of all would be to make no change at all.)

 The phased removals are a problem when people won't understand
 the differentiating factors between the different phases.


 Anyway, let's try to be a little bit less like the politicians who,
 upon being told that "Something must be done!", turn around and pick
 any arbitrary value of "something", and do that, so as to be seen to
 be doing something.

 But that is *exactly* what is happening now: people are proposing to
 create worse problems down the line by insisting on doing something
 right now (although never is often better, per the Zen of Python)
 without considering the consequences that will happen six months or so
 from now... when the users and toolmakers move the external links
 someplace else, that will have even *less* visibility,
 maintainability, and trust than they have now.

 This is not something I've just cooked up. It's been thought about since
 I stood up Crate a year ago; in fact there is a /simple/ index on Crate
 that flat out removes external links (as well as all the breakage that
 occurs).

 I'm well aware of the implications here. dependency_links cannot be
 controlled via PyPI (and in fact require a download to even trigger them
 if they are in setup.py), so that problem is outside of the realm of PyPI.
 Like I said, I've already opened issues with pip/buildout about this, and
 I have every intention of seeing them through till completion.

Can you give the links to the issues in their issue trackers, for
those of us who want to follow the progress of this more closely?

Aaron Meurer

 PyPI is one part
 of the overall "remove automatic trolling of links from the index" plan.


 This won't make your problems better, it will actually make them
 *worse*, for the sake of making what is essentially a political
 statement about how seriously the Python community values security.
 (This is especially the case because getting rid of the links won't
 actually get you to a secure system. The *actual* solution is code
 signing... which there is already a PEP for. Get the code signing
 done right, and the external links will be irrelevant.)

 Code signing only solves some problems, and this isn't just about security,
 (although it does play a major part) read my previous emails. Furthermore
 code signing is a larger change *and* it's a lot more difficult to get old
 releases to go back and sign their releases. This improves the overall
 security of these old releases even if we are unable to get them signed.


 Now, I am not saying that something doesn't need to be done, but it
 needs to be considered more carefully than just, "First thing we do,
 let's kill all the links!" A phase-out will not lose anything that
 isn't already lost. (A parallel from Mercurial, when they added SSL
 cert verification: the warnings don't mean things are more insecure
 now, you're just getting informed now of how insecure they *already
 always were*.)




Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Donald Stufft
On Wednesday, February 27, 2013 at 8:34 PM, Aaron Meurer wrote:
 On Wed, Feb 27, 2013 at 6:24 PM, Donald Stufft donald.stu...@gmail.com wrote:
  On Wednesday, February 27, 2013 at 8:13 PM, PJ Eby wrote:
  
  On Wed, Feb 27, 2013 at 7:36 PM, Donald Stufft donald.stu...@gmail.com
  wrote:
  
  This seems a bit complicated, people in general don't even know
  the external link spidering exists, much less understand the intricacies
  of what types of links get spidered when. A simple "after X date no new
  urls will be added and after Y date all existing urls will be removed"
  removes ambiguity from the process. Having "this kind of link will get
  removed Y and that matters in Z conditions" leads to a lot of confusion
  about what does and doesn't work.
  
  
  AFAICT, that's an argument in *favor* of phased removals, not against.
  (Also, you have the order backwards from my proposal, which is to
  *first* remove broken old junk in two phases. This is actually *less*
  problematic than doing it for new releases first. And of course the
  simplest thing of all would be to make no change at all.)
  
  The phased removals are a problem when people won't understand
  the differentiating factors between the different phases.
  
  
  Anyway, let's try to be a little bit less like the politicians who,
  upon being told that "Something must be done!", turn around and pick
  any arbitrary value of "something", and do that, so as to be seen to
  be doing something.
  
  But that is *exactly* what is happening now: people are proposing to
  create worse problems down the line by insisting on doing something
  right now (although never is often better, per the Zen of Python)
  without considering the consequences that will happen six months or so
  from now... when the users and toolmakers move the external links
  someplace else, that will have even *less* visibility,
  maintainability, and trust than they have now.
  
  This is not something I've just cooked up. It's been thought about since
  I stood up Crate a year ago; in fact there is a /simple/ index on Crate
  that flat out removes external links (as well as all the breakage that
  occurs).
  
  I'm well aware of the implications here. dependency_links cannot be
  controlled via PyPI (and in fact require a download to even trigger them
  if they are in setup.py), so that problem is outside of the realm of PyPI.
  Like I said, I've already opened issues with pip/buildout about this, and
  I have every intention of seeing them through till completion.
  
 
 
 Can you give the links to the issues in their issue trackers, for
 those of us who want to follow the progress of this more closely?
 
 

https://github.com/pypa/pip/issues/818

https://github.com/buildout/buildout/issues/92 
 
 Aaron Meurer



Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Marcus Smith
 maintainers.  The way pip works now, every time I do a release
 candidate, pip automatically installs it, even though I only upload it


an option to exclude pre-releases (or in reverse, an option to allow them)
does seem overdue.
reasons not to do this? anyone? links to the most relevant
conversations/posts from the past?
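
A minimal sketch of what such an opt-in could look like inside an installer.
The regular expression and the function names are assumptions made for this
illustration; this is not how pip or easy_install actually decide what a
pre-release is.

```python
import re

# Assumption: a version is a pre-release if it ends in an alpha, beta,
# release-candidate, or dev marker (e.g. "0.7.2rc1", "1.0b2", "1.1.dev3").
_PRERELEASE = re.compile(r"(a|b|c|rc|alpha|beta|pre|dev)\.?\d*$", re.IGNORECASE)


def is_prerelease(version):
    """Return True if the version string ends in a pre-release marker."""
    return bool(_PRERELEASE.search(version))


def candidates(versions, allow_prereleases=False):
    """Return the versions an installer would consider, in input order."""
    if allow_prereleases:
        return list(versions)
    return [v for v in versions if not is_prerelease(v)]
```

With the default, a release candidate that is only reachable through an
external link would be skipped unless the user explicitly asks for it.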



 well), but rather, just set the classifier for the download like you
 were supposed to in the first place, and it will just work.  With this
 change if I (the package maintainer) do the right thing, pip does the
 right thing.  The way it is now, if I do the right thing, pip does the
 wrong thing


it's not clear that trove classifiers are the consensus on how an installer
should know about the python version.
surfacing requires-python in pypi for installers (when metadata-version >=1.2
actually becomes pervasive) seems like the right idea.
but maybe an option to look at classifiers in the short term? not sure.
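
As a sketch of that short-term option, an installer could read the major
Python versions out of the "Programming Language :: Python ::" classifiers.
The function names are hypothetical, and treating a project with no version
classifiers as compatible is an assumption of this example.

```python
PREFIX = "Programming Language :: Python :: "


def supported_majors(classifiers):
    """Return the set of major Python versions named by trove classifiers."""
    majors = set()
    for classifier in classifiers:
        if not classifier.startswith(PREFIX):
            continue
        rest = classifier[len(PREFIX):].strip()
        # "2.7" -> "2", "3" -> "3", "2 :: Only" -> "2", "Implementation ..." -> skipped
        head = rest.split(" ")[0].split(".")[0]
        if head.isdigit():
            majors.add(int(head))
    return majors


def compatible(classifiers, running_major):
    """Decide whether a release claims support for the running major version."""
    majors = supported_majors(classifiers)
    # No version classifiers at all: nothing to go on, so do not exclude.
    return not majors or running_major in majors
```

This only works as well as maintainers keep their classifiers accurate, which
is part of why requires-python looks like the better long-term signal.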

Marcus


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Nick Coghlan
On Thu, Feb 28, 2013 at 6:27 AM, Donald Stufft donald.stu...@gmail.com wrote:
 Sometimes you need to break things. The goal is to do it with ample
 warning and migration time so that people have a chance to move
 to the new way of doing things.

 Again, I am not suggesting we delete all external links immediately, just
 disable new ones. Removing old ones will come later.

This thread is long enough that I'm not sure on where to weigh in.
Here seems appropriate enough.

1. The next generation metadata infrastructure will NOT support
external hosting of files indexed on PyPI - if you don't upload the
archive files to PyPI, they won't be included in the next generation
metadata. If you want external hosting, you will need to run a
separate index (this is similar to the yum model - you can host files
wherever you want, but you need to run yum createrepo yourself to
generate the metadata, and instruct users on how to get their
installers to retrieve your metadata. The big difference between PyPI
and the yum model is that the default index still won't be curated at
all, so there's no review process to get through if you want to use
it, thus less need for external hosting).

2. Near term, with the current generation infrastructure, I think it's
better to approach the problem *very* gently. Our political capital
with users is low at this point, and we need to prioritise what things
we want to make people angry about (whether or not we consider their
anger justified is completely irrelevant). This proposal is for a
transition that would take months. Since I want to have the next
generation metadata up and running within months *anyway*, that means
this strikes me as primarily a distraction from fixing the problem
properly.

3. Various other problems raised in this thread will only be fixed
with next generation metadata that the automated tools can *rely* on
rather than having to guess the intended semantics. That's why PEP 426
is now explicit about pre-release handling, and why it makes version
specifiers like (for example), Requires-Python: 2.6 exclude Python 3
by default. (although the thread does raise an interesting question of
whether or not you can cleanly specify dual Python 2 & 3 support given
the current state of PEP 426)
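
To illustrate point 3, here is one possible reading of a bare specifier such
as 2.6, sketched in Python. The rule used here (at least the stated version,
and matching on every component except the last) is an assumption made for
this example; the authoritative rule is whatever PEP 426 ends up specifying.

```python
def parse(version):
    """Turn "2.7.3" into (2, 7, 3); only plain numeric versions are handled."""
    return tuple(int(part) for part in version.split("."))


def satisfies_bare_specifier(spec, running):
    """Assumed reading of a bare specifier such as "2.6".

    The running version must match every component of the specifier except
    the last, and must be at least the specifier as a whole.
    """
    want, have = parse(spec), parse(running)
    prefix = want[:-1]
    return have[:len(prefix)] == prefix and have >= want
```

Under this reading, "2.6" admits 2.6 and 2.7 but not 3.3, which is the
"exclude Python 3 by default" behaviour described above. It also shows why
dual Python 2 and 3 support is awkward to express with a single bare value.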

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Catalog-sig] Deprecate External Links

2013-02-27 Thread Donald Stufft
On Thursday, February 28, 2013 at 1:39 AM, Nick Coghlan wrote:
 On Thu, Feb 28, 2013 at 6:27 AM, Donald Stufft donald.stu...@gmail.com wrote:
  Sometimes you need to break things. The goal is to do it with ample
  warning and migration time so that people have a chance to move
  to the new way of doing things.
  
  Again, I am not suggesting we delete all external links immediately, just
  disable new ones. Removing old ones will come later.
  
 
 
 This thread is long enough that I'm not sure on where to weigh in.
 Here seems appropriate enough.
 
 1. The next generation metadata infrastructure will NOT support
 external hosting of files indexed on PyPI - if you don't upload the
 archive files to PyPI, they won't be included in the next generation
 metadata. If you want external hosting, you will need to run a
 separate index (this is similar to the yum model - you can host files
 wherever you want, but you need to run yum createrepo yourself to
 generate the metadata, and instruct users on how to get their
 installers to retrieve your metadata. The big difference between PyPI
 and the yum model is that the default index still won't be curated at
 all, so there's no review process to get through if you want to use
 it, thus less need for external hosting).
 
 2. Near term, with the current generation infrastructure, I think it's
 better to approach the problem *very* gently. Our political capital
 with users is low at this point, and we need to prioritise what things
 we want to make people angry about (whether or not we consider their
 anger justified is completely irrelevant). This proposal is for a
 transition that would take months. Since I want to have the next
 generation metadata up and running within months *anyway*, that means
 this strikes me as primarily a distraction from fixing the problem
 properly.
 
 

I'm glad the next set of metadata won't have external links; however,
even if it showed up tomorrow, it's going to be a long time until
people are completely migrated to it. Furthermore, you estimate
months, but the first phase will have positive benefits right away, namely
that it will prompt people to start uploading their packages, increasing
the security and reliability of the current system. And finally, while I'm
glad to see forward movement, it's been said before not to bother
making a fix to the existing system because X was going to happen
soon; in the past it was distutils2/packaging, now it's PEP 426/packaging.
While I have every hope, and I believe it will happen this time, the
past has made me worry about holding off on good incremental
improvements to the current infra.
 
 3. Various other problems raised in this thread will only be fixed
 with next generation metadata that the automated tools can *rely* on
 rather than having to guess the intended semantics. That's why PEP 426
 is now explicit about pre-release handling, and why it makes version
 specifiers like (for example), Requires-Python: 2.6 exclude Python 3
 by default. (although the thread does raise an interesting question of
 whether or not you can cleanly specify dual Python 2 & 3 support given
 the current state of PEP 426)
 
 

Pre-release handling doesn't require anything new
(https://github.com/pypa/pip/issues/820); requires-python will, but that's
a separate issue really.
 
 Cheers,
 Nick.
 
 -- 
 Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
 
 

