Re: [Catalog-sig] API for uploading packages to PyPI
On 22.03.2013 23:33, anatoly techtonik wrote:
> On Fri, Mar 22, 2013 at 4:17 PM, M.-A. Lemburg m...@egenix.com wrote:
>> On 22.03.2013 13:38, anatoly techtonik wrote:
>>> On Fri, Mar 22, 2013 at 3:26 PM, M.-A. Lemburg m...@egenix.com wrote:
>>>> On 22.03.2013 13:20, anatoly techtonik wrote:
>>>>> On Fri, Mar 22, 2013 at 1:49 PM, M.-A. Lemburg m...@egenix.com wrote:
>>>>>> Again: Please don't do this.
>>>>>
>>>>> I think you're not against renaming pages, but against renaming without redirects. In fact, if MoinMoin could automatically insert #REDIRECT directives when a page is renamed, then there wouldn't be any problem like this at all. I hope that pydotorg@ or infrastructure@ have this item on their feature lists.
>>>>
>>>> You can add redirects from the page names you think are more correct to the existing ones, but please don't rename the pages themselves.
>>>
>>> You need to expand that, because I don't get it. Why do you want the canonical pages about the PyPI JSON API to bear the name PyPiJson? This name is hard to synthesize if you want to type it directly into the URL without waiting for the page to load to click a link or use the search field.
>>
>> It's not about which name I want. It's about the name of the page that was used to add content and which has been around long enough to assume that others have linked to it. With the redirect from the new name to the existing one, you get what you want and all others can continue to use the existing name.
>
> All right. So it is a matter of using the old name or the new name. But both names lead to the same page. So the point of conflict here is what the end name of this page should be. If you say that it is not about which name you want, then say why this name should not be the name I want.

The person who created the pages got to choose. There's nothing much to argue here.

> I want canonical names for pages. Names that are consistent, whose capitalization is easy to remember and reproduce, and I want people to link to these names directly to avoid double redirects.
That's fine: for pages that you create, you get to choose.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Mar 23 2013)
Python Projects, Consulting and Support ... http://www.egenix.com/
mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/
mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/

2013-03-13: Released eGenix pyOpenSSL 0.13 ... http://egenix.com/go39

: Try our mxODBC.Connect Python Database Interface for free ! ::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/
___
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig
Re: [Catalog-sig] API for uploading packages to PyPI
On 22.03.2013 10:04, Ronald Oussoren wrote:
> On 22 Mar, 2013, at 9:58, anatoly techtonik techto...@gmail.com wrote:
>> Some links are broken. I added redirects for wiki pages, but it will be better to fix links too.
>
> The OAuth link appears to be broken, and that's likely part of the fallout of the wiki.python.org break-in.

It is broken because of Anatoly's renaming. The new name is http://wiki.python.org/moin/PyPiOauth

Anatoly: I don't consider such renaming for some perceived level of consistency important enough to warrant the breakage you are introducing to external links. Please don't!

--
Marc-Andre Lemburg
Re: [Catalog-sig] API for uploading packages to PyPI
On 22.03.2013 10:14, M.-A. Lemburg wrote:
> On 22.03.2013 10:04, Ronald Oussoren wrote:
>> On 22 Mar, 2013, at 9:58, anatoly techtonik techto...@gmail.com wrote:
>>> Some links are broken. I added redirects for wiki pages, but it will be better to fix links too.
>>
>> The OAuth link appears to be broken, and that's likely part of the fallout of the wiki.python.org break-in.
>
> It is broken because of Anatoly's renaming. The new name is http://wiki.python.org/moin/PyPiOauth

Sorry, that was the old name, which is now gone. The new name is http://wiki.python.org/moin/PyPIOAuth

> Anatoly: I don't consider such renaming for some perceived level of consistency important enough to warrant the breakage you are introducing to external links. Please don't!

--
Marc-Andre Lemburg
Re: [Catalog-sig] API for uploading packages to PyPI
On 22.03.2013 09:58, anatoly techtonik wrote:
> On Fri, Mar 22, 2013 at 11:16 AM, Ronald Oussoren ronaldousso...@mac.com wrote:
>> On 22 Mar, 2013, at 8:37, anatoly techtonik techto...@gmail.com wrote:
>>> Hi,
>>>
>>> I understand that this will make PyPI a potential target for automated spam bots, but still it would be awesome to have an API to upload packages to PyPI.
>>>
>>> For example, I have code that extracts all necessary meta data for the package from the source file itself. It is even able to generate setup.py from this data. https://bitbucket.org/techtonik/astdump
>>>
>>> The next logical step in this chain is to teach it to upload stuff to PyPI. Now I think that this setup.py is an unnecessary complication. What I need, ideally, is just to upload a single .py file, or a JSON and a .tar.gz FWIW. Is there a straightforward API for things like that?

Yes: The distutils upload command implements the API. It essentially uses the same HTML form interface as the PyPI UI.

--
Marc-Andre Lemburg
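To make the "HTML form interface" remark concrete, here is a minimal sketch of how a tool could encode a release as a multipart/form-data request body using only the standard library. It is an illustration, not the distutils implementation: the helper name `encode_multipart`, the package data, and the exact set of form fields are assumptions, and no request is actually sent.

```python
import hashlib
import uuid


def encode_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns a (content_type, body_bytes) pair suitable for an HTTP POST.
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines.append(delimiter)
        lines.append(
            ('Content-Disposition: form-data; name="%s"' % name).encode("utf-8")
        )
        lines.append(b"")
        lines.append(str(value).encode("utf-8"))
    # The file part carries a filename and a generic binary content type.
    lines.append(delimiter)
    lines.append(
        ('Content-Disposition: form-data; name="%s"; filename="%s"'
         % (file_field, filename)).encode("utf-8")
    )
    lines.append(b"Content-Type: application/octet-stream")
    lines.append(b"")
    lines.append(file_bytes)
    lines.append(delimiter + b"--")
    lines.append(b"")
    return "multipart/form-data; boundary=" + boundary, b"\r\n".join(lines)


# Hypothetical release data; the field names below are assumptions
# chosen for illustration and should be checked against the real form.
payload = b"fake sdist bytes"
fields = {
    ":action": "file_upload",
    "name": "example-package",
    "version": "0.1",
    "filetype": "sdist",
    "md5_digest": hashlib.md5(payload).hexdigest(),
}
content_type, body = encode_multipart(
    fields, "content", "example-package-0.1.tar.gz", payload
)
```

The resulting `content_type` and `body` could then be handed to any HTTP client together with credentials; which fields the server requires is something to verify against the upload command's source rather than this sketch.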
Re: [Catalog-sig] API for uploading packages to PyPI
On 22.03.2013 10:16, M.-A. Lemburg wrote:
> On 22.03.2013 10:14, M.-A. Lemburg wrote:
>> On 22.03.2013 10:04, Ronald Oussoren wrote:
>>> On 22 Mar, 2013, at 9:58, anatoly techtonik techto...@gmail.com wrote:
>>>> Some links are broken. I added redirects for wiki pages, but it will be better to fix links too.
>>>
>>> The OAuth link appears to be broken, and that's likely part of the fallout of the wiki.python.org break-in.
>>
>> It is broken because of Anatoly's renaming. The new name is http://wiki.python.org/moin/PyPiOauth
>
> Sorry, that was the old name, which is now gone. The new name is http://wiki.python.org/moin/PyPIOAuth

I added a redirect now to keep the old URL working.

>> Anatoly: I don't consider such renaming for some perceived level of consistency important enough to warrant the breakage you are introducing to external links. Please don't!

--
Marc-Andre Lemburg
Re: [Catalog-sig] API for uploading packages to PyPI
On 22.03.2013 11:25, anatoly techtonik wrote:
> On Fri, Mar 22, 2013 at 12:14 PM, M.-A. Lemburg m...@egenix.com wrote:
>> On 22.03.2013 10:04, Ronald Oussoren wrote:
>>> On 22 Mar, 2013, at 9:58, anatoly techtonik techto...@gmail.com wrote:
>>>> Some links are broken. I added redirects for wiki pages, but it will be better to fix links too.
>>>
>>> The OAuth link appears to be broken, and that's likely part of the fallout of the wiki.python.org break-in.
>>
>> It is broken because of Anatoly's renaming. The new name is http://wiki.python.org/moin/PyPiOauth
>>
>> Anatoly: I don't consider such renaming for some perceived level of consistency important enough to warrant the breakage you are introducing to external links. Please don't!
>
> I renamed PyPIOAuth long before today and fixed all links on the wiki. I don't have any tools to monitor external links in MoinMoin. It would be nice if you added this request to the internal backlog of tasks for the next order for the pydotorg redesign from the PSF.

There's no point in adding more work for everyone just because you feel there's an inconsistency in naming. It's also quite impossible to change all the links on the Internet pointing to our wiki pages, even if you knew who to contact.

Again: Please don't do this.

Thanks,
--
Marc-Andre Lemburg
Re: [Catalog-sig] API for uploading packages to PyPI
On 22.03.2013 13:20, anatoly techtonik wrote:
> On Fri, Mar 22, 2013 at 1:49 PM, M.-A. Lemburg m...@egenix.com wrote:
>> Again: Please don't do this.
>
> I think you're not against renaming pages, but against renaming without redirects. In fact, if MoinMoin could automatically insert #REDIRECT directives when a page is renamed, then there wouldn't be any problem like this at all. I hope that pydotorg@ or infrastructure@ have this item on their feature lists.

You can add redirects from the page names you think are more correct to the existing ones, but please don't rename the pages themselves.

--
Marc-Andre Lemburg
Re: [Catalog-sig] API for uploading packages to PyPI
On 22.03.2013 13:38, anatoly techtonik wrote:
> On Fri, Mar 22, 2013 at 3:26 PM, M.-A. Lemburg m...@egenix.com wrote:
>> On 22.03.2013 13:20, anatoly techtonik wrote:
>>> On Fri, Mar 22, 2013 at 1:49 PM, M.-A. Lemburg m...@egenix.com wrote:
>>>> Again: Please don't do this.
>>>
>>> I think you're not against renaming pages, but against renaming without redirects. In fact, if MoinMoin could automatically insert #REDIRECT directives when a page is renamed, then there wouldn't be any problem like this at all. I hope that pydotorg@ or infrastructure@ have this item on their feature lists.
>>
>> You can add redirects from the page names you think are more correct to the existing ones, but please don't rename the pages themselves.
>
> You need to expand that, because I don't get it. Why do you want the canonical pages about the PyPI JSON API to bear the name PyPiJson? This name is hard to synthesize if you want to type it directly into the URL without waiting for the page to load to click a link or use the search field.

It's not about which name I want. It's about the name of the page that was used to add content and which has been around long enough to assume that others have linked to it.

With the redirect from the new name to the existing one, you get what you want and all others can continue to use the existing name.

--
Marc-Andre Lemburg
Re: [Catalog-sig] Access to Windows' cert store
On 21.03.2013 13:06, Christian Heimes wrote:
> Hi,
>
> the message is slightly off-topic but it might be interesting for pip, setuptools and other developers that are working on HTTPS for PyPI.
>
> A while ago I found C++ example code that shows how to dump CA and CRL certs from Windows' system cert store. The system cert store contains the certificates used by Windows, IE etc.

Why not simply use the Firefox certs? We started adding these to our pyOpenSSL distribution with the last release:

https://cms.egenix.com/products/python/pyOpenSSL/doc/#Module_OpenSSL.ca_bundle

> Yesterday I reimplemented the C++ code with Python and ctypes. I have tested it with Python 2.6 to 3.3 (x86 and x86_64) on Windows 7. It should work with Windows XP / Windows Server 2003 and all newer versions of Windows. The output is usable by Python's SSL module but you have to dump the certs to a file first.

You can set up OpenSSL Contexts to validate based on in-memory certificates as well: just add the certs one by one to the Context using the X509Store object you can obtain using context.get_cert_store().

> I'm planning to add the feature to Python 3.4, too. http://bugs.python.org/issue17134
>
> You can download the code from https://bitbucket.org/tiran/wincertstore

I think this would be a useful addition for pyOpenSSL as well - if it's possible to extract the Windows certificates without admin rights.

--
Marc-Andre Lemburg
Re: [Catalog-sig] Access to Windows' cert store
On 21.03.2013 14:32, Christian Heimes wrote:
> On 21.03.2013 13:58, M.-A. Lemburg wrote:
>> Why not simply use the Firefox certs? We started adding these to our pyOpenSSL distribution with the last release:
>>
>> https://cms.egenix.com/products/python/pyOpenSSL/doc/#Module_OpenSSL.ca_bundle
>
> Sure, that's another viable option. But IIRC some people have raised license concerns.

I think the more problematic aspect is not being able to easily update the CA list. Firefox and Windows do this automatically for you, but for Python, this could only be done with patch level releases.

Still, it's better than not having access to any such CA list, so it would be a good fallback solution.

>> You can set up OpenSSL Contexts to validate based on in-memory certificates as well: just add the certs one by one to the Context using the X509Store object you can obtain using context.get_cert_store().
>
> I assume you are talking about pyOpenSSL? I was referring to Python's SSL module. It can only load CA certs from a file or directory. It would be a useful feature for Python's SSL module, too.

Ah, right.

>> I think this would be a useful addition for pyOpenSSL as well - if it's possible to extract the Windows certificates without admin rights.
>
> The code works without special privileges. The MSDN references don't mention any restrictions, either. The code is rather simple -- I'm only using four functions and three structs.

Nice.

--
Marc-Andre Lemburg
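For readers following the "load CA certs from memory" point, here is a minimal sketch using the standard `ssl` module as it exists in later Python releases (the `cadata` argument and `ssl.enum_certificates()` arrived after this thread, so this is not what the messages above had available). The helper names `context_from_der_certs` and `windows_root_certs` are hypothetical, and the sketch assumes only DER-encoded X.509 entries are wanted.

```python
import ssl


def context_from_der_certs(der_certs):
    """Build a client SSLContext that trusts the given DER-encoded CA certs."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Convert each DER blob to PEM so they can be loaded in one call.
    pem_blocks = [ssl.DER_cert_to_PEM_cert(der) for der in der_certs]
    if pem_blocks:
        context.load_verify_locations(cadata="".join(pem_blocks))
    return context


def windows_root_certs():
    """Return DER blobs from the Windows ROOT store, or [] on other systems."""
    if not hasattr(ssl, "enum_certificates"):
        return []
    return [
        cert
        for cert, encoding, trust in ssl.enum_certificates("ROOT")
        if encoding == "x509_asn"
    ]


context = context_from_der_certs(windows_root_certs())
```

On non-Windows systems the list is empty and the context simply has no extra trust anchors, which keeps the sketch runnable everywhere; a real tool would fall back to a bundled CA list as discussed above.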
Re: [Catalog-sig] PEP 438 implementation on testpypi
On 20.03.2013 19:26, Richard Jones wrote:
> Thanks to Donald Stufft for his implementation of the PEP 438 changes, I've made them live on testpypi.python.org - specifically the urls page of package administration. Please poke and play.

Nice... first tests:

* Going to urls and then clicking on [Change] gives an error:

  Name and version are required
  Name and version are required

  It doesn't matter which choice you select.

* Will there be an RPC interface to register URLs with PyPI? Doing this manually for a large number of files is, well, not ideal :-)

* Adding URLs should do some more tests, I think:

  It's possible to register test#md5=123 (without http/ftp and without providing the full MD5 sum).

  It's possible to register ../test/#md5=123, i.e. point to different files on PyPI itself. Not sure whether this is a bug or feature ;-)

  It's possible to register test#md5=123sha1=123. This is actually a good thing, since it allows implementing the hash tag extensions proposed by Christian Heimes. I'm just mentioning this, so that it becomes a supported feature.

* I'm missing an option:

  [ ] Ask tools to scrape only the Download URL.

  This should result in the download_url being put on the /simple/ index page with rel=download being set.

  Reasoning: This is the designated URL where packages should be downloaded from. With the current list of choices, I'd have to select the last option, which includes the old long description links and the homepage URL.

Other things:
-------------

* Would it be possible to add a link to the corresponding /simple/ index page on the package menu (the one with files, urls, etc.)?

* Could you add a link to the PKG-INFO file from pypi?:action=display_pkginfo to the /simple/ page as version-PKG-INFO (to match the other links)?

Thanks,
--
Marc-Andre Lemburg
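The extra checks suggested in the test report above can be sketched as follows. This is only an illustration of the idea, not what PyPI implements: the function name `check_release_url` is hypothetical, and treating the URL fragment as `&`-separated `name=value` hash tags is an assumption about how the proposed hash tag extension would look.

```python
import re
from urllib.parse import urlsplit

# A full MD5 digest is exactly 32 hexadecimal digits.
MD5_RE = re.compile(r"^[0-9a-fA-F]{32}$")


def check_release_url(url):
    """Return a list of problems found in an externally hosted release URL."""
    problems = []
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https", "ftp"):
        problems.append("URL must be absolute (http, https or ftp)")
    # Assumption: the fragment holds '&'-separated name=value hash tags.
    tags = dict(
        item.split("=", 1) for item in parts.fragment.split("&") if "=" in item
    )
    md5 = tags.get("md5")
    if md5 is None:
        problems.append("fragment has no md5= tag")
    elif not MD5_RE.match(md5):
        problems.append("md5 value is not a full 32-digit hex digest")
    return problems


# The relative URLs from the report are rejected on both counts.
report_examples = ["test#md5=123", "../test/#md5=123"]
results = {url: check_release_url(url) for url in report_examples}
```

Because additional tags in the fragment are simply ignored, a check along these lines would stay compatible with URLs that carry more than one hash.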
Re: [Catalog-sig] PEP 438 implementation on testpypi
On 20.03.2013 20:31, M.-A. Lemburg wrote:
> Other things:
> -------------
>
> * Would it be possible to add a link to the corresponding /simple/ index page on the package menu (the one with files, urls, etc.)?
>
> * Could you add a link to the PKG-INFO file from pypi?:action=display_pkginfo to the /simple/ page as version-PKG-INFO (to match the other links)?

Or even better and more suitable for the CDN...

Have PyPI publish the PKG-INFO under the /simple/ index URL:

/simple/package/version-PKG-INFO

(instead of just setting a link to the /pypi/ page)

--
Marc-Andre Lemburg
Re: [Catalog-sig] PEP 438 implementation on testpypi
On 20.03.2013 21:16, Richard Jones wrote:
> On 20 March 2013 12:31, M.-A. Lemburg m...@egenix.com wrote:
>> * Will there be an RPC interface to register URLs with PyPI? Doing this manually for a large number of files is, well, not ideal :-)
>
> It's just a HTTP POST and there's plans for a tool.

Is this documented somewhere? I'd like to add support for it to our release process.

>> * Adding URLs should do some more tests, I think:
>
> I thought about it, but didn't see any benefit. It's documented...

Hmm, where? :-)

>> * I'm missing an option:
>>
>>   [ ] Ask tools to scrape only the Download URL.
>
> This is not part of the planned implementation. The download_url was never well-specified, and only allows for one URL, hence the implementation we have.

I know it's not in PEP 438 at the moment, but it was one of the nits I mentioned to Holger last week.

It's specified in the meta-data format 1.1 as "A string containing the URL from which this version of the package can be downloaded.":

http://www.python.org/dev/peps/pep-0314/

Having such an option would allow cleaning up the /simple/ index pages a lot, without any changes on the tools side.

It would also be needed for my proposal of securing external downloads, where you point to a hashed download page with the download_url.

>> * Would it be possible to add a link to the corresponding /simple/ index page on the package menu (the one with files, urls, etc.)?
>
> I guess this could be added, yes.

Great.

>> * Could you add a link to the PKG-INFO file from pypi?:action=display_pkginfo to the /simple/ page as version-PKG-INFO (to match the other links)?
>
> We could think about it - what's the use-case?

This would allow tools to easily and safely access meta-data of a package release without downloading, extracting and running the release files' setup.py.

--
Marc-Andre Lemburg
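To illustrate the use case, here is a minimal sketch of how a tool could read meta-data from a published PKG-INFO file without running any setup.py. PKG-INFO uses RFC 822 style headers, so the standard `email` parser can read it; the sample content and the helper name `read_pkg_info` are made up for this example, and fetching the file from an index is left out.

```python
from email.parser import Parser

# Hypothetical PKG-INFO content, used only for this sketch.
SAMPLE_PKG_INFO = """\
Metadata-Version: 1.1
Name: example-package
Version: 0.1
Summary: Hypothetical package used only for this sketch
Home-page: http://example.com/
Download-URL: http://example.com/example-package-0.1.tar.gz
"""


def read_pkg_info(text):
    """Parse PKG-INFO text into a dict mapping field name to a list of values."""
    message = Parser().parsestr(text, headersonly=True)
    metadata = {}
    for key, value in message.items():
        # Some fields may repeat, so every field maps to a list.
        metadata.setdefault(key, []).append(value)
    return metadata


metadata = read_pkg_info(SAMPLE_PKG_INFO)
print(metadata["Name"][0], metadata["Version"][0], metadata["Download-URL"][0])
```

Parsing is purely textual here, so nothing from the release is executed; that is the safety property the request above is after.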
Re: [Catalog-sig] PEP 438 implementation on testpypi
On 20.03.2013 23:01, Richard Jones wrote:
> On 20 March 2013 14:56, M.-A. Lemburg m...@egenix.com wrote:
>> Could you change
>>
>> "The URL must end with the MD5 hash of the file contents"
>>
>> to
>>
>> "The URL must include the MD5 hash of the file contents"?
>>
>> (See my original test report for the reason :-))
>
> Hm. The wording was passed by one of the pip maintainers so I'll defer to them on what the URL format should be.

The format should be defined in PEP 438.

If we adopt the hash tag extensions, then the URL fragment will just start with the md5= part and not necessarily also end with it. pip and easy_install will then have to implement the extension mechanism; and package authors will have to decide whether or not they want to stay compatible with versions of those tools that don't have these implemented.

I was just asking for the text on the page to be in line with what PyPI actually checks.

--
Marc-Andre Lemburg
Re: [Catalog-sig] PEP 438 implementation on testpypi
On 20.03.2013 23:19, Richard Jones wrote:
> On 20 March 2013 15:01, Richard Jones r1chardj0...@gmail.com wrote:
>> On 20 March 2013 14:56, M.-A. Lemburg m...@egenix.com wrote:
>>> Could you change
>>>
>>> "The URL must end with the MD5 hash of the file contents"
>>>
>>> to
>>>
>>> "The URL must include the MD5 hash of the file contents"?
>>>
>>> (See my original test report for the reason :-))
>>
>> Hm. The wording was passed by one of the pip maintainers so I'll defer to them on what the URL format should be.
>
> Having discussed this further offline I've now modified the text as above (with a tweak.)

Thanks.

--
Marc-Andre Lemburg
Re: [Catalog-sig] PEP 438 implementation on testpypi
On 20.03.2013 23:28, Richard Jones wrote:
> On 20 March 2013 14:17, Richard Jones r1chardj0...@gmail.com wrote:
>> On 20 March 2013 13:27, M.-A. Lemburg m...@egenix.com wrote:
>>> On 20.03.2013 21:16, Richard Jones wrote:
>>>> On 20 March 2013 12:31, M.-A. Lemburg m...@egenix.com wrote:
>>>>> * Will there be an RPC interface to register URLs with PyPI? Doing this manually for a large number of files is, well, not ideal :-)
>>>>
>>>> It's just a HTTP POST and there's plans for a tool.
>>>
>>> Is this documented somewhere? I'd like to add support for it to our release process.
>>
>> I'll think about adding this to the PEP.
>
> This is now in the PEP.

Hmm, looks like the PEP update process isn't working on the site:

http://www.python.org/dev/peps/pep-0438/

Last-Modified: 2013-03-15 22:51:25

--
Marc-Andre Lemburg
Re: [Catalog-sig] V4 Pre-PEP: transition to release-file hosting on PYPI
Thanks, Holger. This version looks a lot better :-)

There are still some minor quirks which would need to be addressed more explicitly, but overall, this proposal provides a good way forward.

Perhaps it would also be possible to add the secured download links and the caching/proxying ideas to the PEP at some point, or we turn those into a new PEP.

I can't follow up in detail today, but will have a closer look next week.

On 15.03.2013 10:29, holger krekel wrote:
> Hi all, in particular Philip, Marc-Andre, Donald,
>
> Carl and I decided to simplify the PEP and avoid the somewhat awkward ``simple/-with-externals`` index for various reasons, among them Marc-Andre's criticisms. This also means present-day installation tools (shipped with Redhat/Debian/etc.) will continue to work as today for those packages which remain in a hosting-mode that requires crawling and scraping. They will still benefit from the fact that most packages will soon have a hosting-mode that avoids it. Future releases of installation tools will default to not crawling or using (scraped) external links, and new PyPI projects will default to only serving uploaded files.
>
> The V4 pre-PEP also renames the three PyPI hosting modes to be more descriptive. Since all three modes allow external links, pypi-ext vs pypi-only were misleading. The new naming distinguishes the mode that both scrapes links from metadata and crawls external pages for more links (pypi-scrape-crawl) from the mode that only scrapes links from metadata (pypi-scrape) and from the mode where all links are explicit (pypi-explicit).
>
> Without the separate external index, it also turns out that the two transition phases are separated into PyPI changes (phase one) and installer-tool updates (phase two). There are no PyPI changes necessary in phase two. As stated in a new open question, it should be possible to do PEP-related installation tool updates during phase one; that may still require a bit of clarification in the PEP's language.
Carl and me are happy with this PEP version now and hope you all are as well. Donald is already working on improving the analysis tool so we hopefully have some updated numbers soon. cheers, Holger PEP: XXX Title: Transitioning to release-file hosting on PyPI Version: $Revision$ Last-Modified: $Date$ Author: Holger Krekel hol...@merlinux.eu, Carl Meyer c...@oddbird.net Discussions-To: catalog-sig@python.org Status: Draft (PRE-submit V4) Type: Process Content-Type: text/x-rst Created: 10-Mar-2013 Post-History: Abstract This PEP proposes a backward-compatible two-phase transition process to speed up, simplify and robustify installing from the pypi.python.org (PyPI) package index. To ease the transition and minimize client-side friction, **no changes to distutils or existing installation tools are required in order to benefit from the first transition phase, which will result in faster, more reliable installs for most existing packages**. The first transition phase implements an easy and explicit means for a package maintainer to control which release file links are served to present-day installation tools. The first phase also includes the implementation of analysis tools for present-day packages, to support communication with package maintainers and the automated setting of default modes for controlling release file links. The first phase also will make new projects on PYPI use a default to only serve links to release files which were uploaded to PYPI. The second transition phase concerns end-user installation tools, which shall default to only install release files that are hosted on PyPI and tell the user if external release files exist, offering a choice to automatically use those external files. Rationale = .. _history: History and motivations for external hosting When PyPI went online, it offered release registration but had no facility to host release files itself. When hosting was added, no automated downloading tool existed yet. 
When Philip Eby implemented automated downloading (through setuptools), he made the choice to allow people to use download hosts of their choice. The finding of externally-hosted packages was implemented as follows: #. The PyPI ``simple/`` index for a package contains all links found by scraping them from that package's long_description metadata for any release. Links in the Download-URL and Home-page metadata fields are given ``rel=download`` and ``rel=homepage`` attributes, respectively. #. Any of these links whose target is a file whose name appears to be in the form of an installable source or binary distribution, with name in the form packagename-version.ARCHIVEEXT, is considered a potential installation candidate by installation tools. #. Similarly, any links suffixed with an #egg=packagename-version
Re: [Catalog-sig] V4 Pre-PEP: transition to release-file hosting on PYPI
A little off-topic, but I thought you might enjoy this in the context of all the crypto, hash and signing debate: http://xkcd.com/1181/ Cheers, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 15 2013) Python Projects, Consulting and Support ... http://www.egenix.com/ mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ : Try our mxODBC.Connect Python Database Interface for free ! :: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ ___ Catalog-SIG mailing list Catalog-SIG@python.org http://mail.python.org/mailman/listinfo/catalog-sig
Re: [Catalog-sig] setuptools/distribute/easy_install/pkg_resource sorting algorithm
On 12.03.2013 22:26, PJ Eby wrote: On Tue, Mar 12, 2013 at 3:59 PM, M.-A. Lemburg m...@egenix.com wrote: On 12.03.2013 19:15, M.-A. Lemburg wrote: I've run into a weird issue with easy_install, that I'm trying to solve: If I place two files named egenix_mxodbc_connect_client-2.0.2-py2.6.egg egenix-mxodbc-connect-client-2.0.2.win32-py2.6.prebuilt.zip into the same directory and let easy_install running on Linux scan this, it considers the second file for Windows as best match. Is the algorithm used for determining the best match documented somewhere ? I've had a look at the implementation, but this left me rather clueless. I thought that setuptools would prefer the .egg file over the prebuilt .zip file - binary files being easier to install than source files. After some experiments, I found that the follow change in filename (swapping platform and python version, in addition to use '-' instead of '.) works: egenix-mxodbc-connect-client-2.0.2-py2.6-win32.prebuilt.zip OTOH, this one doesn't (notice the difference ?): egenix-mxodbc-connect-client-2.0.2.py2.6-win32.prebuilt.zip The logic behind all this looks rather fragile to me. easy_install only guarantees sane version parsing for distribution files built using setuptools' naming algorithms. If you use distutils, it can only make guesses, because the distutils does not have a completely unambiguous file naming scheme. And if you are naming the files by hand, God help you. ;-) The problem appears to be a bug in setuptools' package_index.py. The function interpret_distro_name() creates a set of possible separations of the found name into project name and version. It does find the right separation, but for some reason, the code using that function does not check the found project names against the project name the user is trying to install, but simply takes the last entry of the list returned by the above function. 
As a result, easy_install downloads and tries to install project files that don't match the project name in some cases. Here's another example where it fails (say you're on a x64 Linux box): # easy_install egenix-pyopenssl As example, say it finds these distribution files: 'egenix-pyopenssl-0.13.1.1.0.1.5-py2.7_ucs2-linux-x86_64-prebuilt.zip', 'egenix_pyopenssl-0.13.1.1.0.1.5-py2.7-linux-x86_64.egg', 'egenix-pyopenssl-0.13.1.1.0.1.5-py2.7_ucs2-macosx-10.5-x86_64-prebuilt.zip', 'egenix-pyopenssl-0.13.1.1.0.1.5-py2.7_ucs4-macosx-10.5-x86_64-prebuilt.zip', It then creates different interpretations of those names, puts them in a list and sorts them. Here's the end of that list: egenix-pyopenssl; 0.13.1.1.0.1.5 -- this would be the correct .egg file egenix-pyopenssl; 0.13.1.1.0.1.5-py2.7-ucs2-linux-x86-64-prebuilt egenix-pyopenssl; 0.13.1.1.0.1.5-py2.7-ucs2-macosx-10.5-x86-64-prebuilt egenix-pyopenssl; 0.13.1.1.0.1.5-py2.7-ucs4-macosx-10.5-x86-64-prebuilt egenix-pyopenssl-0.13.1.1.0.1.5-py2.7-ucs2-macosx; 10.5-x86-64-prebuilt egenix-pyopenssl-0.13.1.1.0.1.5-py2.7-ucs4-macosx; 10.5-x86-64-prebuilt It picks the last entry, which would be for a project called egenix-pyopenssl-0.13.1.1.0.1.5-py2.7-ucs4-macosx - not the one the user searched. I'm trying to find a way to get it to use the correct .egg file The .egg files does have precedence over the other files, since easy_install regards them as source files with lower precedence. This is important, because the /simple/ index page will have links not only to .egg files, but also to our prebuilt .zip files, which use a source file compatible setup.py interface. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 14 2013) Python Projects, Consulting and Support ... http://www.egenix.com/ mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ : Try our mxODBC.Connect Python Database Interface for free ! 
Re: [Catalog-sig] setuptools/distribute/easy_install/pkg_resource sorting algorithm
On 14.03.2013 17:39, PJ Eby wrote: On Thu, Mar 14, 2013 at 6:07 AM, M.-A. Lemburg m...@egenix.com wrote: On 12.03.2013 22:26, PJ Eby wrote: On Tue, Mar 12, 2013 at 3:59 PM, M.-A. Lemburg m...@egenix.com wrote: On 12.03.2013 19:15, M.-A. Lemburg wrote: I've run into a weird issue with easy_install, that I'm trying to solve: If I place two files named egenix_mxodbc_connect_client-2.0.2-py2.6.egg egenix-mxodbc-connect-client-2.0.2.win32-py2.6.prebuilt.zip into the same directory and let easy_install running on Linux scan this, it considers the second file for Windows as best match. Is the algorithm used for determining the best match documented somewhere ? I've had a look at the implementation, but this left me rather clueless. I thought that setuptools would prefer the .egg file over the prebuilt .zip file - binary files being easier to install than source files. After some experiments, I found that the follow change in filename (swapping platform and python version, in addition to use '-' instead of '.) works: egenix-mxodbc-connect-client-2.0.2-py2.6-win32.prebuilt.zip OTOH, this one doesn't (notice the difference ?): egenix-mxodbc-connect-client-2.0.2.py2.6-win32.prebuilt.zip The logic behind all this looks rather fragile to me. easy_install only guarantees sane version parsing for distribution files built using setuptools' naming algorithms. If you use distutils, it can only make guesses, because the distutils does not have a completely unambiguous file naming scheme. And if you are naming the files by hand, God help you. ;-) The problem appears to be a bug in setuptools' package_index.py. The function interpret_distro_name() creates a set of possible separations of the found name into project name and version. It does find the right separation, but for some reason, the code using that function does not check the found project names against the project name the user is trying to install, but simply takes the last entry of the list returned by the above function. 
As a result, easy_install downloads and tries to install project files that don't match the project name in some cases. Here's another example where it fails (say you're on a x64 Linux box): # easy_install egenix-pyopenssl As example, say it finds these distribution files: 'egenix-pyopenssl-0.13.1.1.0.1.5-py2.7_ucs2-linux-x86_64-prebuilt.zip', 'egenix_pyopenssl-0.13.1.1.0.1.5-py2.7-linux-x86_64.egg', 'egenix-pyopenssl-0.13.1.1.0.1.5-py2.7_ucs2-macosx-10.5-x86_64-prebuilt.zip', 'egenix-pyopenssl-0.13.1.1.0.1.5-py2.7_ucs4-macosx-10.5-x86_64-prebuilt.zip', It then creates different interpretations of those names, puts them in a list and sorts them. Here's the end of that list: egenix-pyopenssl; 0.13.1.1.0.1.5 -- this would be the correct .egg file egenix-pyopenssl; 0.13.1.1.0.1.5-py2.7-ucs2-linux-x86-64-prebuilt egenix-pyopenssl; 0.13.1.1.0.1.5-py2.7-ucs2-macosx-10.5-x86-64-prebuilt egenix-pyopenssl; 0.13.1.1.0.1.5-py2.7-ucs4-macosx-10.5-x86-64-prebuilt egenix-pyopenssl-0.13.1.1.0.1.5-py2.7-ucs2-macosx; 10.5-x86-64-prebuilt egenix-pyopenssl-0.13.1.1.0.1.5-py2.7-ucs4-macosx; 10.5-x86-64-prebuilt It picks the last entry, which would be for a project called egenix-pyopenssl-0.13.1.1.0.1.5-py2.7-ucs4-macosx - not the one the user searched. Actually, that's not quite true. It's picking: egenix-pyopenssl; 0.13.1.1.0.1.5-py2.7-ucs4-macosx-10.5-x86-64-prebuilt Because it thinks that '0.13.1.1.0.1.5-py2.7-ucs4-macosx-10.5-x86-64-prebuilt' is a higher version than 0.13.1.1.0.1.5. It does also record the possibility you mentioned, but it doesn't pick that one. The project names actually *do* have to match. Ah, ok, that makes sense then. Is there any way to have 0.13.1.1.0.1.5-something sort before 0.13.1.1.0.1.5 ? (e.g. 
like is done for release candidates) Ideally, I'd like to get this to work without any changes to setuptools, even though it would of course be better not to take stuff after a Python version marker into account when looking for a package version (since the Python marker is actually a new component in the file name). If you open a ticket on the setuptools tracker, I'll try to see if I can get it to recognize that strings like py2.7, macosx, ucs, and the like are terminators for a version number. I don't know how successful I'll be, though. Basically, those zip files are (I assume) bdist_dumb distributions being taken for source distributions, and easy_install doesn't actually support bdist_dumb files at the moment. If you could point me to that tracker, I'll open a ticket :-) Thanks, -- Marc-Andre Lemburg
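The ambiguity discussed in this thread can be illustrated with a simplified sketch. It is not the setuptools implementation of interpret_distro_name(); the function names below are hypothetical, and the sketch only shows how a dash-separated file name admits several (project, version) readings, and how platform tags end up inside the "version" once the project name is matched.

```python
def candidate_splits(basename):
    """Yield every (project, version) reading of a dash-separated name."""
    parts = basename.split("-")
    for i in range(1, len(parts)):
        yield "-".join(parts[:i]), "-".join(parts[i:])


def normalize(name):
    """Treat '_' and '-' as equivalent and ignore case (an assumption)."""
    return name.replace("_", "-").lower()


def matching_versions(basename, requested):
    """Return the version strings whose project part matches `requested`."""
    return [
        version
        for project, version in candidate_splits(basename)
        if normalize(project) == normalize(requested)
    ]


name = "egenix-pyopenssl-0.13.1.1.0.1.5-py2.7_ucs4-macosx-10.5-x86_64-prebuilt"
for project, version in candidate_splits(name):
    print(project, ";", version)
print(matching_versions(name, "egenix-pyopenssl"))
```

Even after filtering on the requested project name, the remaining "version" still carries the Python and platform markers, which is why it can compare higher than the plain 0.13.1.1.0.1.5 of the .egg file.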
Re: [Catalog-sig] V2 pre-PEP: transitioning to release file hosting on PYPI
On 13.03.2013 07:28, Nick Coghlan wrote: On Tue, Mar 12, 2013 at 12:59 PM, M.-A. Lemburg m...@egenix.com wrote: I think we should establish a versioned API like that for PyPI to make progress easier. All major web APIs use versioning for this reason. Why set up versioning for something we want to phase out? There will never be a simple-v3, so this is really overengineering the proposed change. Who says that we want to phase out the /simple/ index? FWIW, I don't think that two or three small changes to the PyPI server (see my email to Holger) warrant calling this over-engineering. This is about moving forward in a backwards compatible and future proof way. -- Marc-Andre Lemburg
Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files
On 13.03.2013 12:21, holger krekel wrote: Hi all, after some more discussions and hours spent by Carl Meyer (who is now co-authoring the PEP) and me, here is a new V3 pre-submit draft. It is now more ambitious than the previous draft as should be obvious from the modified abstract (and Carl Meyer's and Philip's earlier interactions on this list). There also are more details of how the current link-scraping works among other improvements and incorporations of feedback from discussions here. We intend to submit this draft tonight to the PEP editors. Feedback now and later remains welcome. I am sure there are issues to be sorted and clarified, among them the versioning-API suggestion by Marc-Andre. Thanks for everybody's support and feedback so far, holger PEP: XXX Title: Transitioning to release-file hosting on PyPI Version: $Revision$ Last-Modified: $Date$ Author: Holger Krekel hol...@merlinux.eu, Carl Meyer c...@oddbird.net Discussions-To: catalog-sig@python.org Status: Draft (PRE-submit V3) Type: Process Content-Type: text/x-rst Created: 10-Mar-2013 Post-History: Abstract This PEP proposes a backward-compatible two-phase transition process to speed up, simplify and robustify installing from the pypi.python.org (PyPI) package index. To ease the transition and minimize client-side friction, **no changes to distutils or existing installation tools are required in order to benefit from the transition phases, which is to result in faster, more reliable installs for most existing packages**. The first transition phase implements easy and explicit means for a package maintainer to control which release file links are served to present-day installation tools. The first phase also includes the implementation of analysis tools for present-day packages, to support communication with package maintainers and the automated setting of default modes for controlling release file links.
The second transition phase will result in the current PYPI index only serving PYPI-hosted files by default. Externally hosted files will still be automatically discoverable through a second index. Present-day installation tools will be able to continue working by specifying this second index. New versions of installation tools shall default to only install packages from PYPI unless the user explicitly wishes to include non-PYPI sites. I must say, I don't like this change in motivation compared to V1 and V2. The original goal of the discussion was to make PyPI more secure and the installation process faster and more reliable by moving away from crawling arbitrary external web pages. Both can be had by: * limiting the crawling to package author defined specific URLs, with added hashes to make sure that the URLs and their target content are not modified (this is the "securing external downloads" part - see here for an example: https://pypi.python.org/pypi/egenix-pyopenssl/0.13.1.1.0.1.5), and * adding a way for the package authors to say "PyPI, please go ahead and cache/copy my distribution files" (this is the "increase download reliability" part - can be had by doing opt-in CDN caching/proxying of external links via PyPI) Now, with V3 of the proposal, you are moving towards a system that basically says "do it this way, or stay out of our eco system", which, in my book, is not what the Python eco system is all about. Your V2 was much more inviting in this respect. -- Marc-Andre Lemburg
Re: [Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files
On 13.03.2013 20:08, Donald Stufft wrote: On Mar 13, 2013, at 2:57 PM, M.-A. Lemburg m...@egenix.com wrote: On 13.03.2013 12:21, holger krekel wrote: [V3 proposal] I must say, I don't like this change in motivation compared to V1 and V2. The original goal of the discussion was to make PyPI more secure and the installation process faster and more reliable by moving away from crawling arbitrary external web pages. Both can be had by: * limiting the crawling to package author defined specific URLs, with added hashes to make sure that the URLs and their target content are not modified (this is the "securing external downloads" part - see here for an example: https://pypi.python.org/pypi/egenix-pyopenssl/0.13.1.1.0.1.5), and * adding a way for the package authors to say "PyPI, please go ahead and cache/copy my distribution files" (this is the "increase download reliability" part - can be had by doing opt-in CDN caching/proxying of external links via PyPI) Now, with V3 of the proposal, you are moving towards a system that basically says "do it this way, or stay out of our eco system", which, in my book, is not what the Python eco system is all about. I don't see how? The -with-externals index will still contain all the existing links, and indeed PJ Eby has already stated that setuptools will move to support this index by default but with proper warnings to people so they know they are installing a package off site. This allows existing tools to be moved to a secure by default position. Allows future tools to choose if they want to enable the existing behavior through use of -with-externals (hopefully with a warning or opt-in sort of thing like laid out by PJE, but it's certainly not required). And even allows users of existing tools to opt into the old behavior via the -i option. Maybe I'm missing it but in what way does this force authors to "do it this way or stay out of our eco system" since all the same options are available as there are today?
The proposal marks all external links as evil, and instead of making external links more secure, the user is left with the option to either not enable external links at all, or to let the devil in :-) That's not nice. It's also security theater. The real problem is unreviewed code getting executed by users, or worse, automated build systems. Yet, we let users believe that everything is secured on PyPI. Taking an extreme position, it would probably be better to just leave everything as it is and instead educate users about the risk they are taking with a pip install AngryBirds, signed with keys issued by the PSF on the official PyPI server, delivered straight to your drive via the latest in crypto technology, only to wipe your notebook... But then, I don't like extreme positions, so I would rather incrementally improve the situation both from the server and the client side, both addressing user and author concerns, and keeping the Python eco system a friendly place to be. Your V2 was much more inviting in this respect. -- Marc-Andre Lemburg
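The idea quoted in this thread, external links carrying a hash so that the target content can be checked after download, can be sketched roughly as follows. This assumes the hash travels in the URL fragment as algorithm=hexdigest; the function name, the example URL and the set of accepted algorithms are illustrative assumptions, not something the thread specifies.

```python
import hashlib
from urllib.parse import urldefrag

# Algorithms this sketch accepts (an assumption made for the example).
SUPPORTED = {"md5", "sha1", "sha256", "sha512"}


def verify_download(url, content):
    """Check downloaded bytes against a '#<algorithm>=<hexdigest>' fragment."""
    _, fragment = urldefrag(url)
    algorithm, separator, expected = fragment.partition("=")
    if not separator or algorithm not in SUPPORTED:
        raise ValueError("URL carries no supported hash fragment")
    actual = hashlib.new(algorithm, content).hexdigest()
    return actual == expected.lower()


content = b"example release file"
url = ("http://example.com/example-1.0.tar.gz#sha256="
       + hashlib.sha256(content).hexdigest())
print(verify_download(url, content))         # content matches the hash
print(verify_download(url, content + b"x"))  # modified content does not
```

A check like this only shows that the file is the one the author linked to; as the message above points out, it says nothing about whether the code itself has been reviewed.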
Re: [Catalog-sig] A 90% Solution
On 12.03.2013 03:46, PJ Eby wrote: On Mon, Mar 11, 2013 at 8:28 PM, M.-A. Lemburg m...@egenix.com wrote: On 12.03.2013 00:39, Donald Stufft wrote: On Mar 11, 2013, at 7:04 PM, PJ Eby p...@telecommunity.com wrote: Just a thought, but... If 90% of PyPI projects do not have any external files to download, then, wouldn't it make sense to: To be accurate it's 90% don't have any files/release available *only* externally. Most have external files to download because it's very rare that a project doesn't include an home_page or a download_url, especially since distutils complains if you don't. How are you going to verify that disabling the links on those projects won't make certain release versions of those packages unavailable for pip/easy_install ? I'm not sure if you're asking Donald or me here. I was asking Donald, since he came up with the list. Given that he was using the pip PackageFinder, it is not clear whether this actually covers all easy_install'able packages as well (most likely not, since pip doesn't support e.g. egg files). My proposal was to only automatically disable the rel attributes for links to pages that do *not* contain any easy_install or pip-able download links. So, by definition, this would not make any releases unavailable. Ok. As for what Donald is proposing, I honestly have no idea what he's talking about, or whether the 90% statistic actually applies for what I'm proposing. So it's possible that it might be a lot less than 90% that my proposal would be able to affect *instantly*, without contacting any authors. We'd still need to inform authors that we changed a setting in their package, since they may want to use the feature to host packages or releases off-PyPI again in the future. How are you planing to inform the package authors of that change, so that they can take corrective action ? Which options would be available for authors ? 
Do see my proposal again, which was simply that there be a switch to enable or disable the rel attributes, that it default off for new packages, and be switched to off for exactly that set of packages which would not result in the loss of access to any download files. Yes, I saw that, but was putting up the questions in the context of Donald's idea to remove the links altogether. There is, at this point, the question of how to handle projects that have some of their releases hosted externally, or with some of the files external and some not. I would prefer that any automated changeover apply only to packages where the set of discoverable links is exactly equal to the links found on the project's /simple page. That would be safer, yes. Regarding the links, it's probably better to not remove the rel= attributes but instead change them from rel=download to e.g. rel=external-download; or to keep the old index semantics around as /simple-v1/. This keeps the valuable semantic relation available for tools that want to use it. For what? If you must keep them, rel=disabled-homepage etc. would get the message across. But I really don't see the point, and I *invented* the bloody things. True, but they are now part of the PyPI API and thus cannot be changed or removed easily. The rel= attributes provide extra information to tools using the /simple/ index as (static) API and losing such information would break the API. You're only thinking about installers using the /simple/ API, but there may very well also be e.g. researchers interested in scanning the index for homepages to find out where Python software lives, how the community is connected, which preferences for hosting and developing Python software there are, etc. That's a different context and in that context, the rel= attributes play a different role. 
Removing them would make such research impossible to implement using the /simple/ index and researchers would have to either go with the XML-RPC API (which is slow compared to /simple/, puts a lot of load on the PyPI server and cannot be placed on a CDN) or revert to the old-style scanning of the PyPI package pages. Frankly, I'm more than prepared to toss the rel attributes altogether, after adequate notice is given for people to move their files or links to the files. I just don't want any changes in the *rest* of the /simple generation algorithm. See above. -- Marc-Andre Lemburg
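As an illustration of the kind of tool discussed above, here is a minimal sketch that groups the links of a /simple/ style page by their rel attribute, using only the standard library. The sample HTML and the "file" label for links without a rel attribute are made up for this example; real index pages may differ in detail.

```python
from html.parser import HTMLParser


class RelLinkCollector(HTMLParser):
    """Collect href values of <a> tags, grouped by their rel attribute."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # Links without rel are grouped under "file" (a label chosen here).
        rel = attributes.get("rel") or "file"
        self.links.setdefault(rel, []).append(href)


def links_by_rel(html):
    collector = RelLinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical page content, loosely modelled on a /simple/ project page.
SAMPLE_PAGE = """
<html><body>
<a href="../../packages/source/e/example/example-1.0.tar.gz">example-1.0.tar.gz</a>
<a href="http://example.com/" rel="homepage">1.0 home_page</a>
<a href="http://example.com/downloads/" rel="download">1.0 download_url</a>
</body></html>
"""

print(links_by_rel(SAMPLE_PAGE))
```

A scanner built along these lines depends on the rel values staying available somewhere, which is the backwards-compatibility concern raised in the message.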
Re: [Catalog-sig] A 90% Solution
On 12.03.2013 10:20, Jesse Noller wrote: On Mar 12, 2013, at 3:57 AM, M.-A. Lemburg m...@egenix.com wrote: On 12.03.2013 03:46, PJ Eby wrote: On Mon, Mar 11, 2013 at 8:28 PM, M.-A. Lemburg m...@egenix.com wrote: On 12.03.2013 00:39, Donald Stufft wrote: On Mar 11, 2013, at 7:04 PM, PJ Eby p...@telecommunity.com wrote: Just a thought, but... If 90% of PyPI projects do not have any external files to download, then, wouldn't it make sense to: To be accurate it's 90% don't have any files/release available *only* externally. Most have external files to download because it's very rare that a project doesn't include an home_page or a download_url, especially since distutils complains if you don't. How are you going to verify that disabling the links on those projects won't make certain release versions of those packages unavailable for pip/easy_install ? I'm not sure if you're asking Donald or me here. I was asking Donald, since he came up with the list. Given that he was using the pip PackageFinder, it is not clear whether this actually covers all easy_install'able packages as well (most likely not, since pip doesn't support e.g. egg files). My proposal was to only automatically disable the rel attributes for links to pages that do *not* contain any easy_install or pip-able download links. So, by definition, this would not make any releases unavailable. Ok. As for what Donald is proposing, I honestly have no idea what he's talking about, or whether the 90% statistic actually applies for what I'm proposing. So it's possible that it might be a lot less than 90% that my proposal would be able to affect *instantly*, without contacting any authors. We'd still need to inform authors that we changed a setting in their package, since they may want to use the feature to host packages or releases off-PyPI again in the future. How are you planing to inform the package authors of that change, so that they can take corrective action ? Which options would be available for authors ? 
Do see my proposal again, which was simply that there be a switch to enable or disable the rel attributes, that it default off for new packages, and be switched to off for exactly that set of packages which would not result in the loss of access to any download files. Yes, I saw that, but was putting up the questions in the context of Donald's idea to remove the links altogether. There is, at this point, the question of how to handle projects that have some of their releases hosted externally, or with some of the files external and some not. I would prefer that any automated changeover apply only to packages where the set of discoverable links is exactly equal to the links found on the project's /simple page. That would be safer, yes. Regarding the links, it's probably better to not remove the rel= attributes but instead change them from rel=download to e.g. rel=external-download; or to keep the old index semantics around as /simple-v1/. This keeps the valuable semantic relation available for tools that want to use it. For what? If you must keep them, rel=disabled-homepage etc. would get the message across. But I really don't see the point, and I *invented* the bloody things. True, but they are now part of the PyPI API and thus cannot be changed or removed easily. The rel= attributes provide extra information to tools using the /simple/ index as (static) API and losing such information would break the API. You're only thinking about installers using the /simple/ API, but there may very well also be e.g. researchers interested in scanning the index for homepages to find out where Python software lives, how the community is connected, which preferences for hosting and developing Python software there are, etc. That's a different context and in that context, the rel= attributes play a different role. 
Removing them would make such research impossible to implement using the /simple/ index and researchers would have to either go with the XML-RPC API (which is slow compared to /simple/, puts a lot of load on the PyPI server and cannot be placed on a CDN) or revert to the old-style scanning of the PyPI package pages. So because of hypothetical researchers we can't make the system better. Of course we can, but just like with Python itself, we have to pay attention to backwards compatibility. Not hard to do: we'd just need to keep the old index in place using a different URL, e.g. /simple-v1/. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 12 2013) Python Projects, Consulting and Support ... http://www.egenix.com/ mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ : Try our mxODBC.Connect Python Database Interface
Re: [Catalog-sig] V2 pre-PEP: transitioning to release file hosting on PYPI
On 12.03.2013 12:38, holger krekel wrote: Hi all, below is the new PEP pre-submit version (V2) which incorporates the latest suggestions and aims at a rapidly deployable solution. Thanks in particular to Philip, Donald and Marc-Andre. I also added a few notes on how installers should behave with respect to non-PYPI crawling. I think a PEP-like doc is warranted and that we should not silently change things without proper communication to maintainers and pre-planning the implementation/change process. Arguably, the changes are more invasive than "oh, let's just do an http-to-https redirect" which didn't work too well either. Now, if there is some agreement, I can submit this PEP officially tomorrow, and given agreement/refinements from the Pycon folks and the likes of Richard, we may be able to get going very shortly after Pycon. cheers, holger PEP-draft: transitioning to release-file hosting on PYPI Status --- PRE-SUBMIT-v2 Abstract This PEP proposes a backward-compatible transition process to speed up, simplify and robustify installing from the pypi.python.org (PYPI) package index. The initial transition will put most packages on PYPI automatically in a configuration mode which will prevent client-side crawling from installers. To ease automatic transition and minimize client-side friction, **no changes to distutils or installation tools** are required. Instead, the transition is implemented by modifying PYPI to serve links from ``simple/`` pages in a configurable way, preventing or allowing crawling of non-PYPI sites for detecting release files. Maintainers of all PYPI packages will be notified ahead of those changes. Maintainers of packages which currently are hosted on non-PYPI sites shall receive instructions and tools to ease re-hosting of their historic and future package release files. The implementation of such tools is NOT required for implementing the initial automatic transition.
Installation tools like pip and easy_install shall warn about crawling non-PYPI sites and later default to disallow it and only allow it with an explicit option.

History and motivations for external hosting
--------------------------------------------

When PYPI went online, it offered release registration but had no facility to host release files itself. When hosting was added, no automated downloading tool existed yet. When Philip Eby implemented automated downloading (through setuptools), he made the choice to allow people to use download hosts of their choice. This was implemented by the PYPI ``simple/`` index containing links of type ``rel=homepage`` or ``rel=download`` which are crawled by installation tools to discover package links. As of March 2013, a substantial part of packages (estimated at about 10%) make use of this mechanism to host files on github, bitbucket, sourceforge or own hosting sites like ``mercurial.selenic.com``, to just name a few.

There are many reasons [2]_ why people choose to use external hosting, to cite just a few:

- release processes and scripts have been developed already and upload to external sites
- it takes too long to upload large files from some places in the world
- export restrictions e.g. for crypto-related software
- company policies which prescribe offering open source packages through own sites
- problems with integrating uploading to PYPI into one's release process (because of release policies)
- perceived bad reliability of PYPI
- missing knowledge that you can upload files

Irrespective of the present-day validity of these reasons, there clearly is a history why people choose to host files externally and it even was for some time the only way you could do things.

Problem
-------

**Today, python package installers (pip and easy_install) often need to query non-PYPI sites even if there are no externally hosted files**.
Apart from querying pypi.python.org's simple index pages, also all homepages and download pages ever specified with any release of a package are crawled by an installer. The need for installers to crawl 3rd party sites slows down installation and makes for a brittle unreliable installation process. Those sites and packages also don't take part in the :pep:`381` mirroring infrastructure, further decreasing reliability and speed of automated installation processes around the world. Roughly 90% of packages are hosted directly on pypi.python.org [1]_. Even for them installers still need to crawl the homepage(s) of a package. Many package uploaders are particularly not aware that specifying the homepage in their release process will slow down the installation process for all its users. Relying on third party sites also opens up more attack vectors for injecting malicious packages into sites using
Re: [Catalog-sig] pre-PEP: transition to release-file hosting at pypi site
On 12.03.2013 16:42, Jacob Kaplan-Moss wrote: On Tue, Mar 12, 2013 at 10:38 AM, PJ Eby p...@telecommunity.com wrote: I'll ask it again: why should *thousands* of projects be censored or made to change their release processes, because *you* can't be bothered to cache the distributions of the projects you depend on? Because externally-hosted files are a security risk, one that most users don't realize exists. We can either fix this problem now, or we can wait until someone is compromised using PyPI as a vector. We can fix this problem, yes, but we need to do this right and try not to break things. I don't see the need to rush this, just to address some perceived high risk. Files hosted on PyPI are just as risky to use as files on any other server. The only way to minimize the risk is by downloading all the packages you need, do reviews of all of them and each time a new release is published. If you then point your installers only to the repository where you keep your reviewed files, then you can feel safer. In reality, this doesn't happen, though, so a lot of the stuff we're talking about here is security theater, no matter how much crypto/signing/hashing/hosting/CDN we throw at it :-) So let's do this carefully and find a good solution before jumping to conclusions. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 12 2013) Python Projects, Consulting and Support ... http://www.egenix.com/ mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ : Try our mxODBC.Connect Python Database Interface for free ! :: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ ___ Catalog-SIG mailing list Catalog-SIG@python.org http://mail.python.org/mailman/listinfo/catalog-sig
Re: [Catalog-sig] pre-PEP: transition to release-file hosting at pypi site
On 12.03.2013 17:29, Jacob Kaplan-Moss wrote: On Tue, Mar 12, 2013 at 11:19 AM, M.-A. Lemburg m...@egenix.com wrote:

So let's do this carefully and find a good solution before jumping to conclusions.

Completely agreed; rushing is a bad idea. But so is not starting. What I'm seeing - as a total outsider, a user of these tools, not someone who creates them - is that a bunch of people (Holger, Donald, Richard, the pip maintainers, etc.) have the beginnings of a solution ready to go *right now*, and I want to capture that energy and enthusiasm before it evaporates. This isn't an academic situation; I've seen companies decline to adopt Python over this exact security issue. I can't share details in writing but ask me at PyCon and I can tell you some stories. Externally-hosted packages are a security risk, full stop. There's likely an even better solution involving strong cryptography and such, but there's also an incremental improvement on the table right now. Nobody's suggesting that we do this hastily or all at once, but there *is* a proposal to get the process started right now. Why shouldn't we get going while there's momentum?

Sure; I'm just saying that we need to test drive the proposal before actually adopting it. I'm also trying to get some of the more radical unneeded changes reconsidered. We don't need to break things just because we can - let's leave that to our kids ;-) Holger has already addressed much of this in his V2 proposal and apart from the time frame and some details, it looks good.

Meanwhile, I've been playing around with the earlier proposal I put forward: http://wiki.python.org/moin/PyPI/DownloadMetaDataProposal to secure external links and found several issues while implementing it. It's easy to draw up a design, but you only get down to the problems when actually trying to implement it.

-- Marc-Andre Lemburg, eGenix.com (Mar 12 2013)
Re: [Catalog-sig] V2 pre-PEP: transitioning to release file hosting on PYPI
Just a quick note (more later, if time permits)...

On 12.03.2013 18:05, holger krekel wrote: Hi Marc-Andre, all,

- Prepare PYPI implementation to allow a per-project hosting mode, effectively enabling or disabling external crawling. When enabled nothing changes from the current situation of producing ``rel=download`` and ``rel=homepage`` attributed links on ``simple/`` pages, causing installers to crawl those sites. When disabled, the attributions of links will change to ``rel=newdownload`` and ``rel=newhomepage`` causing installers to avoid crawling 3rd party sites. Retaining the meta-information allows tools to still make use of the semantic information.

Please start using versioned APIs for these things. The old style index should still be available under some URL, e.g. /simple-v1/ or /v1/simple/ or /1/simple/

Not sure it is necessary in this case. I would think it makes the implementation harder and it would probably break PEP381 (mirroring infrastructure) as well.

Here's what I meant: We publish the current implementation of the /simple/ index API under a new URL /simple-v1/, so that people that want to use the old API can continue to do so. Then we set up a new /simple-v2/ index API with your proposed change, perhaps even dropping the rel attribute altogether.

During testing, we'd then have:

/simple/    - same as /simple-v1/
/simple-v1/ - old API with rel attributes always set
/simple-v2/ - new API with your changes (rel attributes only set in some cases)

After a month or so of testing, we then switch this to:

/simple/    - same as /simple-v2/
/simple-v1/ - old API with rel attributes always set
/simple-v2/ - new API with your changes (rel attributes only set in some cases)

-- Marc-Andre Lemburg, eGenix.com (Mar 12 2013)
[Catalog-sig] setuptools/distribute/easy_install/pkg_resource sorting algorithm
I've run into a weird issue with easy_install, that I'm trying to solve: If I place two files named

egenix_mxodbc_connect_client-2.0.2-py2.6.egg
egenix-mxodbc-connect-client-2.0.2.win32-py2.6.prebuilt.zip

into the same directory and let easy_install running on Linux scan this, it considers the second file for Windows as best match.

Is the algorithm used for determining the best match documented somewhere ? I've had a look at the implementation, but this left me rather clueless. I thought that setuptools would prefer the .egg file over the prebuilt .zip file - binary files being easier to install than source files.

Thanks,
-- Marc-Andre Lemburg, eGenix.com (Mar 12 2013)
Re: [Catalog-sig] V2 pre-PEP: transitioning to release file hosting on PYPI
On 12.03.2013 20:17, holger krekel wrote: On Tue, Mar 12, 2013 at 19:07 +0100, M.-A. Lemburg wrote: Just a quick note (more later, if time permits)... On 12.03.2013 18:05, holger krekel wrote: Hi Marc-Andre, all, - Prepare PYPI implementation to allow a per-project hosting mode, effectively enabling or disabling external crawling. When enabled nothing changes from the current situation of producing ``rel=download`` and ``rel=homepage`` attributed links on ``simple/`` pages, causing installers to crawl those sites. When disabled, the attributions of links will change to ``rel=newdownload`` and ``rel=newhomepage`` causing installers to avoid crawling 3rd party sites. Retaining the meta-information allows tools to still make use of the semantic information. Please start using versioned APIs for these things. The old style index should still be available under some URL, e.g. /simple-v1/ or /v1/simple/ or /1/simple/ Not sure it is neccessary in this case. I would think it makes the implementation harder and it would probably break PEP381 (mirroring infrastructure) as well. Here's what I meant: We publish the current implementation of the /simple/ index API under a new URL /simple-v1/, so that people that want to use the old API can continue to do so. Then we setup a new /simple-v2/ index API with your proposed change, perhaps even dropping the rel attribute altogether. During testing, we'd then have: /simple/- same as /simple-v1/ /simple-v1/ - old API with rel attributes always set /simple-v2/ - new API with your changes (rel attributes only set in some cases) After a month or so of testing, we then switch this to: /simple/- same as /simple-v2/ /simple-v1/ - old API with rel attributes always set /simple-v2/ - new API with your changes (rel attributes only set in some cases) I understand but am not sure how easy this is to manage at the moment. I'd like to put this up in open questions and have (eventually) Richard comment on this before evolving it further. 
Should be pretty easy to do... Just add a version parameter to .run_simple() at https://bitbucket.org/loewis/pypi/src/dc6c23cce746bb25e0b013a1a1e020bc27bb332b/webui.py?at=default#cl-706 and then hook it up to the two URLs at https://bitbucket.org/loewis/pypi/src/dc6c23cce746bb25e0b013a1a1e020bc27bb332b/webui.py?at=default#cl-486 and https://bitbucket.org/loewis/pypi/src/dc6c23cce746bb25e0b013a1a1e020bc27bb332b/pypi.wsgi?at=default#cl-71

-- Marc-Andre Lemburg, eGenix.com (Mar 12 2013)
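The staged switch discussed in this thread (/simple/ first aliasing /simple-v1/, later /simple-v2/) could be sketched roughly as below. This is only an illustration under stated assumptions, not the PyPI webui.py code: `resolve_index_version`, `render_link`, `INDEX_VERSIONS` and `DEFAULT_VERSION` are hypothetical names, and the v2 behaviour shown (emit rel only when crawling is enabled for the project) is one reading of "rel attributes only set in some cases".

```python
from html import escape

# Hypothetical mapping of versioned URL prefixes to index API versions.
INDEX_VERSIONS = {
    "simple-v1": 1,  # old API: rel attributes always set
    "simple-v2": 2,  # new API: rel attributes only set in some cases
}

# Version served under the unversioned /simple/ URL.
# Assumption: this would be flipped from 1 to 2 after the test period.
DEFAULT_VERSION = 1


def resolve_index_version(path, default_version=DEFAULT_VERSION):
    """Return (api_version, project_name) for a path like '/simple-v2/foo/'."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        raise ValueError("not an index path: %r" % path)
    prefix, rest = parts[0], parts[1:]
    if prefix == "simple":
        version = default_version
    elif prefix in INDEX_VERSIONS:
        version = INDEX_VERSIONS[prefix]
    else:
        raise ValueError("unknown index: %r" % prefix)
    project = rest[0] if rest else None
    return version, project


def render_link(url, rel, version, crawl_enabled):
    """Render one anchor for a project page of the given API version."""
    href = escape(url, quote=True)
    if rel and (version == 1 or crawl_enabled):
        return '<a href="%s" rel="%s">%s</a>' % (href, escape(rel, quote=True), href)
    return '<a href="%s">%s</a>' % (href, href)
```

With this shape, the switch after the test period is a one-line change of the default, while clients that pin /simple-v1/ keep the old semantics.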
Re: [Catalog-sig] V2 pre-PEP: transitioning to release file hosting on PYPI
On 12.03.2013 20:46, PJ Eby wrote: On Tue, Mar 12, 2013 at 2:07 PM, M.-A. Lemburg m...@egenix.com wrote: Just a quick note (more later, if time permits)... On 12.03.2013 18:05, holger krekel wrote: Hi Marc-Andre, all,

- Prepare PYPI implementation to allow a per-project hosting mode, effectively enabling or disabling external crawling. When enabled nothing changes from the current situation of producing ``rel=download`` and ``rel=homepage`` attributed links on ``simple/`` pages, causing installers to crawl those sites. When disabled, the attributions of links will change to ``rel=newdownload`` and ``rel=newhomepage`` causing installers to avoid crawling 3rd party sites. Retaining the meta-information allows tools to still make use of the semantic information.

Please start using versioned APIs for these things. The old style index should still be available under some URL, e.g. /simple-v1/ or /v1/simple/ or /1/simple/

Not sure it is necessary in this case. I would think it makes the implementation harder and it would probably break PEP381 (mirroring infrastructure) as well.

Here's what I meant: We publish the current implementation of the /simple/ index API under a new URL /simple-v1/, so that people that want to use the old API can continue to do so.

Do you know of anyone who's *actually* going to need/use this alternate API?

I think we should establish a versioned API like that for PyPI to make progress easier. All major web APIs use versioning for this reason.

Why can't they just use the XML-RPC API, the DOAP API, or any other means of obtaining this information?

Those cannot easily be put on the CDN and would cause an unnecessary strain on the PyPI server. We could/should probably also make the PKG-INFO meta data file, plus some other static information such as upload/release dates (as RSS/Atom file) available on the /simple/ page to make this easier to use over the CDN.

-- Marc-Andre Lemburg, eGenix.com (Mar 12 2013)
Re: [Catalog-sig] setuptools/distribute/easy_install/pkg_resource sorting algorithm
On 12.03.2013 19:15, M.-A. Lemburg wrote:

I've run into a weird issue with easy_install, that I'm trying to solve: If I place two files named

egenix_mxodbc_connect_client-2.0.2-py2.6.egg
egenix-mxodbc-connect-client-2.0.2.win32-py2.6.prebuilt.zip

into the same directory and let easy_install running on Linux scan this, it considers the second file for Windows as best match. Is the algorithm used for determining the best match documented somewhere ? I've had a look at the implementation, but this left me rather clueless. I thought that setuptools would prefer the .egg file over the prebuilt .zip file - binary files being easier to install than source files.

After some experiments, I found that the following change in filename (swapping platform and python version, in addition to using '-' instead of '.') works:

egenix-mxodbc-connect-client-2.0.2-py2.6-win32.prebuilt.zip

OTOH, this one doesn't (notice the difference ?):

egenix-mxodbc-connect-client-2.0.2.py2.6-win32.prebuilt.zip

The logic behind all this looks rather fragile to me.

Thanks,
-- Marc-Andre Lemburg, eGenix.com (Mar 12 2013)
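A possible explanation for the three file names above, offered as an assumption rather than a confirmed diagnosis: setuptools' package_index appears to treat a .zip as a potential source distribution, split its base name on '-' into every possible name/version pair, and skip the file only if one of the later parts looks exactly like a Python tag such as py2.6. The sketch below is a simplified standalone model of that heuristic; `candidate_splits` is a hypothetical helper, not a setuptools function, and the real code may differ in detail.

```python
ARCHIVE_EXTENSIONS = (".tar.gz", ".tar.bz2", ".tgz", ".zip")


def candidate_splits(filename):
    """Return the (name, version) guesses for an archive file name.

    Simplified model (assumption): strip the archive extension, split the
    rest on '-', and return no candidates at all if any part after the
    second one is exactly of the form 'py2.x'.
    """
    lowered = filename.lower()
    for ext in ARCHIVE_EXTENSIONS:
        if lowered.endswith(ext):
            base = filename[:-len(ext)]
            break
    else:
        return []  # not an archive type this model knows about

    parts = base.split("-")
    for part in parts[2:]:
        if len(part) == 5 and part.startswith("py2."):
            return []  # looks like a binary "dumb" archive, so it is ignored

    # Every possible split point is kept as an alternative interpretation.
    return [("-".join(parts[:i]), "-".join(parts[i:]))
            for i in range(1, len(parts) + 1)]


if __name__ == "__main__":
    for name in (
        "egenix-mxodbc-connect-client-2.0.2.win32-py2.6.prebuilt.zip",
        "egenix-mxodbc-connect-client-2.0.2-py2.6-win32.prebuilt.zip",
        "egenix-mxodbc-connect-client-2.0.2.py2.6-win32.prebuilt.zip",
    ):
        print(name, "->", len(candidate_splits(name)), "candidate(s)")
```

Under this model only the second name contains a part that is exactly py2.6, which would match the observation that it is the one variant left out of the match; the other two stay in the running as source candidates with odd-looking version strings.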
Re: [Catalog-sig] pre-PEP: transition to release-file hosting at pypi site
On 11.03.2013 09:18, Lennart Regebro wrote: On Mon, Mar 11, 2013 at 9:06 AM, Ronald Oussoren ronaldousso...@mac.com wrote: But this isn't necessarily true, there is another solution: mirror your requirements locally. I do that. This is not a solution, because your requirements yesterday is not your requirements tomorrow. Is it even clear why numerous archives aren't hosted on PyPI? No, the only one that has mentioned why is Marc-André, I think, whose eGenix packages are distributed as binary packages for loads of different platforms. It's unclear to me if all these binary packages should be uploaded to PyPI, and it is also unclear to me why they can't be, it seems to be mostly a case of it being too much work. I've listed all the reasons in one of the previous emails: http://mail.python.org/pipermail/catalog-sig/2013-March/005502.html Others will likely have additional reasons, like e.g. * the PyPI uploads not being compatible to their release process * not knowing that it's possible to host files on PyPI - after all it's an *index*, not a repository :-) * still believing that PyPI is an unreliable hosting provider due the many downtimes and problems it had in the past - which is no longer true today * not wanting to host and maintain files in several different places * not wanting to host release files at all, i.e. have people check out the version from a repository instead of doing the download, unzip, install dance * not wanting to separate associated library or product code from the Python wrapper code (think e.g. 
the Python interface for subversion) * not being allowed to upload files to external servers by company policy, or having to deal with a company policy that makes this difficult/unattractive * having issues with the added latency of PyPI downloads compared to a simple file based index hosted on a company web server * having a strong need to know the number of downloads per package and associated statistics such as downloads per country, per year/month/day/hour * not wanting to give up access to the download log files * having a requirement to restrict downloads on a per country basis, e.g. for export controlled software or software which may not be imported/used in certain countries * having PyPI not provide the technical means to host the release files, e.g. due to the releases using a format which is not supported by PyPI (e.g. all the ActiveState packages - http://code.activestate.com/pypm/) * user experience/support issues: if the package has external dependencies, or needs special setup, it may provide a better user experience to host the Python wrapper on the same page as the dependencies and instructions on how to install them; rather than having them on PyPI which lets people believe that a simple pip install something will get them a working setup Those are just a few things that come to mind. I'm sure there are more issues that keep authors from uploading their packages to PyPI. Overall, I think we should encourage people to make their code available through PyPI and make it attractive to them, but keep the possibility to continue using external hosting platforms, should they run into issues that PyPI cannot solve for them. He also mentioned the big Python distributions eGenix does as being too large for PyPI, but I don't really see the point of uploading Python distributions to PyPI, they can't be installed with Python installers anyway. Not sure what you mean here. 
PyPI is also used to index Python projects which are not Python packages to be installed by pip/easy_install/etc. Some of those may also want to IMHO it would be better to remove barriers than force projects to host files on PyPI. Nobody has really been able to point out any real barriers, so we don't know what they are or if they exist. Again, please see the email where I listed the ones affecting at least eGenix. Most of those can be addressed in one way or another, e.g. by having PyPI cache the files, provide access to the download counts by country, provide a way to host separate indexes for UCS2/UCS4 egg files, etc. The only issues that need more investigation are the PyPI license terms and the general issue of not being able to host export regulated files on PyPI. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Mar 11 2013) Python Projects, Consulting and Support ... http://www.egenix.com/ mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ : Try our mxODBC.Connect Python Database Interface for free ! :: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB
Re: [Catalog-sig] A 90% Solution
On 12.03.2013 00:39, Donald Stufft wrote: On Mar 11, 2013, at 7:04 PM, PJ Eby p...@telecommunity.com wrote: Just a thought, but... If 90% of PyPI projects do not have any external files to download, then, wouldn't it make sense to: To be accurate it's 90% don't have any files/release available *only* externally. Most have external files to download because it's very rare that a project doesn't include an home_page or a download_url, especially since distutils complains if you don't. How are you going to verify that disabling the links on those projects won't make certain release versions of those packages unavailable for pip/easy_install ? How are you planing to inform the package authors of that change, so that they can take corrective action ? Which options would be available for authors ? PyPI is a much too important Python resource to play around with. We need a good understanding of the effects a change may have and provide ways to deal with them, before putting a change, which potentially breaks hundreds of packages, into production. So yeah, just a thought ;-) 1. Add a project-level option to enable or disable the adding of the rel= attribute to /simple links (but not affecting the links in any other way) 2. Default it to disabled for new projects, and 3. Set it to disabled *now* for the 90% of projects that *don't have external files*? +1 except 1. should be to remove the links entirely from the /simple/ index, not to just remove the rel attribute. Removing those links removes the possibility of tools to still download or display information based on those links, e.g. to build a semantic web of Python resources. Please remember that the /simple/ index is part of the PyPI API, so it needs to be handled with the same care as the rest of the PyPI APIs. If you want to experiment with new ways of building the index, I'd suggest to first experiment with a new index, say /simple-v2/, before touching the main /simple/ index. 
Regarding the links, it's probably better to not remove the rel= attributes but instead change them from rel=download to e.g. rel=external-download; or to keep the old index semantics around as /simple-v1/. This keeps the valuable semantic relation available for tools that want to use it.

-- Marc-Andre Lemburg, eGenix.com (Mar 12 2013)
Re: [Catalog-sig] A 90% Solution
On 12.03.2013 01:23, Donald Stufft wrote: On Mar 11, 2013, at 8:12 PM, PJ Eby p...@telecommunity.com wrote: On Mon, Mar 11, 2013 at 7:39 PM, Donald Stufft don...@stufft.io wrote: On Mar 11, 2013, at 7:04 PM, PJ Eby p...@telecommunity.com wrote:

Just a thought, but... If 90% of PyPI projects do not have any external files to download, then, wouldn't it make sense to:

To be accurate it's 90% don't have any files/release available *only* externally. Most have external files to download because it's very rare that a project doesn't include a home_page or a download_url, especially since distutils complains if you don't.

So what is the % of projects for whom the option can be disabled automatically, *without* disabling automated downloadability of a project's externally hosted files? Your statement is confusing to me, because the having of a home page or download URL doesn't have anything to do with whether that page has any files to download from it.

I didn't differentiate between spidering or direct links to external files. I simply iterated over all files that the pip PackageFinder was able to find, figured out the version for each url, and stored whether that version came from a link to a pypi.python.org resource or a different domain. I then diffed the two lists to get a list of versions that are _only_ installable externally. That 90% is 90% who can have *all* links whatsoever besides ones hosted on PyPI itself removed and not have any versions be no longer installable.

Which kinds of distribution files can pip's PackageFinder find ? Does it find MSIs, EXEs, egg files ? AFAIK, it only supports .tar.gz and .zip files, but no binary files (except for the new .whl binary format).

-- Marc-Andre Lemburg, eGenix.com (Mar 12 2013)
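The comparison described in the quoted message (collect versions per link, split them by host, diff the two lists) boils down to a set difference. A minimal sketch with made-up input follows; `externally_only` and `PYPI_HOST` are hypothetical names, and the code does not use pip's PackageFinder, it only assumes that some finder has already produced (version, url) pairs.

```python
from urllib.parse import urlparse

PYPI_HOST = "pypi.python.org"


def externally_only(links):
    """Return the versions that are reachable only from non-PyPI hosts.

    `links` is an iterable of (version, url) pairs, as a link finder
    might report them for a single project.
    """
    on_pypi, external = set(), set()
    for version, url in links:
        host = urlparse(url).hostname
        if host == PYPI_HOST:
            on_pypi.add(version)
        else:
            external.add(version)
    # A version hosted in both places stays installable without crawling.
    return external - on_pypi


if __name__ == "__main__":
    sample = [
        ("1.0", "https://pypi.python.org/packages/source/f/foo/foo-1.0.tar.gz"),
        ("1.0", "http://example.com/downloads/foo-1.0.tar.gz"),
        ("1.1", "http://example.com/downloads/foo-1.1.tar.gz"),
    ]
    print(sorted(externally_only(sample)))
```

A project would count towards the 90% figure only if this function returned an empty set for it; which file types the finder recognises in the first place is a separate question and directly affects the result.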
Re: [Catalog-sig] hash tags
[Discussion about MD5]

I think there's not much point in discussing MD5 in this context. When creating new designs, you should always use the current best and most widely deployed algorithm, IMO. For Python, this is the SHA-2 family at the moment, since SHA-3 is not supported by Python's hashlib. MD5 is only needed to support older software. SHA-1 is also supported by Python versions older than Python 2.5.

It seems that SHA-256 and SHA-512, both from the SHA-2 family, are the most popular at the moment, so I guess SHA-256 is a good candidate to move forward and satisfy the 80/20 rule. Agreed ?

FWIW, I'm pretty sure SHA-256 will be broken 10 years from now as well :-)

-- Marc-Andre Lemburg, eGenix.com (Mar 09 2013)
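For reference, computing a SHA-256 digest of a release file with the standard library's hashlib takes only a few lines. This is a generic sketch; the helper name `sha256_of_file` and the chunk size are arbitrary choices, not part of any proposal in this thread.

```python
import hashlib


def sha256_of_file(path, chunk_size=64 * 1024):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so that large release archives do not
    have to fit into memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Swapping in another algorithm later would only require replacing `hashlib.sha256()` with, for example, `hashlib.new(name)`.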
Re: [Catalog-sig] hash tags
On 08.03.2013 13:15, Christian Heimes wrote:
On 08.03.2013 12:49, M.-A. Lemburg wrote:

Together with the added hash tag on the download file URLs (*), this would solve the availability and the security aspects. Instead of deprecating external links altogether, we could then deprecate non-compliant download links and get an overall very flexible system for Python package distribution.

(*) Yes, I know, I still have to deliver the updated proposal - been working on getting our indexes ready to serve as example :-)

How does your proposal look like?

Here's the first version with the basic idea:
http://wiki.python.org/moin/PyPI/DownloadMetaDataProposal

After the feedback I got from Holger and Phillip, I'm currently writing a new version, which drops some of the unneeded requirements and spells out a few more things. Here's a very short version...

Installers are modified:
* to only follow rel=download links from the /simple/ index page, which have a hash tag (e.g. #md5=...)
* will only use the fetched download page if its contents match the hash tag
* scan that page for rel=download links, which again have to have a hash tag to be taken into account
* only install files for which the hash tag matches the downloaded content

This should provide a good way to make sure that the downloaded files are indeed under control of the package maintainer.

So far the only practical problem I've found with the approach is that the download page may not contain dynamic data, e.g. a date or timestamp, since that causes the hash tag not to verify. The package maintainer will also have to reregister the package whenever changes to the download page are made - but that's actually intended :-)

I like to propose query string-like key/value pairs. key/value pairs are more flexible and allow us to add/remove new information in the future.

Good idea. I'll add that as extension mechanism.

I also propose that we add the file size in octets (bytes with 8bits in each byte) to the fragment identifier. File size validation prohibits e.g. length extension attacks. It is useful to download tools. I know that HTTP servers usually set a Content-Length header for static files. But the header is set by the CDN while the information in the fragment identifier shall come from PyPI's internal database. Example:
defusedxml-0.4.tar.gz#md5=09873c31ce773d48b8a4759571655a2c&sha1=33821e6891e3fc3829f5a238a93490f939533d62&octets=48324

Minor nit: s/octets/size

We could probably even add GPG sigs to the link.

The only problem with the extension mechanism is that the currently available installers only support #md5=

Perhaps there's some way to trick them into still working with the query-style fragment links ?!

--
Marc-Andre Lemburg
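The query-string-like fragment could be read along the following lines. This is only a sketch under stated assumptions: the pairs are taken to be joined with `&` as in a URL query string, the `size` key follows the s/octets/size nit, and `parse_hash_fragment` and `size_matches` are hypothetical helper names.

```python
from urllib.parse import parse_qsl, urldefrag


def parse_hash_fragment(url):
    """Split a download URL into (base_url, {key: value}) taken from its fragment.

    Hypothetical helper; assumes '&'-separated key=value pairs.
    """
    base, fragment = urldefrag(url)
    return base, dict(parse_qsl(fragment))


def size_matches(content, info):
    """Check the downloaded bytes against the advertised size, if one is given."""
    return "size" in info and len(content) == int(info["size"])


url = (
    "defusedxml-0.4.tar.gz"
    "#md5=09873c31ce773d48b8a4759571655a2c"
    "&sha1=33821e6891e3fc3829f5a238a93490f939533d62"
    "&size=48324"
)
base, info = parse_hash_fragment(url)
print(base)
print(sorted(info))
```

Because unknown keys simply end up in the dictionary, a tool written this way could ignore extensions it does not understand, which is the flexibility the key/value idea is after.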
Re: [Catalog-sig] hash tags
On 08.03.2013 14:09, Donald Stufft wrote:

Accidentally sent this to only MAL so resending!

On Mar 8, 2013, at 7:50 AM, M.-A. Lemburg m...@egenix.com wrote:
On 08.03.2013 13:15, Christian Heimes wrote:
On 08.03.2013 12:49, M.-A. Lemburg wrote:

Together with the added hash tag on the download file URLs (*), this would solve the availability and the security aspects. Instead of deprecating external links altogether, we could then deprecate non-compliant download links and get an overall very flexible system for Python package distribution.

(*) Yes, I know, I still have to deliver the updated proposal - been working on getting our indexes ready to serve as example :-)

How does your proposal look like?

Here's the first version with the basic idea:
http://wiki.python.org/moin/PyPI/DownloadMetaDataProposal

After the feedback I got from Holger and Phillip, I'm currently writing a new version, which drops some of the unneeded requirements and spells out a few more things. Here's a very short version...

Installers are modified:
* to only follow rel=download links from the /simple/ index page, which have a hash tag (e.g. #md5=…)

Sounds like a pretty serious break in backwards compat. Only 29 releases out of 144493 currently have a #md5= in their download_url. Either PyPI will be expected to download url and compute a hash (DoS vector, will need to be coded properly) which is error prone and is likely to break in non obvious ways for maintainers. While I'm obviously not against breaking backwards compatibility, I think if we're going to do that we might as well go whole hog and kill external links completely.

This was just the main new download theme. If the new scheme doesn't work, they should revert back to the old scheme, after a BIG warning to the user. Later on they could switch to requiring users to use an option to reenable the old scheme.

In any case, I'll have to put all this into proper words and will then post it for another review cycle.

* will only use the fetched download page if its contents match the hash tag
* scan that page for rel=download links, which again have to have a hash tag to be taken into account
* only install files for which the hash tag matches the downloaded content

This should provide a good way to make sure that the downloaded files are indeed under control of the package maintainer.

So far the only practical problem I've found with the approach is that the download page may not contain dynamic data, e.g. a date or timestamp, since that causes the hash tag not to verify. The package maintainer will also have to reregister the package whenever changes to the download page are made - but that's actually intended :-)

I like to propose query string-like key/value pairs. key/value pairs are more flexible and allow us to add/remove new information in the future.

Good idea. I'll add that as extension mechanism.

I also propose that we add the file size in octets (bytes with 8bits in each byte) to the fragment identifier. File size validation prohibits e.g. length extension attacks. It is useful to download tools. I know that HTTP servers usually set a Content-Length header for static files. But the header is set by the CDN while the information in the fragment identifier shall come from PyPI's internal database. Example:
defusedxml-0.4.tar.gz#md5=09873c31ce773d48b8a4759571655a2c&sha1=33821e6891e3fc3829f5a238a93490f939533d62&octets=48324

Minor nit: s/octets/size

We could probably even add GPG sigs to the link.

The only problem with the extension mechanism is that the currently available installers only support #md5=….

pip works just fine with any of the algorithms from hashlib. The installers all also support #egg=, and there might be some others I can't recall offhand.

Ah, good to know. Thanks.

Perhaps there's some way to trick them into still working with the query-style fragment links ?!
--
Marc-Andre Lemburg
Re: [Catalog-sig] hash tags
On 08.03.2013 13:50, M.-A. Lemburg wrote:
On 08.03.2013 13:15, Christian Heimes wrote:

I like to propose query string-like key/value pairs. key/value pairs are more flexible and allow us to add/remove new information in the future.

Good idea. I'll add that as extension mechanism.

I also propose that we add the file size in octets (bytes with 8bits in each byte) to the fragment identifier. File size validation prohibits e.g. length extension attacks. It is useful to download tools. I know that HTTP servers usually set a Content-Length header for static files. But the header is set by the CDN while the information in the fragment identifier shall come from PyPI's internal database. Example:
defusedxml-0.4.tar.gz#md5=09873c31ce773d48b8a4759571655a2c&sha1=33821e6891e3fc3829f5a238a93490f939533d62&octets=48324

Minor nit: s/octets/size

We could probably even add GPG sigs to the link.

The only problem with the extension mechanism is that the currently available installers only support #md5=

Perhaps there's some way to trick them into still working with the query-style fragment links ?!

Too bad... at least distribute/setuptools enforces this:

    def check_md5(self, cs, info, filename, tfp):
        if re.match('md5=[0-9a-f]{32}$', info):
            ...

If it weren't for that '$', we'd have no problem.

At least distribute currently doesn't check the download links from the /simple/ page at all, so we can use the extension mechanism there without breaking older versions of the tools.

--
Marc-Andre Lemburg
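The effect of the trailing '$' can be shown with a short sketch. The first pattern is the one quoted from distribute/setuptools above; the relaxed pattern and the `&`-joined extended fragment are assumptions made up for this illustration, not code from any installer.

```python
import re

# Pattern as quoted from distribute/setuptools in the message above.
LEGACY_PATTERN = re.compile(r'md5=[0-9a-f]{32}$')

# Hypothetical relaxed variant that also accepts a following '&key=value' part.
RELAXED_PATTERN = re.compile(r'md5=[0-9a-f]{32}(?:&|$)')

plain = "md5=09873c31ce773d48b8a4759571655a2c"
# Assumed extended form with '&'-separated extras after the md5 pair.
extended = plain + "&sha1=33821e6891e3fc3829f5a238a93490f939533d62&size=48324"

print(bool(LEGACY_PATTERN.match(plain)))      # True
print(bool(LEGACY_PATTERN.match(extended)))   # False: '$' rejects the trailing pairs
print(bool(RELAXED_PATTERN.match(extended)))  # True
```

So an older tool using the quoted check would treat an extended fragment as having no usable md5 tag at all, even when the md5 pair comes first.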
Re: [Catalog-sig] hash tags
On 08.03.2013 20:52, Noah Kantrowitz wrote:
On Mar 8, 2013, at 4:50 AM, M.-A. Lemburg wrote:
On 08.03.2013 13:15, Christian Heimes wrote:
On 08.03.2013 12:49, M.-A. Lemburg wrote:

Together with the added hash tag on the download file URLs (*), this would solve the availability and the security aspects. Instead of deprecating external links altogether, we could then deprecate non-compliant download links and get an overall very flexible system for Python package distribution.

(*) Yes, I know, I still have to deliver the updated proposal - been working on getting our indexes ready to serve as example :-)

How does your proposal look like?

Here's the first version with the basic idea:
http://wiki.python.org/moin/PyPI/DownloadMetaDataProposal

After the feedback I got from Holger and Phillip, I'm currently writing a new version, which drops some of the unneeded requirements and spells out a few more things. Here's a very short version...

Installers are modified:
* to only follow rel=download links from the /simple/ index page, which have a hash tag (e.g. #md5=...)
* will only use the fetched download page if its contents match the hash tag
* scan that page for rel=download links, which again have to have a hash tag to be taken into account
* only install files for which the hash tag matches the downloaded content

This should provide a good way to make sure that the downloaded files are indeed under control of the package maintainer.

MD5 is _not_ acceptable for anything security related and we shouldn't be adding anything that increases our dependence on it. MD5's only use in the packaging world is to make people who forget that TCP has its own checksums feel all warm and fuzzy that there hasn't been _accidental_ download corruption.

I was only using the existing md5 hash tags as example. Tools should migrate to support all hashlib algorithms (pip already does), so the hash tag can be e.g. #sha1=... or #sha256=... For Python 2.4 only md5 and sha1 would work, since it didn't come with a hashlib module.

With the extension mechanism Christian proposed, we can also add all sorts of other things as well, e.g. size indications, GPG key ID, GPG sigs, etc.

--
Marc-Andre Lemburg
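A sketch of what "support all hashlib algorithms" might look like in a tool: the algorithm name is taken from the tag and handed to `hashlib.new`. The helper name `matches_hash_tag` and the allow-list of algorithm names are assumptions for illustration; this is not how pip or setuptools actually implement it.

```python
import hashlib

# Hypothetical allow-list; restricting names avoids passing arbitrary
# strings from a URL into hashlib.new().
ALLOWED_ALGORITHMS = {"md5", "sha1", "sha224", "sha256", "sha384", "sha512"}


def matches_hash_tag(content, tag):
    """Check downloaded bytes against a tag such as 'sha256=<hexdigest>'."""
    name, sep, expected = tag.partition("=")
    if not sep or name not in ALLOWED_ALGORITHMS:
        return False
    actual = hashlib.new(name, content).hexdigest()
    return actual == expected.lower()


data = b"example package contents"
tag = "sha256=" + hashlib.sha256(data).hexdigest()
print(matches_hash_tag(data, tag))         # True
print(matches_hash_tag(data + b"!", tag))  # False
```

Unknown algorithm names are rejected rather than ignored, so a link with an unsupported tag is never treated as verified.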
Re: [Catalog-sig] hash tags
On 08.03.2013 20:16, PJ Eby wrote:
On Fri, Mar 8, 2013 at 7:50 AM, M.-A. Lemburg m...@egenix.com wrote:

After the feedback I got from Holger and Phillip, I'm currently writing a new version, which drops some of the unneeded requirements and spells out a few more things. Here's a very short version...

Installers are modified:
* to only follow rel=download links from the /simple/ index page, which have a hash tag (e.g. #md5=...)
* will only use the fetched download page if its contents match the hash tag
* scan that page for rel=download links, which again have to have a hash tag to be taken into account
* only install files for which the hash tag matches the downloaded content

This should provide a good way to make sure that the downloaded files are indeed under control of the package maintainer.

There is, as I said before, a MUCH simpler way to do this, that works right now: put direct #md5 download links in your description, and phase out the rel= attributes altogether.

No, that would be a pretty poor design :-)

The rel= attributes are good design, since they were meant for exactly this purpose (machine reading and understanding relations between origin and target).

--
Marc-Andre Lemburg
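The "only follow rel=download links which have a hash tag" step can be sketched with the standard library HTML parser. The class name `DownloadLinkCollector`, the sample page and the example.com URLs are made up for illustration; a real installer would also fetch the page and verify its hash before scanning it.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class DownloadLinkCollector(HTMLParser):
    """Collect hrefs of <a rel="download"> links that carry a key=value fragment."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").split()
        href = attr.get("href") or ""
        fragment = urldefrag(href).fragment
        # Keep only machine-readable download links that have a hash tag.
        if "download" in rels and "=" in fragment:
            self.links.append(href)


page = """
<a href="https://example.com/pkg-1.0.tar.gz#sha256=abc123" rel="download">ok</a>
<a href="https://example.com/pkg-1.0.zip" rel="download">no hash tag</a>
<a href="https://example.com/home#md5=abc" rel="homepage">not a download link</a>
"""
collector = DownloadLinkCollector()
collector.feed(page)
print(collector.links)
```

Links found in free-form description text never reach the list here, since only the rel attribute decides what counts as a download link.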
Re: [Catalog-sig] hash tags
On 08.03.2013 20:16, PJ Eby wrote:
On Fri, Mar 8, 2013 at 7:50 AM, M.-A. Lemburg m...@egenix.com wrote:

So far the only practical problem I've found with the approach is that the download page may not contain dynamic data, e.g. a date or timestamp, since that causes the hash tag not to verify.

Which is completely unnecessary if one simply exposes the *actual* download links directly on PyPI. The download page is redundant, in a couple different ways. First, since it can't change, there's no point in re-fetching it all the time. Second, since it's only going to be read by tools anyway, there's no point to it containing anything besides the link. So, since the page only contains links, might as well put the links straight on PyPI, or at most have an option/tool to load the links from an external source.

I don't follow you. We only have a single download_url field available to store a download link. We'd need to modify the meta data format to allow for more than one such field, which doesn't work if you want to stay backwards compatible.

BTW: If we go with the CDN caching model for external files, we'd pull the download page links directly on the /simple/ index page - as files, not external links.

--
Marc-Andre Lemburg
Re: [Catalog-sig] hash tags
On 08.03.2013 22:47, Donald Stufft wrote:
On Mar 8, 2013, at 4:45 PM, M.-A. Lemburg m...@egenix.com wrote:
On 08.03.2013 22:33, Donald Stufft wrote:
On Mar 8, 2013, at 4:28 PM, M.-A. Lemburg m...@egenix.com wrote:

BTW: If we go with the CDN caching model for external files, we'd pull the download page links directly on the /simple/ index page - as files, not external links.

We cannot download and rehost (even if we call it a cache) external files without getting permission from their owners to do so.

Well, in the CDN version of the /simple/ dir, they would look like files hosted on the CDN. The download pages would still be feeding the CDN, though.

I'm unsure what you're saying here. If it involves downloading files hosted outside of PyPI and putting it on a PSF controlled CDN it's a non starter.

My idea was to have PyPI send a redirect to the external URL when getting a request for the file, so we could avoid hosting the files and instead just have the CDN cache them for a certain time period.

However, I've now read up on the CloudFront docs, which point out that the CDN won't follow the redirect, but simply forward it to the user, bypassing the CDN:
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/RequestAndResponseBehaviorCustomOrigin.html#ResponseCustomRedirects

I suspect other CDNs to work in the same way, so the redirect idea doesn't work.

We'd have to use a proxy solution on the PyPI server to make the caching CDN work, but that will likely cause more legal problems than the plain caching of content on the way to the user.

--
Marc-Andre Lemburg
Re: [Catalog-sig] revoked certificate error on chrome from PyPI?
On 05.03.2013 12:10, Chris Withers wrote:
On 05/03/2013 11:09, Giovanni Bajo wrote:
On 05 Mar 2013, at 11:19, Chris Withers ch...@simplistix.co.uk wrote:
On 05/03/2013 10:18, Donald Stufft wrote:
On Tuesday, March 5, 2013 at 4:51 AM, Chris Withers wrote:

When I go to PyPI on an older Chrome, I get a certificate revoked error and can't view the site.

What version of Chrome? v25 sees http://d.stufft.io/image/1J3W01473s42

12.0.742.112.

Do you manage to see any specific error message? Can you attach a screenshot?

It's the standard "this certificate has been revoked" page from Chrome.

Hmm...

    wget http://crl.startssl.com/crt2-crl.crl
    openssl crl -inform DER -in crt2-crl.crl -text | fgrep 013A4D

doesn't return anything (013A4D is the PyPI cert serial).

A bug in Chrome ?

--
Marc-Andre Lemburg
Re: [Catalog-sig] Deprecate External Links
On 01.03.2013 10:02, Reinout van Rees wrote:
On 28-02-13 21:08, holger krekel wrote:

I have seen that position in this discussion (I have to upload 120 files per release, so I won't do that, for instance).

haven't seen that.

Marc-Andre Lemburg said this, which I took to mean 120 uploads per release:

However, taking our egenix-mx-base package as example, we have 120 distribution files for every single release. Uploading those to PyPI would not only take long, but also ...

Correct, with a total of over 100MB per release.

However, the above quote is slightly incorrect: I did not say I won't do that, just that there are issues with doing this:

* It currently takes too long uploading that many files to PyPI. This causes a problem, since in order to start the upload, we have to register the release on PyPI, which tools will then immediately find. However, during the upload time, they won't necessarily find the right files to download and then fail. The proposed pull mechanism (see http://wiki.python.org/moin/PyPI/DownloadMetaDataProposal) would work around this problem: tools would simply go to our servers in case they can't find the files on PyPI.

* PyPI doesn't allow us to upload two egg files with the same name: we have to provide egg files for UCS2 Python builds and UCS4 Python builds, since easy_install/setuptools/pip don't differentiate between the two variants. This is the main reason why we're hosting our own PyPI-style indexes, one for UCS2 and the other for UCS4 builds:
  https://downloads.egenix.com/python/index/ucs2/
  https://downloads.egenix.com/python/index/ucs4/

* I'm not sure whether we want to import our crypto packages to the US, so for a subset of the files, we'd probably continue to use our servers in Germany. Again, with the above proposal, this shouldn't be a problem.

* The PyPI terms are a bummer for us, but this can be fixed, I guess.

If we can resolve the issues, we'd have no problem having the files mirrored on PyPI.
--
Marc-Andre Lemburg
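For readers unfamiliar with the UCS2/UCS4 distinction mentioned in the list above, a running interpreter can report which variant it is via `sys.maxunicode`. The sketch below picks one of the two index URLs from the message accordingly; the function name `unicode_build` and the idea of selecting the index this way are assumptions for illustration, not something the thread proposes.

```python
import sys


def unicode_build():
    """Return 'ucs4' for wide builds and 'ucs2' for narrow builds.

    Note: Python 3.3 and later always report 0x10FFFF here, so this
    distinction only matters for older interpreters.
    """
    return "ucs4" if sys.maxunicode > 0xFFFF else "ucs2"


# The two PyPI-style indexes named in the message above.
INDEX_URLS = {
    "ucs2": "https://downloads.egenix.com/python/index/ucs2/",
    "ucs4": "https://downloads.egenix.com/python/index/ucs4/",
}

print(INDEX_URLS[unicode_build()])
```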
Re: [Catalog-sig] homepage/download metadata cleaning
On 01.03.2013 11:19, holger krekel wrote:

Hi Richard, all,

somewhere deep in the threads i mentioned i wrote a little cleanpypi.py script which takes a project name as an argument and then goes to pypi.python.org and removes all homepage/download metadata entries for this project. This sanitizes/speeds up installation because pip/easy_install don't need to crawl them anymore. I just did this for three of my projects, (pytest, tox and py) and it seems to work fine.

Does it also cleanup the links that PyPI adds to the /simple/ by parsing the project description for links ?

I think those are far nastier than the homepage and download links, which can be put to some good use to limit the external lookups (see http://wiki.python.org/moin/PyPI/DownloadMetaDataProposal)

See e.g. https://pypi.python.org/simple/zc.buildout/ for a good example of the mess this generates... even mailto links get listed and file:/// links open up the installers for all kinds of nasty things (unless they explicitly protect against following these).

Now before i release this as a tool, i wonder: Is it a good idea to remove download/homepage entries? Is there any current machine use (other than the dreaded crawling) for the homepage/download_url per-release metadata fields? For humans the homepage link is nicely discoverable if the long-description doesn't mention it prominently. But i think there also is a project url or bugtrack url for a project so maybe those could be used to reference these important pages? (i am a bit confused on the exact meaning of those urls, btw). Should we maybe stop advertising homepage and download_url and instead see to extend project-url/bugtrackurl to be used and shown nicely? The latter are independent of releases which i think makes sense - what use are old probably unreachable/borked homepages anyway. And it's also not too bad having to go once to pypi.python.org to set it, usually it seldomly changes.

I think it would be better to differentiate between showing the fields on the project pages, where they provide useful resources for people, and their use on the /simple/ index pages which are meant for programs to parse.

IMO, the homepage and download links on the project pages are indeed very useful for people.

On the /simple/ index a homepage link is probably not all that useful (provided a download link is set). The download links serve the purpose of directing tools to the right location, so those do belong on the /simple/ index listings.

I'd completely remove the links parsed from the descriptions, since those don't really provide a good basis for crawling (the description is meant for humans to parse, not programs).

--
Marc-Andre Lemburg
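As an illustration of "explicitly protect against following these", a tool could filter candidate links by URL scheme before crawling. This is a minimal sketch; the allow-list, the helper name `is_crawlable` and the sample links are assumptions, not the behaviour of any existing installer.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list: only plain web links are considered for crawling.
SAFE_SCHEMES = {"http", "https"}


def is_crawlable(url):
    """Return True only for links whose scheme is on the allow-list."""
    return urlsplit(url).scheme.lower() in SAFE_SCHEMES


links = [
    "https://example.com/pkg-1.0.tar.gz",
    "mailto:someone@example.com",
    "file:///etc/passwd",
    "HTTP://example.com/home",
]
print([link for link in links if is_crawlable(link)])
```

Relative links have an empty scheme and are rejected by this check, so a real tool would resolve them against the index page URL first.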
Re: [Catalog-sig] PyPI terms (was: Deprecate External Links)
On 01.03.2013 12:30, Jesse Noller wrote:

Marc Andre: I'm cc'ing Van: can you explain why the pypi terms are a bummer so we can see if there is actually an issue to be resolved or a matter of taste? We need to protect the foundation while preserving author rights - but I don't want one user / subset dictating how we evolve the technology.

I think we should move this discussion to the python-legal-sig list:
http://mail.python.org/mailman/listinfo/python-legal-sig

Let me know when you've subscribed and then we can hash things out on that list. The catalog sig is not really the suitable place for these discussions.

Jesse

On Mar 1, 2013, at 4:24 AM, M.-A. Lemburg m...@egenix.com wrote:
On 01.03.2013 10:02, Reinout van Rees wrote:
On 28-02-13 21:08, holger krekel wrote:

I have seen that position in this discussion (I have to upload 120 files per release, so I won't do that, for instance).

haven't seen that.

Marc-Andre Lemburg said this, which I took to mean 120 uploads per release:

However, taking our egenix-mx-base package as example, we have 120 distribution files for every single release. Uploading those to PyPI would not only take long, but also ...

Correct, with a total of over 100MB per release.

However, the above quote is slightly incorrect: I did not say I won't do that, just that there are issues with doing this:

* It currently takes too long uploading that many files to PyPI. This causes a problem, since in order to start the upload, we have to register the release on PyPI, which tools will then immediately find. However, during the upload time, they won't necessarily find the right files to download and then fail. The proposed pull mechanism (see http://wiki.python.org/moin/PyPI/DownloadMetaDataProposal) would work around this problem: tools would simply go to our servers in case they can't find the files on PyPI.

* PyPI doesn't allow us to upload two egg files with the same name: we have to provide egg files for UCS2 Python builds and UCS4 Python builds, since easy_install/setuptools/pip don't differentiate between the two variants. This is the main reason why we're hosting our own PyPI-style indexes, one for UCS2 and the other for UCS4 builds:
  https://downloads.egenix.com/python/index/ucs2/
  https://downloads.egenix.com/python/index/ucs4/

* I'm not sure whether we want to import our crypto packages to the US, so for a subset of the files, we'd probably continue to use our servers in Germany. Again, with the above proposal, this shouldn't be a problem.

* The PyPI terms are a bummer for us, but this can be fixed, I guess.

If we can resolve the issues, we'd have no problem having the files mirrored on PyPI.

--
Marc-Andre Lemburg
Re: [Catalog-sig] PyPI terms
On 01.03.2013 13:18, Jesse Noller wrote:

I am subscribed: I made the list. We're both board directors too. Changes to the tos should come from legal counsel, and the board

Van and all others who are interested as well ?

On Mar 1, 2013, at 6:47 AM, M.-A. Lemburg m...@egenix.com wrote:
On 01.03.2013 12:30, Jesse Noller wrote:

Marc Andre: I'm cc'ing Van: can you explain why the pypi terms are a bummer so we can see if there is actually an issue to be resolved or a matter of taste? We need to protect the foundation while preserving author rights - but I don't want one user / subset dictating how we evolve the technology.

I think we should move this discussion to the python-legal-sig list:
http://mail.python.org/mailman/listinfo/python-legal-sig

Let me know when you've subscribed and then we can hash things out on that list. The catalog sig is not really the suitable place for these discussions.

--
Marc-Andre Lemburg
Re: [Catalog-sig] PyPI terms
On 01.03.2013 12:47, M.-A. Lemburg wrote:
On 01.03.2013 12:30, Jesse Noller wrote:

Marc Andre: I'm cc'ing Van: can you explain why the pypi terms are a bummer so we can see if there is actually an issue to be resolved or a matter of taste? We need to protect the foundation while preserving author rights - but I don't want one user / subset dictating how we evolve the technology.

I think we should move this discussion to the python-legal-sig list:
http://mail.python.org/mailman/listinfo/python-legal-sig

Let me know when you've subscribed and then we can hash things out on that list. The catalog sig is not really the suitable place for these discussions.

I've kicked off the discussion on the other list. See you there.

--
Marc-Andre Lemburg
Re: [Catalog-sig] PyPI terms
On 01.03.2013 15:02, Jesse Noller wrote:
Okie doke. So we can move on to putting up the CDN and deprecating external links for now?
I don't think anyone is against putting up a CDN. It should meet the same security requirements we have for the pypi server itself, i.e. HTTPS all the way, proper certificates, operated by the PSF, perhaps run on a different domain, and whatever other goodies Donald can come up with ;-)
For the external links we need a migration path... that's in the works. See http://wiki.python.org/moin/PyPI/DownloadMetaDataProposal for a proposal that allows migrating away from relying on external hosts in a backwards compatible and secure way.
-- Marc-Andre Lemburg
Re: [Catalog-sig] [Python-legal-sig] PyPI terms
Hi Van, please read my long posting to the python-legal list. This explains the concerns and makes suggestions on how to improve things in a way that is compatible with what PyPI is and how it is used today: http://mail.python.org/pipermail/python-legal-sig/2013-March/00.html
PS: I'd prefer if you not cross-post to both lists and keep the discussion to the legal list.
Thanks, -- Marc-Andre Lemburg
Re: [Catalog-sig] homepage/download metadata cleaning
Thanks for the feedback, Holger and Phillip. I'll bake this into a version 0.2 of the proposal over the weekend.
On 01.03.2013 17:29, PJ Eby wrote: On Fri, Mar 1, 2013 at 6:17 AM, holger krekel hol...@merlinux.eu wrote: On Fri, Mar 01, 2013 at 06:09 -0500, Donald Stufft wrote: On Friday, March 1, 2013 at 6:04 AM, M.-A. Lemburg wrote: On 01.03.2013 11:19, holger krekel wrote:
Hi Richard, all, somewhere deep in the threads I mentioned I wrote a little cleanpypi.py script which takes a project name as an argument and then goes to pypi.python.org and removes all homepage/download metadata entries for this project. This sanitizes/speeds up installation because pip/easy_install don't need to crawl them anymore. I just did this for three of my projects (pytest, tox and py) and it seems to work fine.
Does it also clean up the links that PyPI adds to the /simple/ index by parsing the project description for links ? I think those are far nastier than the homepage and download links, which can be put to some good use to limit the external lookups (see http://wiki.python.org/moin/PyPI/DownloadMetaDataProposal). See e.g. https://pypi.python.org/simple/zc.buildout/ for a good example of the mess this generates... even mailto links get listed and file:/// links open up the installers for all kinds of nasty things (unless they explicitly protect against following these).
pip at least, and I assume the other tools, don't spider those links, but they do consider them for download (e.g. if the link looks installable it will be a candidate for installing, but it won't fetch it and look for more links like it will for download_url/home_page). I believe that's the way it's structured atm.
That's right. Even though the long-description extracted links look ugly on a simple/PKGNAME page, neither pip nor easy_install do anything with them except if the href ends in #egg=PKGNAME- in which case they are taken as pointing to a development tarball (e.g. at github or bitbucket). AFAIK a link like PKGNAME-VER.tar.gz will not be treated as an installation candidate, just the #egg=PKGNAME one.
Both are considered primary links. A primary link is a link whose filename portion matches one of the supported distutils or setuptools file formats, or is marked with an #egg tag. Primary links are indexed as to project name and version, so that if that version/format is chosen as the best candidate, it will be downloaded and installed. Links marked with rel=homepage or rel=download are secondary links. Secondary links are actively retrieved and scanned to look for more primary links. No further secondary links are scanned or followed. (Details of all of this can be found at: http://peak.telecommunity.com/DevCenter/setuptools#making-your-package-available-for-easyinstall )
This basically means that MAL's proposal for a download.html file is actually a bit moot: you can just stick direct primary download URLs in your PyPI description field, and the tools will pick them up. They can even include #md5 info. (See http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api - item 4 mentions the description part.) This means, by the way, that you could make an external link cleaner which spiders the external pages and pulls the candidates onto the description for that release, thereby keeping useful primary links and getting rid of the secondary links used to fetch them.
-- Marc-Andre Lemburg
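For the archives, the primary/secondary link rule PJ Eby describes above could be sketched roughly as follows. This is only an illustration, not the actual setuptools or pip code: the function name classify_link, the suffix list DIST_SUFFIXES and the "ignored" bucket for everything else are assumptions made for the example.

```python
from urllib.parse import urlparse

# Hypothetical, incomplete list of distribution file suffixes (assumption);
# the real tools recognise more formats than this.
DIST_SUFFIXES = (".tar.gz", ".tgz", ".tar.bz2", ".zip", ".egg", ".exe")


def classify_link(href, rel=None):
    """Classify a link found on a /simple/PKGNAME page.

    Returns "primary" for links that look like distribution files or carry
    an #egg= tag, "secondary" for rel=homepage / rel=download links (which
    would be fetched and scanned once for more primary links), and
    "ignored" for everything else (assumed label, e.g. mailto: links).
    """
    parsed = urlparse(href)
    filename = parsed.path.rsplit("/", 1)[-1].lower()
    if parsed.fragment.startswith("egg=") or filename.endswith(DIST_SUFFIXES):
        return "primary"
    if rel in ("homepage", "download"):
        return "secondary"
    return "ignored"
```

With this sketch, a description link such as https://example.com/pkg-1.0.tar.gz#md5=abc counts as primary even without a rel attribute, which is the point PJ Eby makes about putting direct download URLs into the description field.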
Re: [Catalog-sig] homepage/download metadata cleaning
On 01.03.2013 23:50, Lennart Regebro wrote: On Fri, Mar 1, 2013 at 8:31 PM, M.-A. Lemburg m...@egenix.com wrote:
Hmm, then why not remove links that don't match the above from the /simple/ index pages ?
I think we can do that, but if we *start* with that, we will just suddenly, with no warning, break everything. It's better if the installation tools can first warn, then remove their support for this, and *then* we remove these links from /simple/. That way we break things gradually, with warnings so that package managers can react and adapt.
As I understood Holger and Phillip, those links are not used by the existing package managers. If there are no users, then nothing should break, right ? Of course, breaking things is a bad idea and I don't want to push for that (migration is much better), I just wondered whether this would be a low hanging fruit to clean up the /simple/ index pages a bit. Is there a tool that scans those non-distribution file links from the package descriptions ?
-- Marc-Andre Lemburg
Re: [Catalog-sig] Next generation package infrastructure (was: Deprecate External Links)
On 28.02.2013 09:43, Nick Coghlan wrote: On Thu, Feb 28, 2013 at 6:12 PM, M.-A. Lemburg m...@egenix.com wrote: On 28.02.2013 07:39, Nick Coghlan wrote:
1. The next generation metadata infrastructure will NOT support external hosting of files indexed on PyPI - if you don't upload the archive files to PyPI, they won't be included in the next generation metadata. If you want external hosting, you will need to run a separate index (this is similar to the yum model - you can host files wherever you want, but you need to run yum createrepo yourself to generate the metadata, and instruct users on how to get their installers to retrieve your metadata. The big difference between PyPI and the yum model is that the default index still won't be curated at all, so there's no review process to get through if you want to use it, thus less need for external hosting).
Could you elaborate on this ? AFAIK, the metadata only works on package names, regardless of where an installer finds them.
Caveat: this is NOT a final design, and people that aren't me will be working out the exact details. It is, however, how I want it to work. :-) The next generation metadata publication infrastructure is likely to be based on TUF, and thus will consist of pregenerated, signed metadata served as static files. Installers will just download metadata files, sdists and wheels (and probably eggs and tarballs), and never need to contact an active web service. The only active web service technically required will be one to regularly refresh the signed timestamp file that prevents certain kinds of attacks based on providing old, insecure versions of software (a cron job running on the server hosting the metadata would suffice for this task). PyPI itself will have another active service to automatically regenerate the metadata when files are uploaded by maintainers.
The delegation of trust within the framework will be defined only for files hosted by PyPI - it will not be extended to allow the declaration of external URLs as a source for the target files. Publishers will still be able to publish on external sites, but they will need to generate their own metadata, and distributions published that way won't be indexed in the next generation metadata on PyPI. This is the same way yum repos work - the metadata for each repo only covers SRPMs and RPMs hosted in that repo. If you want to download software from somewhere else, you have to add another repo definition in the client so it knows where to look for the metadata. APT works in a similar fashion.
Thanks for the added details. This sounds like a major overhaul of the whole package infrastructure you have in mind there :-) In order for this to work out, you will need to get the support of people hosting packages externally and address their concerns. The current discussion has been too dogmatic for my taste. A more pragmatic approach would likely be a more reasonable and successful way to achieve a transition.
Just as an aside: The reason why apt, yum, zypper, etc. have different repos is that those tools work based on trust which is achieved by reviews of the packages in those repos. As a user you trust the review process and want to be sure that you are getting the packages that have actually been reviewed. With PyPI, the situation is different, since we don't have a review process for packages. The only guarantee we can provide is that the packages that you download via tools from PyPI or elsewhere match the ones that the authors of the packages created. This requires more trust in the authors, so we better make sure that they are happy with it ;-)
-- Marc-Andre Lemburg
Re: [Catalog-sig] PyPI limitations (was: Deprecate External Links)
On 27.02.2013 19:11, Noah Kantrowitz wrote: On Feb 27, 2013, at 9:28 AM, M.-A. Lemburg wrote: On 27.02.2013 18:05, Noah Kantrowitz wrote: M.-A. Lemburg m...@egenix.com wrote:
I propose we deprecate the external links that PyPI has published on the /simple/ indexes which exist because of the history of PyPI. Ideally in some number of months (1? 2?) we would turn off adding these links from new releases, leaving the existing ones intact and then a few months later the existing links be removed completely.
-1. There are many reasons for not hosting packages and distributions on PyPI itself.
[citation needed]
We've been through this discussion a couple of times in the past. I'm sure the reasons will get listed again in this discussion :-) Too many distribution files for PyPI to handle,
Again, please point at a specific package. I wasn't aware that PyPI limited uploads at all, but if it does we can certainly increase the number if there is a good reason.
PyPI limits the size of the distribution files (at 40MB), but it doesn't limit the number of distribution files. However, taking our egenix-mx-base package as example, we have 120 distribution files for every single release. Uploading those to PyPI would not only take long, but also quickly get the PyPI storage requirements up to a few TB if just a few package authors start to do the same.
no support for UCS2/UCS4 binary distributions, unsupported distribution file formats (e.g. our prebuilt format),
Not sure why PyPI would even care what charset the package files use, but if true that's certainly a bug and we can get that fixed. What file formats do pip/buildout support that PyPI doesn't support for uploads?
Not the charset of the package files :-) I'm talking about binary files for Python UCS2 vs. UCS4 builds. You have to ship both variants for Unix platforms.
Regarding file formats: PyPI applies a number of checks for the supported file formats which not only check the extension, but also look inside the files to only accept a certain number of formats. See https://bitbucket.org/loewis/pypi/src/9863fa859e4b/verify_filetype.py?at=default for details. I was under the impression that this would filter out our prebuilt format, but I just tried an upload and it does seem to pass the tests, so I have to correct the above - our prebuilt format is supported by PyPI (hey, one problem less to worry about ;-)).
About the prebuilt format: We created the prebuilt binary package format a while ago to overcome issues with eggs not being flexible enough and not carrying enough information to differentiate between e.g. UCS2/UCS4 builds of Python or properly identifying platforms. The format works with easy_install and pip, because the interface is the same as for sdist files: you unzip the archive, run python setup.py ...commands... and you're done.
-- Marc-Andre Lemburg
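To illustrate the UCS2/UCS4 distinction mentioned above: on Python 2 (and Python 3 before 3.3) the interpreter reports which variant it is via sys.maxunicode, which is 0xFFFF on narrow (UCS2) builds and 0x10FFFF on wide (UCS4) builds. A minimal sketch of how a packaging tool might derive a tag from that follows; the function name unicode_build_tag and the tag strings are made up for the example and are not taken from the prebuilt format.

```python
import sys


def unicode_build_tag(maxunicode=sys.maxunicode):
    """Return a hypothetical "ucs2"/"ucs4" tag for a Python build.

    Narrow builds report sys.maxunicode == 0xFFFF; wide builds (and every
    Python from 3.3 on) report 0x10FFFF.
    """
    return "ucs2" if maxunicode == 0xFFFF else "ucs4"
```

A binary distribution built against one variant cannot be loaded by the other on those older Python versions, which is why both variants have to be shipped for Unix platforms.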
Re: [Catalog-sig] PyPI terms (was: Deprecate External Links)
On 27.02.2013 19:11, Noah Kantrowitz wrote: On Feb 27, 2013, at 9:28 AM, M.-A. Lemburg wrote:
[reasons for not hosting distribution files on PyPI] * giving up control
This is the point of running a package server, the author gives up control over distribution in order to reap the benefits of solid infrastructure and discoverability. This is a feature.
Please see below.
The legal restrictions on code on pypi itself is nothing more than needed to let people actually install things, which is kind of the point of listing on pypi. If someone really wants their own universe, run a package server yourself. What other reasons are there? Agreeing to an extra license would block pip anyway, so no loss there. Huge package files maybe?
That's not quite true: http://www.python.org/about/legal/ ... third party content providers grant the PSF and all other users of the web site an irrevocable, worldwide, royalty-free, nonexclusive license to reproduce, distribute, transmit, display, perform, and publish such content, including in digital form. Once you upload the files to PyPI, you completely give up control, because that license is irrevocable. This goes far beyond what the Python license does: http://docs.python.org/2/license.html Changing the PyPI terms to be more author-friendly is on my agenda, but I haven't found the time for that particular discussion yet ;-)
You are comparing an artifact distribution requirement with a source code license. PyPI's terms don't say a thing about source code or anything else, just that if you want a package file to be installable, we need to be able to send it to people. There is nothing even remotely author unfriendly here, it is just common sense. Again, PyPI is _not_ the only way to publish packages, we are allowed to expect interoperability from people that choose to participate in our community.
The distribution files are the content the license is talking about, just as the Python distribution files are distributed under the Python license, so those two are really addressing the same thing. Unlike the PyPI terms, the Python license does not grant an irrevocable license on the content. It even comes with a termination clause, which explicitly says that the license is revoked in case of a breach. The PyPI terms, OTOH, do not allow revoking the license to distribute the files. This wouldn't be a problem for PyPI itself, since we'd, of course, help the author to get the files removed. However, the PyPI terms go beyond this in giving all other users of the website those same irrevocable rights... which is a pretty large crowd to ping in case of problems and ask nicely to take down the files. What makes this worse for the author is that they are not required to comply per the current PyPI terms. This is what I meant with giving up control.
Removing the "irrevocable" in the PyPI terms would already go a long way to make the terms more author-friendly, but this will have to be hashed out with our legal counsel. One of the reasons I had started the CloudPyPI project was to address this aspect: having the whole mirror infrastructure under PSF control would resolve the above issues, since we could then remove the "all other users..." part of the terms altogether.
BTW: I've never seen a hosting website require agreeing to giving users of the website the same distribution rights as the owner of the website.
-- Marc-Andre Lemburg
Re: [Catalog-sig] Pypi cdn for hosted packages
On 28.02.2013 13:43, Jesse Noller wrote:
Can we please actually look at the free offers we are being given versus paying for something for once
Sure. This is just for testing.
-- Marc-Andre Lemburg
Re: [Catalog-sig] Pypi cdn for hosted packages
On 28.02.2013 13:56, Donald Stufft wrote:
The non /simple/ pages for either of these won't work since PyPI will redirect to https://pypi.python.org/ FWIW.
I've fixed this for CloudFront: https://d1t66zoqn9vlte.cloudfront.net/ and https://d1t66zoqn9vlte.cloudfront.net/pypi both let you see the PyPI front page using the CDN. The package links all have pypi.python.org hardcoded, though, so they don't work on the CDN.
On Thursday, February 28, 2013 at 7:53 AM, Donald Stufft wrote: On Thursday, February 28, 2013 at 7:49 AM, M.-A. Lemburg wrote:
There you go: https://d1t66zoqn9vlte.cloudfront.net/simple/
Same thing on Fastly http://pypi.python.org.a.prod.fastly.net/simple/ Easy :)
-- Marc-Andre Lemburg
Re: [Catalog-sig] Pypi cdn for hosted packages
On 28.02.2013 15:02, M.-A. Lemburg wrote: On 28.02.2013 14:37, Giovanni Bajo wrote: On 28 Feb 2013, at 13:53, Donald Stufft donald.stu...@gmail.com wrote: On Thursday, February 28, 2013 at 7:49 AM, M.-A. Lemburg wrote:
There you go: https://d1t66zoqn9vlte.cloudfront.net/simple/
Same thing on Fastly http://pypi.python.org.a.prod.fastly.net/simple/ Easy :)
Given that /simple doesn't expose HTTP cache headers, you're basically not using the CDN for that URL:
    $ http HEAD https://pypi.python.org/simple/
    HTTP/1.1 200 OK
    Content-Length: 240286
    Content-Type: text/html; charset=utf-8
    Content-encoding: gzip
    Date: Thu, 28 Feb 2013 13:04:26 GMT
    Server: nginx/1.1.19
    Strict-Transport-Security: max-age=86400
So it's not that easy. Plus, neither of them are SSL.
CloudFront uses SSL: https://d1t66zoqn9vlte.cloudfront.net/simple/ and the default cache retention is 24h if no cache headers are provided. These are the headers provided by CloudFront, first request:
    200 OK
    Content-Type: text/html; charset=utf-8
    Content-Length: 1323539
    Connection: keep-alive
    Server: nginx/1.1.19
    Date: Thu, 28 Feb 2013 13:56:22 GMT
    Strict-Transport-Security: max-age=86400
    X-Amz-Cf-Id: vXAxMoustlCxyzFAVjjg3EUJG5OgP-ALefiF1mbvbJlW9ZsHCxtdLg==
    Via: 1.0 3dee24f419c49cc32df542a9410fda87.cloudfront.net (CloudFront)
    X-Cache: Miss from cloudfront
Second request:
    200 OK
    Content-Type: text/html; charset=utf-8
    Content-Length: 1323539
    Connection: keep-alive
    Server: nginx/1.1.19
    Date: Thu, 28 Feb 2013 13:56:22 GMT
    Strict-Transport-Security: max-age=86400
    Age: 337
    X-Amz-Cf-Id: -2COLjgkKLDF83jrr0iFahyAO4UGOMB0hXNM_ROMFJQpII1goFyi-A==
    Via: 1.0 3dee24f419c49cc32df542a9410fda87.cloudfront.net (CloudFront)
    X-Cache: Hit from cloudfront
I don't see this need to rush. Let's not redo the mistake of the HTTP redirect that broke everything. Even if it takes two weeks with proper testing and integration with the CDN provider, I think it's more than fast enough.
The above setups are for testing, and not meant for deployment.
Just for the archives: I've now set up CF to not provide content via HTTP, use HTTPS for fetching data from PyPI, use pypi as root object, and use a default retention of 3600 seconds (better for testing; for deployment, a higher value may be better - this can be customized on a per path basis). The changes should be visible by now.
-- Marc-Andre Lemburg
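For anyone who wants to repeat the hit/miss check shown in the headers above, here is a minimal sketch that summarises such response headers. It only looks at the X-Cache, Age and Cache-Control headers that appear in this thread; the function name cdn_cache_status, the returned keys and the default_ttl parameter (set to the 3600 seconds mentioned above) are assumptions for the example, and the remaining lifetime is only a rough estimate.

```python
def cdn_cache_status(headers, default_ttl=3600):
    """Summarise CloudFront-style cache headers from one HTTP response.

    `headers` is any mapping of header names to values. `default_ttl` is
    the CDN's default retention in seconds, used only when the origin sends
    no Cache-Control header (assumption: 3600, as in the test setup).
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    age = normalized.get("age")
    age_seconds = int(age) if age is not None else None
    has_cache_control = "cache-control" in normalized
    expires_in = None
    if age_seconds is not None and not has_cache_control:
        # Rough estimate of how long the cached copy will still be served.
        expires_in = max(default_ttl - age_seconds, 0)
    return {
        "hit": normalized.get("x-cache", "").lower().startswith("hit"),
        "age_seconds": age_seconds,
        "has_cache_control": has_cache_control,
        "expires_in": expires_in,
    }
```

The mapping can come from any HTTP client, for example the headers of a HEAD request made with urllib.request from the standard library.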
Re: [Catalog-sig] remove historic download/homepage links for a project
On 28.02.2013 17:27, Ronald Oussoren wrote: On 28 Feb, 2013, at 14:41, holger krekel hol...@merlinux.eu wrote:
That's the #2 thing I hate about some packages: removed releases that I faithfully pinned in my buildout (or requirements.txt). Removing releases is, imho, irresponsible.
it's bad, yes. But necessary to have.
Or am I the only one that accidentally released a version that had serious bugs?
Nope :-) I think removing such releases after an announcement and some time to have people upgrade their pins is a good way to send out wake-up calls to those who continue using the package in their buildouts or other setups. Removing the packages just because you can is not a good idea, of course. I guess a warning to educate package owners might help prevent this.
-- Marc-Andre Lemburg
Re: [Catalog-sig] Migrating away from scanning home pages
I've added the proposal to the wiki to keep collecting comments and updates: http://wiki.python.org/moin/PyPI/DownloadMetaDataProposal
On 28.02.2013 12:55, M.-A. Lemburg wrote: On 28.02.2013 12:45, Donald Stufft wrote: On Thursday, February 28, 2013 at 5:55 AM, M.-A. Lemburg wrote:
I think we all agree that scanning arbitrary HTML pages for download links is not a good idea and we need to transition away from this towards a more reliable system. Here's an approach that would work to start the transition while not breaking old tools (sketching here to describe the basic idea):
Limiting scans to download_url
------------------------------
Installers and similar tools preferably no longer scan all the links on the /simple/ index, but instead only look at the download links (which can be defined in the package meta data) for packages that don't host files on PyPI.
Going only one level deep
-------------------------
If the download links point to a meta-file named packagename-version-downloads.html#sha256-hashvalue, the installers download that file, check whether the hash value matches and if it does, scan the file in the same way they would parse the /simple/ index page of the package - think of the downloads.html file as a symlink to extend the search to an external location, but in a predefined and safe way.
Comments
--------
* The creation of the downloads.html file is left to the package owner (we could have a tool to easily create it).
* Since the file would use the same format as the PyPI /simple/ index directory listing, installers would be able to verify the embedded hash values (and later GPG signatures) just as they do for files hosted directly on PyPI.
* The URL of the downloads.html file, together with the hash fragment, would be placed into the setup.py download_url variable. This is supported by all recent and not so recent Python versions.
* No changes to older Python versions of distutils are necessary to make this work, since the download_url field is a free form field.
* No changes to existing distutils meta data formats are necessary, since the download_url field has always been meant for download URLs.
* Installers would not need to learn about a new meta data format, because they already know how to parse PyPI style index listings.
* Installers would prefer the above approach for downloads, and warn users if they have to revert back to the old method of scanning all links.
* Installers could impose extra security requirements, such as only following HTTPS links and verifying all certificates.
* In a later phase of the transition we could have PyPI cache the referenced distribution files locally to improve reliability. This would turn the push strategy for uploading files to PyPI into a pull strategy for those packages and make things a lot easier to handle for package maintainers.
I don't have time to respond to the rest right now, but this isn't doable I don't think. The purpose of that legalese you pointed out is to make it possible for PyPI to serve those files legally. We don't know if those files are something PyPI is allowed to distribute so PyPI can't cache them.
Thanks for the note. The legalese could be adapted to make this work (if needed) or we could add a flag to the download.html file which makes the choice explicit on a per package basis - the latter might be the better option to address packages that are subject to export control or other restrictions.
-- Marc-Andre Lemburg
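As a rough illustration of the "one level deep" step in the proposal above, an installer could split the download_url into the meta-file URL and the expected hash, fetch the file, and only parse it if the digest matches. The sketch below covers just the splitting and checking; the function names, the set of accepted algorithms and the example URL are assumptions and not part of the proposal text.

```python
import hashlib
import hmac
from urllib.parse import urldefrag

# Assumption: only these algorithms are accepted in the URL fragment.
ALLOWED_ALGORITHMS = {"sha256", "sha512"}


def split_download_url(download_url):
    """Split '<url>#sha256-<hexdigest>' into (url, algorithm, digest)."""
    url, fragment = urldefrag(download_url)
    algorithm, sep, digest = fragment.partition("-")
    if not sep or algorithm not in ALLOWED_ALGORITHMS or not digest:
        raise ValueError("missing or unsupported hash fragment: %r" % fragment)
    return url, algorithm, digest.lower()


def meta_file_matches(content, algorithm, expected_digest):
    """Return True if the downloaded meta-file bytes match the digest."""
    actual = hashlib.new(algorithm, content).hexdigest()
    # Constant-time comparison; both values are ASCII hex strings.
    return hmac.compare_digest(actual, expected_digest)
```

An installer following the proposal would scan the downloads.html content like a /simple/ index page only when meta_file_matches() returns True, and otherwise fall back (with a warning) or abort.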
Re: [Catalog-sig] Pypi cdn for hosted packages
I've created a wiki page with the CloudFront setup description: http://wiki.python.org/moin/CloudPyPI/ExampleCDN
-- Marc-Andre Lemburg
Re: [Catalog-sig] PyPI terms
On 28.02.2013 18:44, Noah Kantrowitz wrote: On Feb 28, 2013, at 2:22 AM, M.-A. Lemburg wrote:

BTW: I've never seen a hosting website require agreeing to giving users of the website the same distribution rights as the owner of the website.

You should read terms of service more closely then, this is standard because of how lawyers interpret the general foundation of the internet. Because we cannot promise private caches and such will _ever_ delete something just because it is removed from PyPI we need that bit of legal protection. None of us are lawyers to the best of my knowledge so this is not the right place to discuss such things. If our counsel says that requirement isn't needed, we will remove it, otherwise we won't.

Then please point me to a hosting license that requires this :-) I've looked around quite a bit and couldn't find any... That said, you're right in that this list is not the right place for such a discussion. I just wanted to explain what I meant with giving up control.

-- Marc-Andre Lemburg
Re: [Catalog-sig] Pypi cdn for hosted packages
On 28.02.2013 18:25, Noah Kantrowitz wrote: You can go ahead and shut this down please, as I said our CDN partner has already been selected.

I know. Again: this is for testing a CDN setup with installers, mirrors, etc. It is not meant as permanent solution and will get shut down again, after the real thing is live or I run out of budget for this (whichever comes first ;-)).

On Feb 28, 2013, at 9:19 AM, M.-A. Lemburg wrote: I've created a wiki page with the CloudFront setup description: http://wiki.python.org/moin/CloudPyPI/ExampleCDN

-- Marc-Andre Lemburg
Re: [Catalog-sig] Deprecate External Links
On 27.02.2013 16:26, Donald Stufft wrote:

PyPI is now being served with a valid SSL certificate, and the tooling has begun to incorporate SSL verification of PyPI into the process. This is _excellent_ and the parties involved should all be thanked. However there is still another massive area of insecurity within the packaging tool chain. For those who don't know, when you attempt to install a particular package a number of urls are visited. The steps look roughly something like this:

1. Visit http://pypi.python.org/simple/Package/ and attempt to collect any links that look installable (tarballs, #egg=, etc). (Note: /simple/Package/ contains download_url, home_page, and any link that is contained in the long_description.)
2. Visit any link referenced as home_page and attempt to collect any links that look installable.
3. Visit any link referenced in dependency_links and attempt to collect any links that look installable.
4. Take all of the collected links and determine which one best matches the requirement spec given and download it.
5. Rinse and repeat for every dependency in the requirement set.

I propose we deprecate the external links that PyPI has published on the /simple/ indexes which exist because of the history of PyPI. Ideally in some number of months (1? 2?) we would turn off adding these links from new releases, leaving the existing ones intact and then a few months later the existing links be removed completely.

-1. There are many reasons for not hosting packages and distributions on PyPI itself. If you use and trust a package, you also have to know and trust its dependencies, no matter where they are hosted, so you're not gaining any security by disabling links to other download locations: if you don't trust the way a package is hosted, you don't use it; if you do, then removing the link breaks the package and all its dependencies.
Instead of suggesting the removal of support for externally hosted packages, why not propose a mechanism to provide a more direct/secure way to reference them?

-- Marc-Andre Lemburg
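To make the link-collection step in this thread concrete, here is a minimal sketch of how an installer might gather candidate links from a /simple/ style page and optionally keep only HTTPS ones. The suffix list, the function names and the https_only switch are assumptions made for illustration; this is not the logic of any particular installer.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed heuristic for "looks installable"; real installers apply their own rules.
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".zip", ".egg")


class LinkCollector(HTMLParser):
    """Collect the absolute target of every <a href=...> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page they were found on.
            self.links.append(urljoin(self.base_url, href))


def looks_installable(url):
    """True for archive-like paths or links carrying an #egg= fragment."""
    parsed = urlparse(url)
    return parsed.path.endswith(ARCHIVE_SUFFIXES) or parsed.fragment.startswith("egg=")


def candidate_links(index_html, base_url, https_only=False):
    """Return installable-looking links, optionally restricted to HTTPS."""
    collector = LinkCollector(base_url)
    collector.feed(index_html)
    found = [url for url in collector.links if looks_installable(url)]
    if https_only:
        found = [url for url in found if urlparse(url).scheme == "https"]
    return found
```

With https_only=True the sketch drops every plain-HTTP candidate, which is one way to read the "more direct/secure way to reference them" idea without removing external links altogether.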
Re: [Catalog-sig] Deprecate External Links
On 27.02.2013 17:43, Donald Stufft wrote: On Wednesday, February 27, 2013 at 11:34 AM, M.-A. Lemburg wrote: On 27.02.2013 16:42, Donald Stufft wrote: On Wednesday, February 27, 2013 at 10:39 AM, M.-A. Lemburg wrote: -1. There are many reasons for not hosting packages and distributions on PyPI itself. If you use and trust a package, you also have to know and trust its dependencies, no matter where they are hosted, so you're not gaining any security by disabling links to other download locations: if you don't trust the way a package is hosted, you don't use it; if you do, then removing the link breaks the package and all its dependencies. You also have to know and trust the hosting locations for all of them, and if they are not available via SSL you have to know and trust that there is not a MITM available. Right. I'm not saying that it's not a good idea to host packages on PyPI, but forcing the community into doing this is not a good idea. Instead of suggesting to removing support for externally hosted packages, why not propose a mechanism to provide a more direct/secure way to reference them ? I did mention a method for doing that in my email. However there are reasons beyond the security ones to require packages being hosted on PyPI. Namely uptime, privacy, and performance. Your proposed uploading of hash values would require listing all distribution files for each release somehow. I don't see how you'd get that to work with older Python versions. 1. It is difficult to secure the process of spidering external links for download. 1a. The only way I can think offhand is by requiring uploading a hash of the expected files to PyPI along with the download link and removing all urls except for the download_url. This has the effect that only 1 file can be associated with a particular release. Uptime and performance have in the past been one of the reasons why people chose not to upload files to PyPI. This could be changed, of course. I don't see how. 
If PyPI goes down then the packaging tools cannot query /simple/foo/ to see the external links. Adding in additional SPOF's only harms uptime, there is no possible way for it to increase it.

Package installers only need access to the static files in the /simple/ index. Those can be put behind a CDN to increase uptime. PyPI itself doesn't have to be up and running if you just want to download the files (unfortunately, that's not true at the moment, because the /simple/ index is dynamically generated, but that can be changed). See http://wiki.python.org/moin/CloudPyPI for details.

Another reason for not uploading files to PyPI are the license terms you have to agree to on PyPI and the fact that you can no longer control where your distribution files are made available by agreeing to them. This could be changed as well, but we'd need to add more legalese to the PyPI mirror setup for this to work... not sure whether people providing the mirrors would like this.

The legalese doesn't particularly give any more rights than any free/OSS license does. There's not a requirement currently that packages on PyPI be free/OSS but this change would only actually affect people who want to upload non free code to PyPI.

It does affect any package author, regardless of the license. Some examples:

* you may be forced to remove a distribution from the net (think DMCA, patents, trademarks, etc)
* the distribution may contain a serious bug that you don't want to spread
* you may want to keep more accurate statistics of the reach of your project

Security can be had by having installers check the GPG signatures of distribution files. You don't need to trust the download site for that.

GPG signatures are good, we don't have them yet. And when we do it's only 1 layer of defense, not the final solution.

Sure, you still have to trust the author :-)

I'm not sure what you meant with privacy in this context.
If I download something from server there is a certain amount of information that by nature of HTTP and networking gets leaked to that host. Additionally if it's done via non TLS connections it also gets leaked to anyone who has a MITM on my connection. This is especially important in countries where the government actively surveils or modifies the traffic of their citizens. I can see an issue with e.g. trying to download code that is illegal to use in a country (e.g. crypto code, exploits, hacks, etc.), but the country officials would probably just block the complete PyPI site than bother with filtering single requests. IMO, that's beyond the scope of what we're discussing here, though. Something that would work even with older Python versions is letting the download URL point to a meta-file which contains the links to the other distribution files. That way you avoid having the installers trying to parse arbitrary websites and you can add more security to the downloads
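The hash-upload idea discussed in this thread can be sketched as follows: the index (or a meta-file) carries the expected digest next to each link, and the installer checks the downloaded bytes before using them. The "#algorithm=hexdigest" fragment convention and the helper names are assumptions for illustration, and the choice of algorithm is left to the caller.

```python
import hashlib
import hmac
from urllib.parse import urlparse


def expected_hash(url):
    """Split an assumed '#<algorithm>=<hexdigest>' fragment off a link.

    Returns (algorithm, hexdigest), or None if the link carries no usable hash.
    """
    name, sep, value = urlparse(url).fragment.partition("=")
    if sep and name in hashlib.algorithms_guaranteed and value:
        return name, value.lower()
    return None


def verify_download(data, algorithm, hexdigest):
    """Check downloaded bytes against the digest published alongside the link."""
    actual = hashlib.new(algorithm, data).hexdigest()
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(actual, hexdigest.lower())
```

A check like this only ties the file to whatever the index published; it does not replace signatures, since anyone who can alter the index can alter the digest as well.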
Re: [Catalog-sig] User profile: PGP Key ID
On 20.02.2013 21:03, Donald Stufft wrote: On Wednesday, February 20, 2013 at 3:02 PM, Daniel Holth wrote:

You know how to do S/MIME; how much harder would it be to use X.509 signatures as are supported with openssl and bundled GUI cert managers on all OSs?

Signing tech doesn't really matter. I suspect societal and possibly legal requirements will make that choice over technical reasons.

Relying only on OpenSSL would have the great advantage of being able to do all the verification/signing/key generation in Python. But it's missing an infrastructure to revoke keys, unless you also implement SSL key revocation mechanisms and have users get official paid/free SSL client certificates from certificate vendors that provide CRLs or support OTRS. At that point, the SSL infrastructure becomes just as difficult to deal with as GPG/PGP, so there isn't much to win either way, IMO. You just have to deal with it...

-- Marc-Andre Lemburg
Re: [Catalog-sig] User profile: PGP Key ID
On 20.02.2013 21:12, M.-A. Lemburg wrote: On 20.02.2013 21:03, Donald Stufft wrote: On Wednesday, February 20, 2013 at 3:02 PM, Daniel Holth wrote:

You know how to do S/MIME; how much harder would it be to use X.509 signatures as are supported with openssl and bundled GUI cert managers on all OSs?

Signing tech doesn't really matter. I suspect societal and possibly legal requirements will make that choice over technical reasons.

Relying only on OpenSSL would have the great advantage of being able to do all the verification/signing/key generation in Python. But it's missing an infrastructure to revoke keys, unless you also implement SSL key revocation mechanisms and have users get official paid/free SSL client certificates from certificate vendors that provide CRLs or support OTRS.

Sorry, s/OTRS/OCSP/ .. though using a ticket system for revocations doesn't sound all that strange either :-)

-- Marc-Andre Lemburg
Re: [Catalog-sig] User profile: PGP Key ID
On 20.02.2013 21:18, Christian Heimes wrote: On 20.02.2013 21:12, M.-A. Lemburg wrote: On 20.02.2013 21:03, Donald Stufft wrote: On Wednesday, February 20, 2013 at 3:02 PM, Daniel Holth wrote:

You know how to do S/MIME; how much harder would it be to use X.509 signatures as are supported with openssl and bundled GUI cert managers on all OSs?

Signing tech doesn't really matter. I suspect societal and possibly legal requirements will make that choice over technical reasons.

Relying only on OpenSSL would have the great advantage of being able to do all the verification/signing/key generation in Python. But it's missing an infrastructure to revoke keys, unless you also implement SSL key revocation mechanisms and have users get official paid/free SSL client certificates from certificate vendors that provide CRLs or support OTRS. At that point, the SSL infrastructure becomes just as difficult to deal with as GPG/PGP, so there isn't much to win either way, IMO. You just have to deal with it...

David Wolever has sent me this link: https://github.com/singpolyma/OpenPGP-Python I guess it could also be implemented on top of openssl if Python provides bindings to RSA primitives.

If you really want to, you can do all this in pure Python, e.g. using http://stuvel.eu/rsa

-- Marc-Andre Lemburg
Re: [Catalog-sig] HTTPS now promoted on PyPI
On 19.02.2013 14:23, Giovanni Bajo wrote: On 19 Feb 2013, at 06:13, Richard Jones r1chardj0...@gmail.com wrote:

Hi all, I've just altered the nginx configuration to promote (ie. redirect to) HTTPS for all GET/HEAD requests. This includes HSTS, but I've set the lifetime to 1 day just in case there's some HTTPS compatibility issues. Once it's bedded down I'll bump it to a year.

What are the benefits of redirects? I think they just hide potential problems, and they still can be exploited by MITM through ssl-stripping. Plus, they cause breakage and/or UX problems in existing tools. Given that they give basically no security, I would suggest their removal until we fix all important issues in all third-party tools. For browsers, since you can still serve HSTS headers even without redirects, we can get it included in Chrome and Firefox builtin HSTS list.

2. incorporate some monkey-patching into distribute and setuptools and promote those,

I think this is our best bet for an immediate and global solution for outdated versions of Python as well. I will work to prepare a distutils patch that is compatible with 2.6 (which includes SSL), and then adapt it for 2.7 and 3.x. Do we have numbers of how many 2.5-compatible packages have been updated in the last 6 months?

Older Zope and Plone installations still use Python 2.4, so I guess that's the first version we'd have to support. zc.buildout is used by those, which in turn uses setuptools. AFAIR, the ssl module (https://pypi.python.org/pypi/ssl/) doesn't work well - we tried using it for our mxODBC Connect product and found too many issues/deficiencies, so dropped the idea. pyOpenSSL does support Python 2.4+ and does the job nicely.

-- Marc-Andre Lemburg
Re: [Catalog-sig] HTTPS now promoted on PyPI
On 19.02.2013 14:47, M.-A. Lemburg wrote: On 19.02.2013 14:23, Giovanni Bajo wrote: Il giorno 19/feb/2013, alle ore 06:13, Richard Jones r1chardj0...@gmail.com ha scritto: Hi all, I've just altered the nginx configuration to promote (ie. redirect to) HTTPS for all GET/HEAD requests. This includes HSTS, but I've set the lifetime to 1 day just in case there's some HTTPS compatibility issues. Once it's bedded down I'll bump it to a year. What is the benefits of redirects? I think they just hide potential problems, and they still can be exploited by MITM through ssl-stripping. Plus, they cause breakage and/or UX problems in existing tools. Given that they give basically no security, I would suggest their removal until we fix all important issues in all third-party tools. For browsers, since you can still serve HSTS headers even without redirects, we can get it included in Chrome and Firefox builtin HSTS list. 2. incorporate some monkey-patching into distribute and setuptools and promote those, I think this is our best bet for an immediate and global solution for outdated versions of Python as well. I will work to prepare a distutils patch that is compatible with 2.6 (which includes SSL), and then adapt it for 2.7 and 3.x. Do we have numbers of how many 2.5-compatible packages have been updated in the last 6 months? Older Zope and Plone installations still use Python 2.4, so I guess that's the first version we'd have to support. zc.buildout is used by those, which in return uses setuptools. AFAIR, the ssl module (https://pypi.python.org/pypi/ssl/) doesn't work well - we tried using it for our mxODBC Connect product and found too many issues/deficiencies, so dropped the idea. pyOpenSSL does support Python 2.4+ and does the job nicely. 
These are the stats for binary files hosted on PyPI, broken down by Python version and based on the new stats file Richard uploaded:

# wc *.csv
     485     485    24074 2013-02-19-py2.3.csv
    6458    6458   389553 2013-02-19-py2.4.csv
    6639    6659   353739 2013-02-19-py2.5.csv
    7629    7631   426457 2013-02-19-py2.6.csv
    5519    5526   295462 2013-02-19-py2.7.csv
    1351    1355    70731 2013-02-19-py3.x.csv
  154857  155175  7917838 2013-02-19-totals.csv

Broken down by file types:

# wc *files.csv
   25585   25598  1431013 2013-02-19-egg-files.csv
    4619    4640   236694 2013-02-19-exe-files.csv
     254     255    13402 2013-02-19-msi-files.csv
  104691  104853  5251962 2013-02-19-tar-gz-files.csv
      24      24     1221 2013-02-19-whl-files.csv
   17937   18022   905913 2013-02-19-zip-files.csv
  153110  153392  7840205 total

I'm sure a lot more useful information could be extracted from the stats.

-- Marc-Andre Lemburg
Re: [Catalog-sig] Massive download problems using https://
Same here. The web interface got really slow after the switch.

On 19.02.2013 14:55, Andreas Jung wrote: Hi there, since the switch to https:// I have massive problems running larger buildouts. After every second or third pulled package I receive a connection reset by peer error. Andreas

-- Marc-Andre Lemburg
[Catalog-sig] Problem switching to https://pypi.python.org/pypi (and work-around)
I wanted to switch to the HTTPS address of PyPI today, but the change in my .pypirc did not result in the expected seamless upgrade ;-)

Here's my working .pypirc (fairly standard):

    [distutils]
    index-servers = pypi

    [pypi]
    repository = http://pypi.python.org/pypi
    username = xyz
    password = abc

If I change just the http:// to https://, the register command asks me for login details (without doing any communication with the server). The same happens when changing the repository to anything other than the above default, e.g. http://www.python.org/pypi

I tried this with Python 2.6 and 2.7.

Looking at the code in Lib/distutils/config.py this appears to be due to the ._read_pypirc() method choosing the server entry based on the repository that was given as command line option to the register command (or the default setting), so the change in .pypirc to https:// is not enough to get distutils updated; I'll also have to add a [register] section to the ~/.pydistutils.cfg. I would have expected distutils to look up pypi and simply use whatever is defined there as repository.

After looking closer, I found there's a trick one can use to avoid the pydistutils.cfg change. If the server section is named after the default repository URL, distutils will use the section and still read the (new) repository URL:

    [distutils]
    index-servers = http://pypi.python.org/pypi

    [http://pypi.python.org/pypi]
    repository = https://pypi.python.org/pypi
    username = xyz
    password = abc

I verified with wireshark that this does result in HTTPS communication.

-- Marc-Andre Lemburg
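The lookup behaviour described in this message can be modelled in a few lines. The following is a simplified sketch written for illustration, not the distutils source: select_server is a hypothetical name, and the rule it applies (a section is chosen when either its name or its repository value equals the repository the command was asked to use) is taken from the description above.

```python
from configparser import RawConfigParser

# Default repository URL at the time of the thread (as quoted in the message).
DEFAULT_REPOSITORY = "http://pypi.python.org/pypi"


def select_server(pypirc_text, repository=DEFAULT_REPOSITORY):
    """Return the .pypirc section chosen for `repository`, or {} if none matches.

    Simplified model of the behaviour described above, not the distutils code.
    """
    parser = RawConfigParser()
    parser.read_string(pypirc_text)
    if not parser.has_option("distutils", "index-servers"):
        return {}
    for name in parser.get("distutils", "index-servers").split():
        if not parser.has_section(name):
            continue
        entry = {"server": name, "repository": DEFAULT_REPOSITORY}
        entry.update(parser.items(name))
        # A section matches by its name or by its repository value.
        if repository in (entry["server"], entry["repository"]):
            return entry
    return {}
```

Under this model the plain https:// edit matches nothing, because neither the section name "pypi" nor the new repository value equals the default URL, while the section named after the default URL matches by name and still yields the https:// repository.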
Re: [Catalog-sig] Mandatory Reset of PyPI Passwords
On 14.02.2013 00:17, Richard Jones wrote: On 13 February 2013 22:32, Giovanni Bajo ra...@develer.com wrote: On 13 Feb 2013, at 12:14, Richard Jones rich...@python.org wrote:

2. fix the email password reset debacle (mostly written, not tested),

Is this committed anywhere I can take a look?

It will be presently. In short, the old procedure was:

1. user enters username in form and is emailed a link back to PyPI which embeds the username and password,
2. user clicks link and, on receiving both username and email address a new password is generated and mailed to the email address.

If the user knows both the username and email address they can skip straight to step 2.

The new scheme involves:

1. user enters username in "I've forgotten my password" form,
2. PyPI emails user with a link back to itself with a reset OTK (32 random chars from letters+digits) valid for 6 hours,
3. On clicking the link the user sees a password reset form where they enter their new password, and
4. On submitting the reset form the OTK is deleted and password changed.

If an invalid username is entered PyPI will say so: the set of pypi usernames is public anyway through APIs and general web scraping and this behaviour is more user-friendly than the more common "I may or may not have emailed you a reset email".

Thanks for sending the scheme. To help prevent phishing attacks, you could add a user token field to the form in step 1, which is sent in the step 2 email. A user can then more easily detect whether s/he requested the password reset. VISA/MasterCard use a similar approach with their user defined welcome message.

The scheme does not protect against email sniffing attacks, but I'm not sure how that could be done without adding some form of two factor authentication.
Here's the scenario:

* it's PyCon again
* attacker sets up a script that runs the password reset form for a few hundred interesting accounts
* attacker sets up a WLAN sniffer to look for pypi reset emails
* attacker starts the script and waits for reset emails
* attacker finds an email, uses the link and sets a new password on the account before the account owner can intervene

While many people will probably use secure methods to access their email, there will likely be some that also receive emails on mobile phones or pads using plain text transmission.

It may be helpful to add some form of surge protection to PyPI to detect and help prevent the above attack. Such a surge protection would likely also help detect unwanted PyPI crawling.

-- Marc-Andre Lemburg
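The one-time key from the new reset scheme (32 random characters from letters and digits, valid for 6 hours, deleted on use) can be sketched as below. The ResetTokens class, its in-memory dictionary and the injectable clock are assumptions for illustration; the thread does not say how PyPI stores or generates its keys.

```python
import secrets
import string
import time

TOKEN_ALPHABET = string.ascii_letters + string.digits
TOKEN_LENGTH = 32
TOKEN_LIFETIME = 6 * 60 * 60  # six hours, in seconds


class ResetTokens:
    """In-memory store of pending reset keys (a real site would persist them)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._pending = {}  # token -> (username, expiry timestamp)

    def issue(self, username):
        """Create a fresh one-time key for `username` and remember its expiry."""
        token = "".join(secrets.choice(TOKEN_ALPHABET) for _ in range(TOKEN_LENGTH))
        self._pending[token] = (username, self._clock() + TOKEN_LIFETIME)
        return token

    def redeem(self, token):
        """Return the username for a valid key, deleting it so it cannot be reused."""
        entry = self._pending.pop(token, None)
        if entry is None:
            return None
        username, expiry = entry
        if self._clock() > expiry:
            return None
        return username
```

Note that this addresses only key generation, expiry and single use; it does nothing against the sniffing scenario above, where the attacker reads the key out of the email before the owner does.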
Re: [Catalog-sig] Allowing the upload of .py files at PyPI
On 14.02.2013 20:28, Tarek Ziadé wrote:

Hello Some tools (setuptools, distribute, zope, pip) use bootstrap files to get installed, In order to have a more secured installation process, we'd like to be able to push those files on PyPI so people can download them through https using the PSF certificate. As Phillip Eby noticed, that requires changing this method https://bitbucket.org/loewis/pypi/src/f18ce0fbe947c1ce778761ea81d6704572cebb24/webui.py?at=default#cl-2233 by:

- allowing .py extensions,
- allowing arbitrary file names when they have the .py extension

Any objection if I provide a pull request for this ?

+1

-- Marc-Andre Lemburg
Re: [Catalog-sig] Allowing the upload of .py files at PyPI
On 14.02.2013 23:10, Nick Coghlan wrote: On 15 Feb 2013 05:50, Tarek Ziadé ta...@ziade.org wrote: On 2/14/13 8:37 PM, Donald Stufft wrote: On Thursday, February 14, 2013 at 2:28 PM, Tarek Ziadé wrote:

Hello Some tools (setuptools, distribute, zope, pip) use bootstrap files to get installed, In order to have a more secured installation process, we'd like to be able to push those files on PyPI so people can download them through https using the PSF certificate. As Phillip Eby noticed, that requires changing this method https://bitbucket.org/loewis/pypi/src/f18ce0fbe947c1ce778761ea81d6704572cebb24/webui.py?at=default#cl-2233 by:

- allowing .py extensions,
- allowing arbitrary file names when they have the .py extension

Arbitrary file names is a bad idea imo. What's to stop me from uploading setup_distribute.py and linking to it as if it was distribute_setup.py and installing a malware'd distribute.

If you can upload in that location, it means you are a legit owner/maintainer of the project AFAIK

I'm more concerned about phishing style attacks. I don't want the PyPI admins to have to start scanning for hostile names like distirbute. So how often do the bootstrap files change? If relatively frequently, I would prefer this to be a project-specific privilege granted by the PyPI admins (at least for now). If rarely, then I'd be happy enough if the update process required PyPI admin involvement (the project whitelist is probably a better idea, though).

I don't follow the reasoning here. What's the difference between uploading a .py file and a .tar.gz file ? AFAIK, the only reason why the file extensions are restricted is to prevent people from uploading MP3s, movies or other material that doesn't belong on PyPI - not because there are security concerns.

-- Marc-Andre Lemburg
Re: [Catalog-sig] Allowing the upload of .py files at PyPI
On 14.02.2013 23:38, Donald Stufft wrote:
On Thursday, February 14, 2013 at 5:34 PM, M.-A. Lemburg wrote: I don't follow the reasoning here. What's the difference between uploading a .py file and a .tar.gz file? AFAIK, the only reason why the file extensions are restricted is to prevent people from uploading MP3s, movies or other material that doesn't belong on PyPI - not because there are security concerns.
Personally (might be different for Nick) it's less a problem with uploading .py files and more a problem with allowing arbitrary names.
Ok, then I guess allowing [a-zA-Z0-9_-]+\.py is enough for starters... we don't need to support the whole Unicode range on PyPI ;-)
-- Marc-Andre Lemburg
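The filename rule suggested in this message could be checked along the following lines. This is only a sketch: the helper name is made up for illustration and is not part of the PyPI code base, and the pattern is the one quoted in the message.

```python
import re

# The pattern is the one quoted in the message. fullmatch() anchors it to the
# whole string, so path separators or extra suffixes are rejected.
ALLOWED_SCRIPT_NAME = re.compile(r"[a-zA-Z0-9_-]+\.py")


def is_allowed_script_name(filename: str) -> bool:
    """Hypothetical helper: True for a plain ASCII name ending in .py."""
    return ALLOWED_SCRIPT_NAME.fullmatch(filename) is not None


if __name__ == "__main__":
    for name in ("distribute_setup.py", "ez-setup.py", "../evil.py", "setup.py.exe"):
        print(name, is_allowed_script_name(name))
```

Note that a character whitelist like this does nothing against lookalike names such as setup_distribute.py, which is the phishing concern raised earlier in the thread.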
Re: [Catalog-sig] Mandatory Reset of PyPI Passwords
On 13.02.2013 13:36, Jesse Noller wrote: Direct email has a better conversion rate, that's basic marketing :)
Indeed, and if you can handle the load of support emails, that's certainly a better way to do this. It depends on how many accounts there are on PyPI. I'm just mentioning this, because we had to make the same decision a while ago and decided against doing emails.
On Feb 13, 2013, at 7:27 AM, M.-A. Lemburg m...@egenix.com wrote:
On 13.02.2013 13:13, Antoine Pitrou wrote:
Richard Jones richard at python.org writes: 3. send email to all registered users indicating that all users must change their password and a forced reset will take place in a week's time for users who have not done so, and
What about users who've already changed their password?
Depending on the number of users you might rather want to use a banner on the website and a blog post instead of emailing them directly. Given the 11k users on the Python wiki, we chose not to send out emails... just think of the number of emails with questions you'd get and have to answer. Regarding the timing, I'd use a longer period. People don't do releases every two weeks and you normally don't check in to PyPI to search for a package.
-- Marc-Andre Lemburg
Re: [Catalog-sig] Mandatory Reset of PyPI Passwords
Hi Richard,
On 13.02.2013 12:14, Richard Jones wrote: My intention is to: 2. fix the email password reset debacle (mostly written, not tested),
Could you post a description of the new procedure? Not that I wouldn't trust your capabilities :-) ... I just think more eyes would be good to make sure it's waterproof and doesn't allow hijacking accounts by sniffing email traffic.
Thanks,
-- Marc-Andre Lemburg
Re: [Catalog-sig] Pull request to migrate PyPI to bcrypt
Richard Jones wrote: Given the discussion on the pull request I think I'll hold off. There seems to be some question regarding its appropriateness which I'm not really in a position to judge.
FWIW, the DoS problem with the multi-round hash algorithms was also an issue for moin. They chose to use passlib with moin:
http://hg.moinmo.in/moin/1.9/file/tip/docs/CHANGES
and the default hash algorithm is sha512_crypt. Everything was made configurable to be able to easily switch algorithms and use a different number of rounds to adjust for the use cases. See these links for a discussion on the hash algorithms and rounds values:
http://pythonhosted.org/passlib/new_app_quickstart.html#choosing-a-hash
http://pythonhosted.org/passlib/password_hash_api.html#choosing-the-right-rounds-value
Also note that these password hashes mainly protect against the case where a user uses the same password for multiple services. If an attacker gets access to the stored password hashes, he'll already have all the power he needs to change any aspect of an arbitrary number of accounts, including changing the passwords, so we're not gaining any protection for *PyPI* by using a high number of rounds. The additional number of rounds only protects against use of the passwords on other services. As a result, the DoS problem weighs more in this context than the protection against brute-force or rainbow table attacks.
Let's please not get paranoid over all this. As long as the parameters remain configurable, we can approach these things in small steps and don't need to get all tied up in discussions about how to turn PyPI into Fort Knox :-)
Richard
On 10 February 2013 21:57, Richard Jones rich...@python.org wrote: Thanks, I'll be reviewing that tomorrow if Martin doesn't beat me to it. Richard
On 10 February 2013 14:26, Giovanni Bajo ra...@develer.com wrote: Hi, I went ahead with an important task in my security design doc: migration of PyPI to bcrypt. This is the pull request: https://bitbucket.org/loewis/pypi/pull-request/2/use-bcrypt-instead-of-unsalted-sha1/diff -- Giovanni Bajo :: ra...@develer.com Develer S.r.l. :: http://www.develer.com My Blog: http://giovanni.bajo.it
-- Marc-Andre Lemburg
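To illustrate what "keeping the parameters configurable" might look like, here is a minimal sketch. It deliberately swaps in PBKDF2 from the standard library (hashlib.pbkdf2_hmac) instead of passlib's sha512_crypt or bcrypt, so it is not what moin or the pull request does; HashConfig, hash_password and verify_password are hypothetical names, and the stored-string layout is an assumption made for the example.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class HashConfig:
    """Hypothetical settings object; rounds trades login cost against DoS risk."""
    algorithm: str = "sha512"
    rounds: int = 10_000
    salt_bytes: int = 16


def hash_password(password: str, config: HashConfig = HashConfig()) -> str:
    # Store algorithm and rounds next to the digest so they can change later.
    salt = os.urandom(config.salt_bytes)
    digest = hashlib.pbkdf2_hmac(
        config.algorithm, password.encode("utf-8"), salt, config.rounds
    )
    return f"{config.algorithm}${config.rounds}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    # Re-derive with the parameters recorded in the stored string.
    algorithm, rounds, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        algorithm, password.encode("utf-8"), bytes.fromhex(salt_hex), int(rounds)
    )
    return hmac.compare_digest(candidate.hex(), digest_hex)
```

Because the algorithm name and rounds value travel with each hash, old entries keep verifying after the configured defaults are raised or lowered.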
Re: [Catalog-sig] Pull request to migrate PyPI to bcrypt
Giovanni Bajo wrote:
On 11 Feb 2013, at 13:25, Jesse Noller jnol...@gmail.com wrote: Actually I was thinking about this in the shower: the likelihood that pypi users used the same passwords as they did on the wiki is probably much higher than any of us assume.
Given that the passwords were unsalted in both instances, a set intersection is enough to verify.
The moin wiki passwords were salted. The reason we reset the passwords was that the attackers had access to both the salt and the hashes.
-- Marc-Andre Lemburg
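The point about set intersections can be made concrete with a small sketch. The user names, passwords and salts below are invented for the example, and the two helper functions are hypothetical; the sketch only shows why identical unsalted hashes line up across two sites while salted ones do not.

```python
import hashlib


def unsalted_sha1(password: str) -> str:
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def salted_sha1(password: str, salt: str) -> str:
    return hashlib.sha1((password + salt).encode("utf-8")).hexdigest()


# Invented sample data: alice reuses her password on both sites.
site_a = {"alice": "hunter2", "bob": "correct horse"}
site_b = {"alice": "hunter2", "bob": "something else"}

# Unsalted: equal passwords give equal hashes, so a set intersection finds reuse.
shared_unsalted = {unsalted_sha1(p) for p in site_a.values()} & {
    unsalted_sha1(p) for p in site_b.values()
}

# Salted with different salts per site: the same password hashes differently.
shared_salted = {salted_sha1(p, "a1b2c3d") for p in site_a.values()} & {
    salted_sha1(p, "x9y8z7w") for p in site_b.values()
}

if __name__ == "__main__":
    print(len(shared_unsalted), len(shared_salted))
```

With salted hashes an attacker who holds the salts can still test guesses one by one, which is why the wiki passwords were reset anyway.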
Re: [Catalog-sig] Pull request to migrate PyPI to bcrypt
On 11.02.2013 14:49, Christian Heimes wrote:
On 11.02.2013 14:38, Donald Stufft wrote:
On Monday, February 11, 2013 at 8:15 AM, M.-A. Lemburg wrote:
Giovanni Bajo wrote:
On 11 Feb 2013, at 13:25, Jesse Noller jnol...@gmail.com wrote: Actually I was thinking about this in the shower: the likelihood that pypi users used the same passwords as they did on the wiki is probably much higher than any of us assume.
Given that the passwords were unsalted in both instances, a set intersection is enough to verify.
The moin wiki passwords were salted. The reason we reset the passwords was that the attackers had access to both the salt and the hashes.
What were they hashed with? Even with a salt a fast hash is trivial to bruteforce for a large number of passwords in practically no time with trivial hardware.
It uses SSHA, that's sha1(password + salt) with a seven char salt.
Right, should have added that information. BTW: I wonder why salt and password are usually stored together in the same place. The moin implementation also did not add any application salt to the password string before calculating the hash value (i.e. x = hash(random_salt + application_salt + password)). Not sure whether passlib does, either.
-- Marc-Andre Lemburg
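The two constructions mentioned in this message can be written down as follows. This is a sketch of the formulas as stated in the thread, not moin's or passlib's actual code; the function names and the sample application salt are assumptions. SHA-1 is used only because the thread names it, and as noted above a fast hash remains cheap to brute-force.

```python
import hashlib


def ssha_digest(password: bytes, salt: bytes) -> str:
    """sha1(password + salt), as described for the wiki in the thread."""
    return hashlib.sha1(password + salt).hexdigest()


def app_salted_digest(password: bytes, random_salt: bytes, application_salt: bytes) -> str:
    """hash(random_salt + application_salt + password), the variant MAL asks about.

    The application salt is assumed to live in the application's configuration
    rather than next to the hashes, so a dump of the user table alone is not
    enough to test guesses.
    """
    return hashlib.sha1(random_salt + application_salt + password).hexdigest()


if __name__ == "__main__":
    salt = b"abcdefg"  # seven characters, stored with the hash
    print(ssha_digest(b"hunter2", salt))
    print(app_salted_digest(b"hunter2", salt, b"value-from-config"))
```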
Re: [Catalog-sig] Pull request to migrate PyPI to bcrypt
On 11.02.2013 20:41, Antoine Pitrou wrote:
M.-A. Lemburg mal at egenix.com writes: Let's please not get paranoid over all this. As long as the parameters remain configurable, we can approach these things in small steps and don't need to get all tied up in discussions about how to turn PyPI into Fort Knox
Fort Knox is in the US, which would create issues with crypto export laws, no?
Oh, PyPI is as well... and it comes with built-in export via the mirror framework ;-)
Here's some background information, if you want to host crypto code on PyPI:
http://www.bis.doc.gov/encryption/encfaqs6_17_02.html#2
http://www.ecfr.gov/cgi-bin/retrieveECFR?gp=SID=306f1d2ac33ae74f6d060e14258309bar=PARTn=15y2.1.3.4.22#15:2.1.3.4.22.0.1.3
http://www.ecfr.gov/cgi-bin/text-idx?c=ecfrrgn=div5view=textnode=15:2.1.3.4.25idno=15#15:2.1.3.4.25.0.1.13
For a more comprehensible writeup, see how the Apache foundation handles these things:
http://www.apache.org/dev/crypto.html
Finally, to make things more complicated, this whole EAR thing is constantly changing, see e.g.
http://cryptome.org/0003/bis010711.htm
http://www.bis.doc.gov/encryption/default.htm
http://www.bis.doc.gov/encryption/question2.htm
At the moment, it's probably best to send an email to the BIS if you make the software available at some URL for the first time. That satisfies the TSU reporting requirements.
-- Marc-Andre Lemburg
Re: [Catalog-sig] PyPI and setuptools
Giovanni Bajo wrote:
On 10 Feb 2013, at 00:43, M.-A. Lemburg m...@egenix.com wrote:
On 10.02.2013 00:13, Stephen Thorne wrote: Hello, One of my concerns with the recent pip dramas that have seen some excellent and timely action from catalog-sig and others, is that 'setuptools' is still widely distributed and used instead of distribute/pip.
Just as a data point: distribute isn't using HTTPS either and the distribute bootstrap site doesn't work with HTTPS: http://python-distribute.org/ (https://python-distribute.org/ gives Error code: ssl_error_rx_record_too_long in Firefox) By redirecting the PyPI main and mirror sites from HTTP to HTTPS you can upgrade older installations.
Alas, this redirection wouldn't fix the main issue, because a MITM can still proxy the connection, swallow the redirection, and insert malware in the downloaded package. The only way to really fix it is to patch all PyPI clients, including distribute.
The main problem at the moment is transferring passwords in plain text :-) If you gain access to the password of an account that manages popular packages, you don't need any of the MITM attacks - you simply modify the existing packages on the PyPI server. Moving to HTTPS will be a first step in making this harder.
An alternative approach would be to make people more aware of the possibility to configure the PyPI site URL in a distutils config file (even globally) and changing the URL from HTTP to HTTPS there:
* distutils config files: http://docs.python.org/2/install/index.html#inst-config-files
* setuptools: http://peak.telecommunity.com/DevCenter/EasyInstall#configuration-files http://peak.telecommunity.com/DevCenter/EasyInstall#command-line-options (the option is called --index-url)
* distribute: http://pythonhosted.org/distribute/easy_install.html#configuration-files http://pythonhosted.org/distribute/easy_install.html#reference-manual (the option is called --index-url)
The problem with this approach is that the Python standard library does not validate SSL certificates. So even if you force a urllib-based tool to access PyPI through https, it doesn't help at all in case of a MITM attack.
I know, but it's already a lot better than using HTTP (see above) :-) If we could get all servers talking HTTPS using validating certificates, that would already be a major step forward. This includes servers that provide bootstrapping for distribute/setuptools and pip, as well as the main PyPI server and all mirrors. PyPI will soon get a validating certificate. I'm not sure about distribute and the mirror servers.
-- Marc-Andre Lemburg
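For readers on current Python versions, the missing certificate validation discussed here looks roughly like the sketch below. It relies on ssl.create_default_context(), which was added to the standard library after this thread (Python 3.4 and 2.7.9), so it is not something the tools of the time could use; verifying_context and fetch are hypothetical helper names.

```python
import ssl
import urllib.request


def verifying_context() -> ssl.SSLContext:
    """Context that checks the certificate chain and the host name."""
    # create_default_context() loads the system CA store and, for client use,
    # sets verify_mode to CERT_REQUIRED and check_hostname to True.
    return ssl.create_default_context()


def fetch(url: str, timeout: float = 10.0) -> bytes:
    """Hypothetical download helper that refuses plain HTTP."""
    if not url.startswith("https://"):
        raise ValueError("refusing to fetch over an unencrypted connection")
    with urllib.request.urlopen(url, timeout=timeout, context=verifying_context()) as response:
        return response.read()
```

With such a context, a man in the middle presenting a certificate that does not chain to a trusted CA or does not match the host name causes the connection to fail instead of silently succeeding.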
Re: [Catalog-sig] PyPI doesn't serve the correct mimetypes
On 10.02.2013 18:00, Antoine Pitrou wrote:
$ curl -I http://pypi.python.org/packages/source/z/zope.interface/zope.interface-4.0.3.tar.gz
HTTP/1.1 200 OK
Server: nginx/1.1.19
Date: Sun, 10 Feb 2013 16:59:29 GMT
Content-Type: application/octet-stream
Content-Length: 140124
Last-Modified: Mon, 31 Dec 2012 18:23:12 GMT
Connection: keep-alive
Accept-Ranges: bytes
But:
>>> mimetypes.guess_type("foo.tar.gz")
('application/x-tar', 'gzip')
I suppose this is done to make sure that browsers open the "save as" dialog on the original file. If you instead set the correct MIME type on these files, browsers often tend to transparently decompress the files before offering to save them, which is not what you want in this use case.
-- Marc-Andre Lemburg
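The lookup quoted in this message can be reproduced with the standard library. The sketch below also shows how the two parts of the result would typically map onto response headers, which is an assumption about server behaviour made for illustration (describe_download is a hypothetical helper, not PyPI code).

```python
import mimetypes


def describe_download(filename: str) -> dict:
    """Hypothetical helper: headers a server might derive from the file name."""
    content_type, encoding = mimetypes.guess_type(filename)
    headers = {"Content-Type": content_type or "application/octet-stream"}
    if encoding:
        # A Content-Encoding header is what invites clients to decompress
        # transparently, the effect described in the message.
        headers["Content-Encoding"] = encoding
    return headers


if __name__ == "__main__":
    print(mimetypes.guess_type("foo.tar.gz"))
    print(describe_download("foo.tar.gz"))
```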
Re: [Catalog-sig] PyPI doesn't serve the correct mimetypes
On 10.02.2013 18:11, Antoine Pitrou wrote:
M.-A. Lemburg mal at egenix.com writes:
On 10.02.2013 18:00, Antoine Pitrou wrote:
$ curl -I http://pypi.python.org/packages/source/z/zope.interface/zope.interface-4.0.3.tar.gz
HTTP/1.1 200 OK
Server: nginx/1.1.19
Date: Sun, 10 Feb 2013 16:59:29 GMT
Content-Type: application/octet-stream
Content-Length: 140124
Last-Modified: Mon, 31 Dec 2012 18:23:12 GMT
Connection: keep-alive
Accept-Ranges: bytes
But:
>>> mimetypes.guess_type("foo.tar.gz")
('application/x-tar', 'gzip')
I suppose this is done to make sure that browsers open the "save as" dialog on the original file. If you instead set the correct MIME type on these files, browsers often tend to transparently decompress the files before offering to save them, which is not what you want in this use case.
It's precisely what I want, actually. I wanted the file to open in the archive manager, not in my text editor (or any other program chosen at random by the OS / browser).
Well, yes, but you normally want to be able to save the original foo.tar.gz file instead of the foo.tar file. Perhaps PyPI could send the mime type application/x-compressed-tar to allow having both.
-- Marc-Andre Lemburg
Re: [Catalog-sig] PyPI and setuptools
On 10.02.2013 00:13, Stephen Thorne wrote: Hello, One of my concerns with the recent pip dramas that have seen some excellent and timely action from catalog-sig and others, is that 'setuptools' is still widely distributed and used instead of distribute/pip.
Just as a data point: distribute isn't using HTTPS either and the distribute bootstrap site doesn't work with HTTPS: http://python-distribute.org/ (https://python-distribute.org/ gives Error code: ssl_error_rx_record_too_long in Firefox)
By redirecting the PyPI main and mirror sites from HTTP to HTTPS you can upgrade older installations. The only problem with this approach is that some Python installations may not have OpenSSL available, so HTTPS doesn't work for them. For those installations, the redirect would mean a complete cut-off from PyPI.
An alternative approach would be to make people more aware of the possibility to configure the PyPI site URL in a distutils config file (even globally) and changing the URL from HTTP to HTTPS there:
* distutils config files: http://docs.python.org/2/install/index.html#inst-config-files
* setuptools: http://peak.telecommunity.com/DevCenter/EasyInstall#configuration-files http://peak.telecommunity.com/DevCenter/EasyInstall#command-line-options (the option is called --index-url)
* distribute: http://pythonhosted.org/distribute/easy_install.html#configuration-files http://pythonhosted.org/distribute/easy_install.html#reference-manual (the option is called --index-url)
-- Marc-Andre Lemburg
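The config-file approach described above might look like the following sketch, which renders an [easy_install] section with the index URL switched to HTTPS. The URL https://pypi.python.org/simple/ and the helper name are assumptions for illustration; index_url is the config-file spelling of the --index-url option mentioned in the message, and the linked documentation remains the authority on where the file goes (for example a per-user ~/.pydistutils.cfg on Unix).

```python
import configparser
import io


def render_index_config(index_url: str) -> str:
    """Hypothetical helper: text of a distutils config file pinning the index URL."""
    if not index_url.startswith("https://"):
        raise ValueError("the point of this setting is to use an https:// index")
    # interpolation=None keeps any '%' in the URL from being treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser["easy_install"] = {"index_url": index_url}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Assumed HTTPS index URL; adjust to the index actually in use.
    print(render_index_config("https://pypi.python.org/simple/"))
```

As noted elsewhere in the thread, this only helps if the client also validates the server certificate.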
Re: [Catalog-sig] [Draft] Package signing and verification process
On 07.02.2013 12:49, Giovanni Bajo wrote:
On 7 Feb 2013, at 11:59, M.-A. Lemburg m...@egenix.com wrote: Sorry, if this has already been mentioned, but we could make GPG signing very user friendly for the PyPI users by:
- having the PyPI server verify the uploaded file against the registered GPG key of the uploader
- having the PyPI server sign the uploaded file using its own key (so you have two .asc signature files per upload - one coming directly from the uploader and another one from the PyPI server)
- having package managers verify the downloaded file against the signature applied by PyPI
Package managers would only have to know the PyPI public key for this to work. Users who want to apply an extra check could also verify the uploader's .asc signature file, but this would require downloading and installing the uploader's GPG key; in return for the extra work, they'd get two way verification, though. The concept is based on trust: PyPI trusts the uploader provided that s/he is using the registered GPG key. Package managers (and users) trust PyPI.
This has been already proposed (first mail in this thread), but I still fail to see, from a security perspective, what the additional signature performed by PyPI buys us. It is complicated and delicate to handle on the server side, it would require key management, rotation, etc. and I still don't see what is the point. As long as PyPI tells the client "key ABCD1234 is authoritative for package django", and it tells it through a (verified) SSL connection, I don't think the signature itself is useful. Can you please describe an attack that can be mounted against PyPI/pip that is prevented by having this additional signature?
This is not about preventing some kind of attack. It's to simplify the setup for the user of PyPI (via the package manager). The user will no longer have to install several tens or even hundreds of different uploader GPG keys locally just to be able to verify the downloads. Instead, just the PyPI key is needed. I think that's important to not disrupt the PyPI user experience. Additionally, as already mentioned by Lennart, all the GPG interaction could be handled by the package managers.
-- Marc-Andre Lemburg
Re: [Catalog-sig] [Draft] Package signing and verification process
On 07.02.2013 15:13, Giovanni Bajo wrote:
On 7 Feb 2013, at 12:55, M.-A. Lemburg m...@egenix.com wrote:
Can you please describe an attack that can be mounted against PyPI/pip that is prevented by having this additional signature?
This is not about preventing some kind of attack. It's to simplify the setup for the user of PyPI (via the package manager). The user will no longer have to install several tens or even hundreds of different uploader GPG keys locally just to be able to verify the downloads. Instead, just the PyPI key is needed. I think that's important to not disrupt the PyPI user experience. Additionally, as already mentioned by Lennart, all the GPG interaction could be handled by the package managers.
Yes, but *all* of the above requirements can be obtained by simply having PyPI tell pip "key ABCD1234 is authoritative for package django". pip can then tell GPG to go get the key automatically from a first-party or third-party keyserver (eg: launchpad). I'm absolutely *not* suggesting that the user go download tons of GPG keys manually.
I don't think anyone would want to have pip installing hundreds of PyPI uploader GPG keys locally, even less so, if just one is enough :-) I, for one, certainly wouldn't want to have my keyring cluttered up with all those GPG keys, or managing the trust state of all those keys to prevent GPG warnings such as:
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Having PyPI sign the file would also provide a possibility to keep files, for which the uploader key was later revoked or which expired, in a verifiable state.
I will draft an updated document, based on Heimes' proposal, so that we can all synchronize.
Ok.
-- Marc-Andre Lemburg
Re: [Catalog-sig] [Draft] Package signing and verification process
On 07.02.2013 16:04, Giovanni Bajo wrote: Il giorno 07/feb/2013, alle ore 15:35, M.-A. Lemburg m...@egenix.com ha scritto: On 07.02.2013 15:13, Giovanni Bajo wrote: Il giorno 07/feb/2013, alle ore 12:55, M.-A. Lemburg m...@egenix.com ha scritto: Can you please describe an attack that can be mounted against PyPI/pip that is prevented by having this additional signature? This is not about preventing some kind of attack. It's to simplify the setup for the user of PyPI (via the package manager). The user will no longer have to install several tens or even hundreds of different uploader GPG keys locally just to be able to verify the downloads. Instead, just the PyPI key is needed. I think that's important to not disrupt the PyPI user experience. Additionally, as already mentioned by Lennart, all the GPG interaction could be handled by the package managers. Yes, but *all* of the above requirements can be obtained by simply having PyPI tell pip key ABCD1234 is authoritative for package django. pip can then tell GPG to go getting the key automatically from a first-party or third-party keyserver (eg: launchpad). I'm absolutely *not* suggesting the user to go downloading tons of GPG keys manually. I don't think anyone would want to have pip installing hundreds of PyPI uploader GPG keys locally, even less so, if just one is enough :-) OK so we need to both make happy Jesse that doesn't even want pip to run GPG under the hood without him even realizing that gpg exists and is being used as a crypto primitive, and you that want to keep a clean keychain that might become too cluttered by too many keys :) I'm sure Jesse doesn't care if the GPG keychain (which he doesn't even want to have) becomes too cluttered, because he doesn't even want to learn how to dump the keychain contents, or to install a GUI tool to inspect it. 
I think this will be the case for the large majority of users that simply run apt-get install gpg once and then forget about it and go on with their normal pip work (with a fully transparent level of additional security).
I, for one, certainly wouldn't want to have my keyring cluttered up with all those GPG keys, or managing the trust state of all those keys to prevent GPG warnings such as:
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
You wouldn't need to manage the trust of any key. The trust is on PyPI. Once PyPI tells you that key ABCD1234 is trusted for package django, you just check whether there is a valid signature from ABCD1234 for all downloads related to package Django. That is irrespective of your default trust level for key ABCD1234. In other words, I don't think it's correct to use the trust level in the keychain; if ABCD1234 is Denis Bilenko's key, I don't want to say "this is trusted, so please install *any* package signed by ABCD1234"; I want to say "this is trusted FOR GEVENT, so please install gevent only if signed by Denis". This cannot be expressed by the GPG keychain trust levels. My idea is that PyPI will supply this list of trusts to users by default, eg: as a text configuration file, downloadable over HTTPS, that can be automatically updated by pip every once in a while:
gevent = abcd1234
django = 45678ad,bd14578,ce1244ab
[...]
Very advanced users might want to hand-edit it in some way (eg: trimming the list of packages, so that some packages cannot be installed on that system, so to block possible attack vectors), and even block automatic updates of such list from PyPI (so not to trust PyPI for it). (PS: I'm using short fingerprints in all my examples, but I'm aware of the security implications, and I think we should use the full key ID everywhere).
I'm not sure I follow you.
The GPG output is generated when running the verify command on a signature where you do have the key in the keyring, but have not set the trust setting of that key in the ring. The trust flag in GPG is normally used to indicate that you have checked that the key does indeed belong to the person it is assigned to. Such checks can be done at code signing parties, over the telephone, etc. You'd normally not set the trust flag without having gone through such a procedure, so the above warning won't go away. If you only have to manage one key, the PyPI key, you can verify the key by looking on the PyPI site and comparing fingerprints. It will most likely also get signatures from core developers, so that information can also be used to check the key. As a result, trusting this one key is well possible and easily manageable. You could even ship pip with a keyring that already has that key setup with the trust flag and use this keyring for verification, leaving the user's own keyring completely untouched. Having PyPI sign the file would also provide a possibility to keep files, for which
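The per-package trust list sketched by Giovanni in this exchange could be consumed roughly as follows. Nothing here is an existing pip or PyPI feature: the file format is the "package = key,key" layout from his example, and parse_trust_list and is_key_trusted_for are hypothetical helpers. Actual signature verification with GPG is out of scope; the sketch only answers "is this key allowed to sign this package?".

```python
def parse_trust_list(text: str) -> dict:
    """Parse lines of the form 'package = key1,key2' into {package: {keys}}."""
    trust = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments and anything without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        package, keys = line.split("=", 1)
        trust[package.strip().lower()] = {
            key.strip().lower() for key in keys.split(",") if key.strip()
        }
    return trust


def is_key_trusted_for(trust: dict, package: str, key_id: str) -> bool:
    """True only if key_id is listed for this specific package."""
    return key_id.lower() in trust.get(package.lower(), set())


if __name__ == "__main__":
    sample = "gevent = abcd1234\ndjango = 45678ad,bd14578,ce1244ab\n"
    trust = parse_trust_list(sample)
    print(is_key_trusted_for(trust, "gevent", "ABCD1234"))
    print(is_key_trusted_for(trust, "django", "abcd1234"))
```

This captures the distinction drawn in the thread: a key is trusted for a named package, not globally as with the GPG keyring trust flag.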
Re: [Catalog-sig] [Draft] Package signing and verification process
On 06.02.2013 21:33, Donald Stufft wrote: On Wednesday, February 6, 2013 at 3:31 PM, Vinay Sajip wrote: Donald Stufft donald.stufft at gmail.com (http://gmail.com) writes: * Do we have bindings to GPG that we can use? There's python-gnupg [1][2] which I maintain. I test it on Linux, Mac OS X and Windows. It relies on an already installed GnuPG executable being available, and works through the subprocess module to talk to it. It covers most GnuPG functions which don't require back-and-forth interaction with a user (such as editing keys). Regards, Vinay Sajip [1] https://code.google.com/p/python-gnupg/ [2] http://packages.python.org/python-gnupg/ Yea I'm actually aware of that, However it requires installing GPG like you said which is pretty unfriendly in general on Windows, and adds another barrier to release. Try gnupg-w32cli which is really easy to install and doesn't get in your way: http://lists.gnupg.org/pipermail/gnupg-announce/2012q1/000313.html -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 06 2013) Python Projects, Consulting and Support ... http://www.egenix.com/ mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/ : Try our mxODBC.Connect Python Database Interface for free ! :: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ ___ Catalog-SIG mailing list Catalog-SIG@python.org http://mail.python.org/mailman/listinfo/catalog-sig
Re: [Catalog-sig] [Draft] Package signing and verification process
On 06.02.2013 22:05, Jesse Noller wrote: On Wednesday, February 6, 2013 at 4:02 PM, Donald Stufft wrote: On Wednesday, February 6, 2013 at 4:01 PM, Vinay Sajip wrote: M.-A. Lemburg mal at egenix.com (http://egenix.com) writes: Try gnupg-w32cli which is really easy to install and doesn't get in your way: http://lists.gnupg.org/pipermail/gnupg-announce/2012q1/000313.html Or, to fast-track to the binaries, look in here: ftp://ftp.gnupg.org/gcrypt/binary/ As MAL says, installation with these installers is fairly painless. Average end user: What's a GPG Or even those of us familiar and using it day to day Oh Jeez not again Get Enigmail for GPG key management (http://www.enigmail.net/home/index.php) and the pain is over... :-)
Re: [Catalog-sig] Use user-specific site-packages by default?
On 05.02.2013 09:02, Lennart Regebro wrote: On Tue, Feb 5, 2013 at 8:42 AM, M.-A. Lemburg m...@egenix.com wrote: On 05.02.2013 02:36, Nick Coghlan wrote: Something that caught my attention in the recent security discussions is the observation that one of the most common insecure practices in the Python community is to run sudo pip with unsigned packages (sometimes on untrusted networks). To my mind, this is a natural reaction to the user experience of pip: you run pip install package, it complains it can't write to the system site packages directory, so you run sudo pip install package to give it the permissions it clearly wants. If pip used the user site packages by default (when running as anyone other than root), that dangerous UI flow wouldn't happen. Even when pip was run outside a virtualenv, it would just work from the user's perspective. It also has the advantage of keeping systems cleaner by default, since there will be a clear separation between system packages and pip-installed packages. Thoughts? -1. You'd be hiding a real problem by not telling the user that there's a permission problem to think about. One problem is that the user is trying to install some random package to the system python. This is only likely to happen on a personal machine (I do hope sysadmins have more sense than that) and installing it to user site packages will then still make it available for all python software that uses the system python that runs under that user. And that's probably quite a lot. Hence security issues remain, in that this package can get picked up by other python software running, and on Linux systems, that's quite a lot. :-) But, it's still a lot better than running it with sudo, in which case the setup.py file could simply decide to install a rootkit. That said, I think it would be better to explain to the user what happens.
I could imagine that if you try to install where you don't have the right, it asks if you meant to install it to the user site packages or to a virtualenv, for example? That would be a much better idea, IMO. The solution Nick proposed also has another issue: it would install packages meant for a virtualenv in the user's site packages dir (outside the virtualenv)... If pip used the user site packages by default (when running as anyone other than root),... Looks like a slippery road if you try to make pip guess what the right installation dir should be, e.g. by trying to detect that it's running in a virtualenv, the Python3 venv, pyrun or a user's local Python installation.
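The "ask instead of guess" behaviour discussed in this message might look something like the sketch below. It does not reflect how pip actually behaves; the function names and the returned strings are hypothetical. The virtualenv check relies on sys.base_prefix (set by the Python 3 venv module) and sys.real_prefix (set by older releases of the virtualenv tool), and it does not attempt to recognise pyrun or a user's private Python build, which is exactly the kind of guessing the message warns about.

```python
import os
import sys
import sysconfig


def in_virtualenv():
    """Best-effort check for an active virtualenv / venv environment."""
    # Older virtualenv releases set sys.real_prefix; PEP 405 venvs make
    # sys.base_prefix differ from sys.prefix.
    return hasattr(sys, "real_prefix") or (
        getattr(sys, "base_prefix", sys.prefix) != sys.prefix
    )


def suggest_install_action(target_dir, writable=None, venv=None):
    """Decide what to tell the user instead of silently picking a directory.

    writable and venv can be passed explicitly (useful for testing);
    otherwise they are detected from the running interpreter.
    """
    if writable is None:
        writable = os.access(target_dir, os.W_OK)
    if venv is None:
        venv = in_virtualenv()
    if writable:
        return "install"
    if venv:
        # Never fall back to the user site from inside a virtualenv.
        return "error: virtualenv directory is not writable"
    return "ask: install to user site-packages or create a virtualenv?"


if __name__ == "__main__":
    # Check the interpreter's own site-packages directory.
    print(suggest_install_action(sysconfig.get_paths()["purelib"]))
```

The permission problem stays visible to the user in every branch, which addresses the -1 above, and a failed write inside a virtualenv is reported rather than redirected to the user's site packages directory.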
Re: [Catalog-sig] Use user-specific site-packages by default?
On 05.02.2013 14:06, Lennart Regebro wrote: On Tue, Feb 5, 2013 at 1:51 PM, Donald Stufft donald.stu...@gmail.com wrote: PyPI will need to change for this to happen realistically if I recall. There is a hard limit on how large of a distribution can be uploaded to PyPI and there are, if I recall, valid distributions which are larger than that. Anyone know which ones? scipy is the largest I know of, at 6-7 MB. Personally I want the installers to only install from PyPI so my suggestion if this is something that (the proverbial) we want to do, PyPI should gain some notion of a soft limit for distribution upload (to protect against DoS) with the ability to increase that size limit for specific projects who can file a ticket w/ PyPI to have their limit increased. That sounds sensible. PyPI would need to be able to provide storage for a lot more distribution files than you typically find on PyPI nowadays to make the above practical. As an example: the files (sources, eggs, installers and prebuilt binaries, for 3 Python versions, two Unicode build variants, both 32/64-bit architectures and 4 different platforms) we host for our egenix-mx-base distribution use up 545MB for a single release.
Re: [Catalog-sig] Use user-specific site-packages by default?
On 05.02.2013 14:18, Donald Stufft wrote: On Tuesday, February 5, 2013 at 8:13 AM, Lennart Regebro wrote: That will mean that a man-in-the-middle attack might poison PyPI's cache. I don't think that's a feasible path forward. Packages do not need to be cached, as they are not supposed to change. If you change the package you should really release a new version. (Unless you made a mistake and discovered it before anyone actually downloaded it). So what you are proposing is really that PyPI downloads the package from an untrusted source, if the maintainer doesn't upload it. I prefer that we demand that the maintainer upload it. I agree with this. External packages are inherently less able to be validated than something uploaded to PyPI. We should not disguise them or make them appear to be something they aren't. Hmm, packages aren't validated on PyPI either. You'd need an appstore team for that :-) Note that file storage itself can be insecure without any problem. You only have to make sure that the contents of the downloaded file match the ones that the author registered with PyPI (and, of course, you have to make that registration process secure), regardless of where you downloaded it from. IMO, PyPI would scale a lot better if it were to only manage the metadata and security aspect of the package distribution and not also deal with distribution of the files themselves, but yeah, that's a different discussion ;-)
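The point about insecure file storage can be illustrated with a digest check: if the author registers a digest with the index over a secure channel, the file can come from any mirror or external host and still be verified. The sketch below assumes SHA-256 purely for illustration (the message does not name a hash function), and the function names are hypothetical.

```python
import hashlib
import hmac


def sha256_of(data: bytes) -> str:
    """Hex digest of the downloaded bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_registered_digest(data: bytes, registered_hex: str) -> bool:
    """Compare the download against the digest registered by the author.

    Where the bytes were fetched from does not matter; only the registered
    digest has to come from a trusted, secured source.
    """
    return hmac.compare_digest(sha256_of(data), registered_hex.strip().lower())


# Hypothetical example: the author registers the digest at upload time ...
registered = sha256_of(b"contents of example-1.0.tar.gz")

# ... and the installer later checks whatever it downloaded, from anywhere.
ok = matches_registered_digest(b"contents of example-1.0.tar.gz", registered)
tampered = matches_registered_digest(b"contents with a backdoor", registered)
```

This covers integrity only; as the message notes, the registration step itself still has to be secured, for example over HTTPS or with signatures.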
Re: [Catalog-sig] Use user-specific site-packages by default?
On 05.02.2013 02:36, Nick Coghlan wrote: Something that caught my attention in the recent security discussions is the observation that one of the most common insecure practices in the Python community is to run sudo pip with unsigned packages (sometimes on untrusted networks). To my mind, this is a natural reaction to the user experience of pip: you run pip install package, it complains it can't write to the system site packages directory, so you run sudo pip install package to give it the permissions it clearly wants. If pip used the user site packages by default (when running as anyone other than root), that dangerous UI flow wouldn't happen. Even when pip was run outside a virtualenv, it would just work from the user's perspective. It also has the advantage of keeping systems cleaner by default, since there will be a clear separation between system packages and pip-installed packages. Thoughts? -1. You'd be hiding a real problem by not telling the user that there's a permission problem to think about. Apart from that it's also not possible to do permission separation when everything is installed under the user account, e.g. it would be easy for malicious setup.py files to overwrite parts of the already installed modules with versions that contain nasty hooks, etc. The latter is what eventually killed the moin installations on wiki.python.org. The plugins directory was writeable by the user and the whole situation was very similar to the user packages setup you are describing above.
Re: [Catalog-sig] disabling the serving of links from description_html?
On 18.12.2012 15:54, Holger Krekel wrote: Hi Richard, hi all, While reading the pypi main and other sources i wondered how we could switch off serving links from description_html, at least on a per-project basis. It's really annoying that when you start to add some links to a long_description that installation of your package will thus slow down around the world. Even if you remove the links from the next release. How could we arrange for a maintainer to communicate to the pypi-server that a particular project should not ever serve links from description_html (and maybe not even from the homepage while we are at it)? Preferably it should be something that can be done from existing setup.py files, like adding a special trove-classifier or keyword. But a little custom tool or a web page form would be ok as well. If maintainers could easily switch off these extra links, then this means less stress for the pypi server and a global considerable speedup of installing python packages as often most of the pip/easy_install time is spent with checking out these URLs. Are you sure about this? AFAIK, setuptools/distribute only looks at links with rel=homepage or rel=download attributes, not all links on the PyPI project page. The links from the description don't receive such attributes. See e.g. http://pypi.python.org/simple/pytest/
Re: [Catalog-sig] disabling the serving of links from description_html?
On 18.12.2012 18:54, Holger Krekel wrote: On Tue, Dec 18, 2012 at 5:46 PM, M.-A. Lemburg m...@egenix.com wrote: On 18.12.2012 15:54, Holger Krekel wrote: Hi Richard, hi all, While reading the pypi main and other sources i wondered how we could switch off serving links from description_html, at least on a per-project basis. It's really annoying that when you start to add some links to a long_description that installation of your package will thus slow down around the world. Even if you remove the links from the next release. How could we arrange for a maintainer to communicate to the pypi-server that a particular project should not ever serve links from description_html (and maybe not even from the homepage while we are at it)? Preferably it should be something that can be done from existing setup.py files, like adding a special trove-classifier or keyword. But a little custom tool or a web page form would be ok as well. If maintainers could easily switch off these extra links, then this means less stress for the pypi server and a global considerable speedup of installing python packages as often most of the pip/easy_install time is spent with checking out these URLs. Are you sure about this? AFAIK, setuptools/distribute only looks at links with rel=homepage or rel=download attributes, not all links on the PyPI project page. The links from the description don't receive such attributes. See e.g. http://pypi.python.org/simple/pytest/ You are right, Marc. Only the download and home page links (from all versions ever published) are considered by pip/easy_install. I just examined it more closely via urlsnarf. There were so many in some projects and mixed with the other links so i didn't see it clearly before (although i did notice the rel classification). So to avoid the overhead one could retroactively remove all download links and maybe also all homepage links except the one for the latest version or so.
But that can be done without changes to pypi itself i guess. It may be useful to add rel=description to the links from the descriptions. That way, a download tool could more easily detect the origin of the links. And perhaps rel=distribution_file to links of the distribution files. Given that the simple index lists links for all releases, it may also be useful to add a new version=x.y.z attribute to the links, so that a download tool can more easily determine which links belong to which release. (More correct would be to add the version to the rel attribute, but doing so would break setuptools, since it does a substring search rather than parse the HTML.)
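A download tool that parses the index page, rather than doing a substring search, could group links by their rel attribute along these lines. The sketch uses only the standard library html.parser module; the sample markup and the class name are made up for illustration, and the proposed rel=description, rel=distribution_file and version attributes do not exist on the real index pages.

```python
from html.parser import HTMLParser


class RelLinkCollector(HTMLParser):
    """Collect href values of <a> tags, grouped by their rel attribute."""

    def __init__(self):
        super().__init__()
        self.links = {}  # rel value (or "(none)") -> list of hrefs

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # Links without a rel attribute are grouped under "(none)".
        rel = attr.get("rel") or "(none)"
        self.links.setdefault(rel, []).append(href)


# Made-up sample in the general shape of a simple index page.
SAMPLE = """
<a href="pkg-1.0.tar.gz">pkg-1.0.tar.gz</a>
<a href="http://example.com/" rel="homepage">1.0 home_page</a>
<a href="http://example.com/dl" rel="download">1.0 download_url</a>
"""

collector = RelLinkCollector()
collector.feed(SAMPLE)
```

A tool built this way could skip the homepage and download groups for older releases, and would not be confused if the rel value later carried extra information such as a version.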
Re: [Catalog-sig] current repo of pypi
On 30.11.2012 10:05, Holger Krekel wrote: Hello, The http://wiki.python.org/moin/CheeseShopDev page mentioned that the repo is undergoing migration. Is there some (even intermediate) url which i could pull today? AFAIK, this is still the current repo: https://bitbucket.org/loewis/pypi There was a discussion to get it moved to the PSF bitbucket account: http://python.6.n6.nabble.com/PyPI-code-now-on-bitbucket-td4622130.html but this doesn't appear to have happened yet: https://bitbucket.org/PSF