Re: [Distutils] Indexing modules in Python distributions

2017-02-13 Thread Donald Stufft
> On Feb 13, 2017, at 12:25 PM, Thomas Kluyver wrote: > > Thanks. So the current size is about 0.5 TB, and presumably if people > are maintaining full mirrors, PyPI itself can cope with that much > outgoing bandwidth being used. > Yea, PyPI does something like 16TB a day of bandwidth :) — Do

Re: [Distutils] Indexing modules in Python distributions

2017-02-13 Thread Thomas Kluyver
Thanks. So the current size is about 0.5 TB, and presumably if people are maintaining full mirrors, PyPI itself can cope with that much outgoing bandwidth being used. Steve & Chris: does downloading & scanning that volume of data sound like something you'd want to do on Azure? Does anyone there ha

Re: [Distutils] Indexing modules in Python distributions

2017-02-09 Thread Jeremy Stanley
On 2017-02-08 18:14:38 +0000 (+0000), Thomas Kluyver wrote: [...] > What I'm proposing differs in that it would need to download files from > PyPI - basically all of them, if we're thorough about it. I imagine > that's going to involve a lot of data transfer. Do we know what order of > magnitude we

Re: [Distutils] Indexing modules in Python distributions

2017-02-09 Thread Nick Coghlan
On 8 February 2017 at 19:14, Thomas Kluyver wrote: > What I'm proposing differs in that it would need to download files from PyPI > - basically all of them, if we're thorough about it. I imagine that's going > to involve a lot of data transfer. Do we know what order of magnitude we're > talking ab

Re: [Distutils] Indexing modules in Python distributions

2017-02-09 Thread Thomas Kluyver
On Wed, Feb 8, 2017, at 11:06 PM, Wes Turner wrote: > So, IIUC, > you're looking to emit > ((URL, release, platform), namespaces_odict) > for each new and all existing packages; > by uncompressing every package and running every setup.py (hopefully > in a container)? Something like that, yes
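
To make the record shape quoted above a little more concrete, here is a minimal sketch of one `((URL, release, platform), namespaces_odict)` entry. Everything named here is an assumption for illustration: `DistKey`, `make_record`, the example URL, and the choice to map each top-level name to a list of its submodules are not taken from the thread, which does not say what the ordered dict should contain.

```python
from collections import OrderedDict
from typing import NamedTuple


class DistKey(NamedTuple):
    """Hypothetical key identifying one downloadable file of a release."""
    url: str
    release: str
    platform: str


def make_record(url, release, platform, modules):
    """Build one (key, namespaces) pair.

    `modules` maps a top-level importable name to its submodules; this
    layout is an assumption, the thread only names a 'namespaces_odict'.
    """
    key = DistKey(url, release, platform)
    # Sort by name so the ordered dict has a stable, reproducible order.
    namespaces = OrderedDict(sorted(modules.items()))
    return key, namespaces


key, namespaces = make_record(
    "https://example.invalid/pyzmq-16.0.2.tar.gz",  # placeholder URL
    "16.0.2",
    "any",
    {"zmq": ["zmq.backend", "zmq.utils"]},
)
```

A named tuple keeps the key hashable, so records like this could be collected into a plain dict keyed by file.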

Re: [Distutils] Indexing modules in Python distributions

2017-02-08 Thread Wes Turner
On Wednesday, February 8, 2017, Thomas Kluyver wrote: > Thanks Steve, Chris, > > On Tue, Feb 7, 2017, at 04:49 PM, Chris Wilcox wrote: > > I may be able to help jump-start this a bit and provide a platform for > this to run on. I deployed a small service that scans PyPI to figure out > statistics

Re: [Distutils] Indexing modules in Python distributions

2017-02-08 Thread Thomas Kluyver
Thanks Steve, Chris, On Tue, Feb 7, 2017, at 04:49 PM, Chris Wilcox wrote: > I may be able to help jump-start this a bit and provide a platform for > this to run on. I deployed a small service that scans PyPI to figure > out statistics on Python 2 vs Python 3 support using PyPI Classifiers. > T

Re: [Distutils] Indexing modules in Python distributions

2017-02-07 Thread Chris Wilcox via Distutils-SIG
: Steve Dower [mailto:steve.do...@python.org] Sent: Tuesday, 7 February, 2017 6:39 To: Thomas Kluyver ; distutils-sig@python.org Cc: Chris Wilcox Subject: RE: [Distutils] Indexing modules in Python distributions I'm interested, and potentially in a position to provide funded infrastructure

Re: [Distutils] Indexing modules in Python distributions

2017-02-07 Thread Steve Dower
g to be in hooking up to those and turning it into a scan task. Cheers, Steve Top-posted from my Windows Phone -----Original Message----- From: "Thomas Kluyver" Sent: 2/7/2017 3:30 To: "distutils-sig@python.org" Subject: [Distutils] Indexing modules in Python distribut

[Distutils] Indexing modules in Python distributions

2017-02-07 Thread Thomas Kluyver
For a variety of reasons, I would like to build an index of what modules/packages are contained in which distributions ('packages') on PyPI. For instance: - Identifying requirements by static analysis of code: 'import zmq' -> requires pyzmq - Finding corresponding packages from different packaging
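
As a rough illustration of the kind of index described in this message, the sketch below maps top-level importable names to the distributions that ship them, using only the file listing of wheel archives. It is a simplified sketch under stated assumptions, not the approach the thread settled on: the helper names `top_level_modules` and `index_wheels` are made up here, only wheels are handled (no sdists, no running of setup.py), and the distribution name is taken from the wheel filename.

```python
import os
import zipfile
from pathlib import PurePosixPath

# File suffixes treated as importable code (an assumption of this sketch).
MODULE_SUFFIXES = (".py", ".so", ".pyd")


def top_level_modules(member_names):
    """Return the top-level importable names in a wheel's file listing."""
    found = set()
    for name in member_names:
        parts = PurePosixPath(name).parts
        if not parts or not name.endswith(MODULE_SUFFIXES):
            continue
        first = parts[0]
        # Wheel metadata and data directories are not importable.
        if first.endswith((".dist-info", ".data")):
            continue
        if len(parts) == 1:
            # Single-file module, e.g. six.py or an extension module.
            found.add(first.split(".")[0])
        else:
            # Anything under a directory counts as that package.
            found.add(first)
    return found


def index_wheels(wheel_paths):
    """Map each top-level module name to the distributions providing it."""
    index = {}
    for path in wheel_paths:
        # Wheel filenames start with the distribution name, then a hyphen.
        dist = os.path.basename(path).split("-")[0]
        with zipfile.ZipFile(path) as zf:
            for module in top_level_modules(zf.namelist()):
                index.setdefault(module, set()).add(dist)
    return index
```

With an index like this, the 'import zmq' example becomes a lookup: `index["zmq"]` would contain `"pyzmq"` once a pyzmq wheel has been scanned.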