On 12/30/2009 10:57 AM, [email protected] wrote:
On Dec 30, 2009, at 1:48 PM, Sebastien Douche wrote:

>  On Sun, Dec 27, 2009 at 11:47, Lennart Regebro<[email protected]>  wrote:
>
>>  Out of a total of 8522 packages on PyPI, there are 203 packages (2.4%)
>>  whose latest release provides neither a package on PyPI nor a
>>  download URL. Of these, 16 do not provide any contact data.
>
>  Hi Lennart,
>  Glad to see someone is interested in a PyPI mirror. I have one here
>  and it's a pity.
>
>  Statistics from the creation of the mirror / proxy (the goal is to
>  avoid external downloads, like an internal Debian mirror):
>  2009-12-15 21:37:20,855 DEBUG  Found (cached):         0
>  2009-12-15 21:37:20,855 DEBUG  Stored (downloaded):    15367
>  2009-12-15 21:37:20,855 DEBUG  Not found (404):        188
>  2009-12-15 21:37:20,855 DEBUG  Invalid packages:       0
>  2009-12-15 21:37:20,855 DEBUG  Invalid URLs:           54
>  2009-12-15 21:37:20,855 DEBUG  Runtime:                208m38s
>
>  The root issue (for me) is packages hosted outside PyPI: a lot of
>  broken links, broken HTML pages or stupid scripts (cf. old SourceForge).
I will provide a way of getting this data out; thanks for the heads up.

Greetings Sebastien and Steve,

A way of getting [external packages] is already implemented: `setuptools.package_index`. We use it in our internal mirror program, which we plan to open-source and perhaps also host publicly. That program also does the metadata extraction (PKG-INFO, requires.txt) and generates the index files that I mentioned earlier.
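
To make "external packages" concrete, here is a minimal sketch of the underlying idea: collect the links on a package's index page and separate those hosted on the index itself from those pointing elsewhere. This is not how `setuptools.package_index` is implemented, and it is not our mirror program. It uses only the Python standard library, and the host names, the sample page and the `split_links` helper are all hypothetical, invented for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the URL of the page itself.
            self.links.append(urljoin(self.base_url, href))


def split_links(page_url, html, index_host):
    """Return (hosted, external) link lists for one index page.

    Hypothetical helper: a link counts as "hosted" when its host name
    equals index_host, and as "external" otherwise.
    """
    collector = LinkCollector(page_url)
    collector.feed(html)
    hosted, external = [], []
    for link in collector.links:
        target = hosted if urlparse(link).hostname == index_host else external
        target.append(link)
    return hosted, external


# Hypothetical index page; the hosts and file names are made up.
SAMPLE_URL = "http://index.example/simple/foo/"
SAMPLE_HTML = """
<a href="../../packages/source/f/foo/foo-1.0.tar.gz">foo-1.0.tar.gz</a>
<a href="http://downloads.example/foo/">Download page</a>
"""

hosted, external = split_links(SAMPLE_URL, SAMPLE_HTML, "index.example")
print("hosted:", hosted)
print("external:", external)
```

A mirror can copy the hosted links directly. The external ones are where the broken links and odd download pages come from, since each of them has to be followed and checked separately.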

It is of no use to pity z3c.pypimirror or any other mirror program, because the issue is not with those programs, but with the lack of a central archive from which all sources and metadata can be reliably mirrored.

I will, once again, draw the reader's attention to the following:

[Steffen Mueller]
My thesis is that the huge success of the CPAN has been facilitated by
two factors[2]. The first is simplicity. When Jarkko Hietaniemi
originally came up with it, the CPAN was (and mostly still is) just an
FTP archive with a by-author directory structure that is mirrored many
times.
http://www.mail-archive.com/[email protected]/msg10537.html

-srid
_______________________________________________
Distutils-SIG maillist  -  [email protected]
http://mail.python.org/mailman/listinfo/distutils-sig
