On 2010-07-27, at 9:31 AM, Sridhar Ratnakumar wrote:

> 
> On 2010-07-27, at 2:55 AM, M.-A. Lemburg wrote:
> 
>> Alexis Metaireau wrote:
>>> On Tue, 2010-07-27 at 09:52 +0100, Chris Withers wrote:
>>>> there is now a way to request release information in JSON,
>>>> see http://tinyurl.com/38lefsp
>>> That's indeed cool, and useful, but we can't rely on this while
>>> crawling, too bad this JSON is not replicated on the mirrors.
>>> 
>>> It could help a lot, since there is currently no way to request the
>>> metadata statically other than by downloading the distribution
>>> archives and extracting them. (We could also use XML-RPC, but that's
>>> not static.)
>>> 
>>> What's the process I have to follow to get this mirrored? Does
>>> that sound good to you? IOW, what's needed to make this a
>>> requirement for mirrors?
>> 
>> Easiest would be to dump the complete release information
>> (PKG-INFO) to a text file using the name format <version>_pkg_info
>> in the simple/ index.
> 
> What we ended up doing for our internal comprehensive Python package mirror 
> is this:
> 
> - for pkg in changed_since_yesterday(pypi):
>     download_source_using_easy_install(pkg)
>     - extract PKG-INFO out of source
>     - extract 'requires.txt' (if it exists) out of source
> 
> If you want to find the dependencies of a package, they can only be found in
> requires.txt (not PKG-INFO).
> 
> But then even if PKG-INFO/requires.txt is provided by /simple, keep in mind 
> that it won't be comprehensive. Not all package authors use PyPI for serving 
> their source distributions. (This is why we also had to use 
> setuptools.package_index).
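
For illustration, here is a rough sketch of the two "extract" steps quoted 
above. It is not our actual mirror code. The function names read_metadata and 
parse_requires are made up for this example, and it assumes the usual sdist 
layout: a top-level <name>-<version>/PKG-INFO, plus a setuptools-generated 
<name>.egg-info/requires.txt when the package declares dependencies.

```python
import io
import tarfile


def read_metadata(fileobj):
    """Return the text of PKG-INFO and requires.txt found in an sdist tarball.

    Hypothetical helper: assumes the common layout where PKG-INFO sits
    directly under the top-level directory and requires.txt sits inside a
    '*.egg-info' directory. Missing files are simply absent from the result.
    """
    found = {}
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            parts = member.name.split("/")
            base = parts[-1]
            if base == "PKG-INFO" and len(parts) == 2:
                key = "PKG-INFO"
            elif (base == "requires.txt" and len(parts) >= 2
                  and parts[-2].endswith(".egg-info")):
                key = "requires.txt"
            else:
                continue
            data = tar.extractfile(member).read()
            # Keep the first match only; decode leniently since old
            # packages do not always use UTF-8.
            found.setdefault(key, data.decode("utf-8", "replace"))
    return found


def parse_requires(text):
    """Split requires.txt into {section: [requirement, ...]}.

    The '' key holds unconditional requirements; '[name]' headers start
    a new section (extras).
    """
    sections = {"": []}
    current = ""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, [])
        else:
            sections[current].append(line)
    return sections
```

A caller would open the downloaded archive in binary mode, pass the file 
object to read_metadata, and treat a missing "PKG-INFO" key as a sign that 
the metadata has to be generated some other way.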

And, of course, not all source distributions include a PKG-INFO file, in which 
case it becomes mandatory to run "python setup.py egg_info" (after patching 
setup.py to do 'import setuptools') to generate it.  Twisted-10.1.0.tar.bz2 is 
one example of this.
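
As a sketch of that egg_info fallback: instead of patching setup.py on disk, 
one can import setuptools in a small shim and then execute setup.py from it. 
This is a substitute for the patching I described, not what our mirror does, 
and the names needs_egg_info, egg_info_command and generate_egg_info are 
invented for this example.

```python
import os
import subprocess
import sys

# Shim that imports setuptools first, then executes setup.py, so that the
# egg_info command exists even if setup.py itself only imports distutils.
_SHIM = (
    "import setuptools; __file__ = 'setup.py'; "
    "exec(compile(open(__file__).read(), __file__, 'exec'))"
)


def needs_egg_info(src_dir):
    """True if the unpacked source tree has no top-level PKG-INFO."""
    return not os.path.isfile(os.path.join(src_dir, "PKG-INFO"))


def egg_info_command(python=sys.executable):
    """Build the command line that runs 'setup.py egg_info' via the shim."""
    return [python, "-c", _SHIM, "egg_info"]


def generate_egg_info(src_dir):
    """Run egg_info inside src_dir when PKG-INFO is missing.

    Caution: this executes the package's setup.py, so it should only be
    run on trusted sources or inside a sandbox.
    """
    if needs_egg_info(src_dir):
        subprocess.check_call(egg_info_command(), cwd=src_dir)
```

After a successful run the generated PKG-INFO (and requires.txt, if any) 
should typically appear in a <name>.egg-info directory inside the source tree.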

There are still several other minor issues with metadata mirroring that I don't 
recall at the moment.

-srid
_______________________________________________
Catalog-SIG mailing list
[email protected]
http://mail.python.org/mailman/listinfo/catalog-sig
