On 12/24/2009 12:33 AM, "Martin v. Löwis" wrote:
1/ Missing packages (eg: Twisted is not there); which is why
easy_install/pip had to resort to scraping project webpages to
guess download links. In CPAN, almost all module authors upload their
sources via PAUSE.

How do you propose to change that?

By requiring authors to upload sdists + metadata from now onwards.

'sdist upload' would upload the sdist to /packages/source and also have PyPI generate the metadata from the uploaded sdist. Eg:

  /packages/source/f/foo-0.1.tar.gz
  /packages/source/f/foo-0.1.tar.gz.PKG-INFO
  /packages/source/f/foo-0.1.tar.gz.requires.txt (optional)
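
To make the layout above concrete, here is a minimal sketch of how the server side could generate those sibling files from an uploaded sdist. The helper name `write_sidecar_metadata` is hypothetical, and the sketch assumes a gzipped tarball with PKG-INFO at the top level and, optionally, a requires.txt under an *.egg-info directory. It illustrates the proposal and does not describe anything PyPI currently does.

```python
import io
import tarfile
from pathlib import Path


def write_sidecar_metadata(sdist_path):
    """Copy PKG-INFO (and requires.txt, if present) out of an sdist into
    sibling files named <sdist>.PKG-INFO and <sdist>.requires.txt.

    Hypothetical helper illustrating the proposed layout only.
    """
    sdist_path = Path(sdist_path)
    written = []
    with tarfile.open(sdist_path, "r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            parts = Path(member.name).parts
            base = parts[-1]
            # Assumption: PKG-INFO sits at the top level (foo-0.1/PKG-INFO),
            # and requires.txt, when present, lives under *.egg-info/.
            is_pkg_info = base == "PKG-INFO" and len(parts) == 2
            is_requires = (
                base == "requires.txt"
                and len(parts) >= 2
                and parts[-2].endswith(".egg-info")
            )
            if is_pkg_info or is_requires:
                target = sdist_path.with_name(sdist_path.name + "." + base)
                target.write_bytes(tar.extractfile(member).read())
                written.append(target.name)
    return sorted(written)
```

Because the sidecar files are plain files next to the sdist, a mirror gets them for free with the same rsync run that copies the tarballs.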

If the author prefers to use the web browser to upload, then their sdist must contain setup.py and PKG-INFO (w/ at least 'name' and 'version').

I would leave the existing setup as it is, so easy_install/pip would continue to install packages like Twisted and ClientCookie that, at the moment, do not have their sdists uploaded to PyPI.

[Martin]
I think it should be the choice of the package authors whether they
upload their software to the central repository, or to their own home
page.

[Ben]
Why do you think that should continue? Some of the costs of that
inconsistency have already been described in this thread. What are the
benefits to PyPI users of this inconsistency, and are we sure that the
benefits outweigh the costs?

[Martin]
The benefits are not to the package users, clearly. Instead, they are
to the package authors, who don't have to change their release
processes (as also described in this thread).

Is it because of this benefit to package authors that we are withholding the implementation of a simple archive that would 1) simplify the tools so they do not rely on ad hoc web scraping, 2) reduce downtime for users through rsync/ftp mirroring, 3) have package sources mirrored so project owners do not have to worry about downtime of their servers, and 4) enable proliferation of third-party tools like CPAN?

2/ No metadata: When only source tarballs are stored
[pypi.python.org/packages/source/P/Pylons/], what is the reliable way to
a) get the source for latest version,

Extract a version number from each file name, and sort the versions,
then use the largest (which is 0.9.7 at the moment).

b) get the source for a particular version?

Put the version number into the file name, and access the resulting
file.

This assumes that source tarballs are named in a particular format, such as: ${name}-${version}.tar.gz .. which need not always be the case (I've come across packages whose source distribution is simply named "latest"). This is why we rely on PKG-INFO to retrieve the version.
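
As a rough illustration of the difference, the sketch below contrasts the two approaches. The function names and the file name pattern are assumptions made for this example; the only thing taken from the discussion is that PKG-INFO carries a Version field, which can be read with the standard RFC 822-style header parser.

```python
import io
import re
import tarfile
from email.parser import Parser

# Assumed naming convention: ${name}-${version}.tar.gz
SDIST_NAME = re.compile(r"^(?P<name>.+)-(?P<version>\d[^-]*)\.tar\.gz$")


def version_from_filename(filename):
    """Guess the version from the file name; None if the name does not fit."""
    match = SDIST_NAME.match(filename)
    return match.group("version") if match else None


def version_from_pkg_info(fileobj):
    """Read the Version field from the top-level PKG-INFO of a gzipped sdist."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        for member in tar.getmembers():
            top_level = member.name.count("/") == 1
            if member.isfile() and top_level and member.name.endswith("/PKG-INFO"):
                text = tar.extractfile(member).read().decode("utf-8")
                return Parser().parsestr(text)["Version"]
    return None  # no PKG-INFO found in the archive
```

For a file named "latest.tar.gz", `version_from_filename` returns None, while `version_from_pkg_info` still works as long as the archive contains a PKG-INFO. The cost is that the whole tarball has to be fetched and opened just to learn its version.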

The reason for asking the two questions above, as pointed out to Lennart in another email, is this:

"""Perhaps if I were to rephrase the question, it would be clear this time: When only source tarballs are stored [pypi.python.org/packages/source/P/Pylons/], what is the reliable way to a) get the source for the latest version (when the /P/Pylons contains multiple versions -- in other words, how do I find the latest version in the first place?), b) get the source for a particular version (**without** having to construct the filename, or do ad hoc matching with filenames to guess that Pylons-1.2.3.tar.gz corresponds to version 1.2.3)? If the answer is to do an HTTP GET first, then please see the next response. """ [emphasis added]

My next response was:

"""As the CPAN .meta example was given in the context of having a simple directory structure that can be mirrored using existing tools like rsync, what I was pointing out is the lack of such an implementation, not the functionality itself (which, as you have shown, is currently supported by doing an HTTP GET that returns XML content -- not something that is rsync-friendly). """

To explain: it is all about making the PyPI data (sdist + metadata) mirror-friendly / rsync-friendly.

The former is more of a community issue. Often Python package authors
are not using `sdist upload` (whereas this seems to be the convention in
the Perl world).

My guess is that this is enforced by the tools. If they don't upload
to PAUSE, CPAN.pm won't be able to download it.

Now, you are free to build a tool that enforces the same restriction.
I would doubt that people would use it, since it couldn't install
many packages.

My original intention is to have a simple archive that can be mirrored using rsync.

What this means is that PyPI has to serve the purpose of being a central
package repository (like CPAN) by a) disallowing mere listings (without
sources) and requiring sources to be stored in the server, b) storing
the metadata along with the sources (so anyone processing it wouldn't
have to extract the source and rely on a PKG-INFO file - which may or
may not exist).

If you want to retrieve the metadata for a specific version without
XML-RPC, you can access

http://pypi.python.org/pypi?:action=doap&name=Pylons&version=0.9.7

As pointed out above, the purpose is not to do away with XML-RPC as such, but to have a simple archive that can be mirrored in its entirety using existing tools like rsync. To facilitate this, one should be able to retrieve the metadata from the archive itself (filesystem) instead of having to do HTTP requests (via plain GET or XML-RPC).
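
A minimal sketch of what that would allow, assuming a local rsync mirror in which every sdist has a sibling `<sdist>.PKG-INFO` file as proposed earlier in this mail. The function names and the naive version ordering are assumptions for illustration only; real version schemes need a more careful comparison.

```python
from email.parser import Parser
from pathlib import Path


def version_key(version):
    """Naive sort key: numeric dot-separated parts compare as integers.

    Assumption for illustration; it does not handle pre-release tags properly.
    """
    return [(0, int(p)) if p.isdigit() else (1, p) for p in version.split(".")]


def releases_from_mirror(project_dir):
    """Map version -> sdist path, using only the *.PKG-INFO sidecar files."""
    releases = {}
    for sidecar in Path(project_dir).glob("*.PKG-INFO"):
        metadata = Parser().parsestr(sidecar.read_text(encoding="utf-8"))
        # Strip the ".PKG-INFO" suffix to get back to the sdist itself.
        sdist = sidecar.with_name(sidecar.name[: -len(".PKG-INFO")])
        releases[metadata["Version"]] = sdist
    return releases


def latest_release(project_dir):
    """Return (version, sdist path) for the highest version in the directory."""
    releases = releases_from_mirror(project_dir)
    latest = max(releases, key=version_key)
    return latest, releases[latest]
```

Neither function opens a tarball, parses a file name, or makes a network request, so both questions (latest version, and the source for a particular version) are answered from the mirrored directory alone.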

-srid
_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
http://mail.python.org/mailman/listinfo/distutils-sig
