At 10:48 AM 3/19/2008 -0700, Guido van Rossum wrote:
>I don't understand PyPI all that well; it seems poor design that the
>browsing via keywords is emphasized but there is no easy way to
>*search* for a keyword (the list of all packages is not emphasized
>enough on the main page -- it occurs in the side bar but not in the
>main text). I assume there's a programmatic API (XML-RPC?) but I
>haven't found it yet.
http://wiki.python.org/moin/CheeseShopXmlRpc

There's also a REST API that setuptools uses:

http://peak.telecommunity.com/DevCenter/EasyInstall#package-index-api

The API was originally designed for screen-scraping an older version of
PyPI, but that has been replaced with a "lite" version served from:

http://pypi.python.org/simple/

The "lite" version is intended for tools such as easy_install to
process, as it consists strictly of links and can be statically cached.
Zope Corp., for example, maintains a static mirror of this API to guard
against PyPI outages and slowdowns, since their buildouts can involve
huge numbers of eggs, both their own and external dependencies.

>I'd love it if you could write or point me to code that takes a
>package name and optional version and returns the URL for the source
>archive, and the type (in case it can't be guessed from the filename
>or the Content-type header).

You can probably do that with the XML-RPC API. There's a function to
get the versions of a package, given a (case-sensitive) name, and
there's a function to get information for uploaded archives, given a
name and a version. (There's a rough sketch of this a bit further
down.)

I originally intended to use it for the PEP 365 approach, but you can
get the necessary information in just one static roundtrip using the
REST (/simple) HTML API, if you're willing to parse the URLs for
version information. (The catch, of course, being that distutils
source distributions don't have unambiguously parseable filenames.)

>Hm. Why not just use the existing convention for running setup.py
>after unpacking? This works great in my experience, and has the
>advantage of having an easy fallback if you end up having to do this
>manually for whatever reason.

Because I want bootstrap-ees to be able to use the bootstrap mechanism.
For example, I expect at some point that setuptools will use other,
non-self-contained packages, and other package managers such as
zc.buildout et al also want to depend on setuptools without bundling
it.

> > * calling the bootstrap module 'bootstrap', as in 'python -m
> > bootstrap projectname optionalversion'. The module would expose an
> > API to allow it to be used programmatically as well as the command
> > line, so that bootstrapped packages can use the bootstrap process
> > to locate dependencies if they so desire. (Today's package
> > management tools, at least, are all based on setuptools, so if it's
> > not present they'll need to download that before beginning their
> > own bootstrapping process.)
>
>This sounds like going beyond bootstrapping. My vision is that you use
>the bootstrap module (with the command line you suggest above) once to
>install setuptools or the alternate package manager of your choice,
>and then you can use easy_install (or whatever alternative) to install
>the rest.

Well, I noticed that the other package managers were writing bootstrap
scripts that then download setuptools' bootstrap script and run it as
part of *their* bootstrap process... and then I got to thinking that it
sure would be nice for setuptools not to have to be a giant monolithic
download if I wanted to start using other packages in it... and that it
sure would be nice to get rid of all these bootstrap scripts
downloading other bootstrap scripts... and then I wrote PEP 365. :)

One other thing PEP 365 does for these use cases that your approach
doesn't: pkg_resources could detect whether a desired package of a
usable version was *already* installed, and skip it if so.
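For what it's worth, here's roughly what those two pieces look like
when you put them together: the "is a usable version already
installed?" check via pkg_resources, plus the "where's the sdist, and
what type is it?" lookup via the XML-RPC methods mentioned above. This
is an untested sketch; the function name is mine, and the XML-RPC
method and field names are the ones documented on the CheeseShopXmlRpc
page, so treat them as assumptions rather than a frozen interface:

import xmlrpclib
import pkg_resources

def locate_sdist(spec):
    """Return None if `spec` (e.g. "SQLObject>=0.10") is already
    satisfied by an installed distribution; otherwise return
    (url, filename) for a matching source archive uploaded to PyPI."""
    req = pkg_resources.Requirement.parse(spec)
    try:
        pkg_resources.require(str(req))
        return None         # a usable version is already installed
    except (pkg_resources.DistributionNotFound,
            pkg_resources.VersionConflict):
        pass
    server = xmlrpclib.ServerProxy('http://pypi.python.org/pypi')
    # package_releases() wants the case-sensitive project name
    for version in server.package_releases(req.project_name):
        if version not in req:
            continue        # doesn't satisfy the requested version range
        for info in server.release_urls(req.project_name, version):
            if info.get('packagetype') == 'sdist':
                return info['url'], info['filename']
    raise LookupError("no suitable sdist found for %r" % spec)

Note that release_urls() reports the package type explicitly, which
sidesteps the "guess it from the filename or Content-type" problem for
files actually uploaded to PyPI; it can't help with archives that are
merely linked from a project's description, of course.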
So, we've already scaled back the intended use cases quite a bit, as
people will have to write their own "is it already there?" and "is it
the right version?" checks.

> > Without one or the other, the bootstrap tool would have to grow a
> > version parsing scheme of some type, and play guessing games with
> > file extensions. (Which is one reason I limited PEP 365's scope to
> > downloading eggs actually *uploaded* to PyPI, rather than arbitrary
> > packages *linked* from PyPI.)
>
>There are two version parsers in distutils, referenced by PEP 345, the
>PyPI 1.2 metadata standard.

Yes, and StrictVersion doesn't parse release candidates. And neither
LooseVersion nor StrictVersion handles multiple pre/post-release tags
correctly (e.g. "1.1a1dev-r2753").

> > So, if I had to propose something right now, I would be inclined
> > to propose:
> >
> > * using setuptools' version parsing semantics for interpretation of
> > alpha/beta/dev/etc. releases
>
>Can you point me to the code for this? What is its advantage over
>distutils.version?

It implements version comparison semantics that are closer to
programmer expectations. It has also been far more widely used and
exposed to more feedback. distutils.version, as far as I know, is
really only used by the PEP 345 metadata standard -- which isn't used
by *any* automated tools as far as I know, and I'm not sure how many
packages bother declaring it.

In addition to alpha/beta/candidate/dev versions, it also supports
post-release (patchlevel) tags such as svn revision or date-based tags.

Here is the code; the docstring is actually longer than the bits that
do anything:

import re

def parse_version(s):
    """Convert a version string to a chronologically-sortable key

    This is a rough cross between distutils' StrictVersion and
    LooseVersion; if you give it versions that would work with
    StrictVersion, then it behaves the same; otherwise it acts like a
    slightly-smarter LooseVersion. It is *possible* to create
    pathological version coding schemes that will fool this parser, but
    they should be very rare in practice.

    The returned value will be a tuple of strings. Numeric portions of
    the version are padded to 8 digits so they will compare
    numerically, but without relying on how numbers compare relative to
    strings. Dots are dropped, but dashes are retained. Trailing zeros
    between alpha segments or dashes are suppressed, so that e.g.
    "2.4.0" is considered the same as "2.4". Alphanumeric parts are
    lower-cased.

    The algorithm assumes that strings like "-" and any alpha string
    that alphabetically follows "final" represent a "patch level". So,
    "2.4-1" is assumed to be a branch or patch of "2.4", and therefore
    "2.4.1" is considered newer than "2.4-1", which in turn is newer
    than "2.4".

    Strings like "a", "b", "c", "alpha", "beta", "candidate" and so on
    (that come before "final" alphabetically) are assumed to be
    pre-release versions, so that the version "2.4" is considered newer
    than "2.4a1".

    Finally, to handle miscellaneous cases, the strings "pre",
    "preview", and "rc" are treated as if they were "c", i.e. as though
    they were release candidates, and therefore are not as new as a
    version string that does not contain them, and "dev" is replaced
    with an '@' so that it sorts lower than any other pre-release tag.
    """
    parts = []
    for part in _parse_version_parts(s.lower()):
        if part.startswith('*'):
            if part < '*final':   # remove '-' before a prerelease tag
                while parts and parts[-1] == '*final-':
                    parts.pop()
            # remove trailing zeros from each series of numeric parts
            while parts and parts[-1] == '00000000':
                parts.pop()
        parts.append(part)
    return tuple(parts)

component_re = re.compile(r'(\d+ | [a-z]+ | \.| -)', re.VERBOSE)
replace = {'pre':'c', 'preview':'c', '-':'final-', 'rc':'c',
           'dev':'@'}.get

def _parse_version_parts(s):
    for part in component_re.split(s):
        part = replace(part, part)
        if not part or part == '.':
            continue
        if part[:1] in '0123456789':
            yield part.zfill(8)    # pad for numeric comparison
        else:
            yield '*'+part
    yield '*final'  # ensure that alpha/beta/candidate are before final

To check a parse_version() value for stability, you can just loop over
it looking for any part < "*foo", where "foo" is the desired minimum
stability. That is, if you find a '*a' and you don't want alphas, then
this version's no good. This also lets you distinguish a beta that you
might accept from an in-development snapshot of a beta that you
wouldn't. (A concrete sketch of such a loop appears further down.)

>What's wrong with just running "setup.py install"? I'd rather continue
>existing standards / conventions. Of course, it won't work when
>setup.py requires setuptools;

Actually, it will, if the setup script uses the current ez_setup
bootstrapping method for setuptools. However, I'd like to get *rid* of
that bootstrapping method and replace it with this one. That's why I'd
prefer that the bootstrap approach use a different entry point for
launching, why I want the module to expose an API, and why I don't
really want the bootstrapper to actually "install" anything.

For one thing, it means dealing with installation *options*. Your
prototype doesn't pass through any command-line options to the script,
so people would have to use a ~/.pydistutils.cfg file in order to
control the installation options, for example. (Which can then break
if the packager included a setup.cfg that was supposed to be
overridden on the command line...)

Probably this seems a lot more messy to me because I've had my face
directly planted in the mess for a number of years now, and I know
that, for example, people bitched and moaned excessively about not
being able to use --prefix with easy_install, the way they could with
'setup.py install'. And maybe my experiences aren't all relevant here;
I'm just not very good at turning them off.

My skepticism for the setup.py-based approach is at close to "new
scheme for removing the GIL" level, because I've gone through a lot of
pain to get easy_install from the stage where it looked a lot like
your bootstrap prototype to something that actually works, most of the
time, for arbitrary distutils packages. :) And unfortunately, some of
the hurdles will require a few release cycles to show up.

And hey, if you're okay with that, cool. I just think that as soon as
it gets out in the field, people will use it far outside anything we
expect it to be used for, and if there's not a bright line for the
*packager* to cross, I think we'll have people unhappy with the tool.
If you have to do a special step to make something bootstrappable,
then when the tool doesn't work, the user will ask the packager to
take the special step. However, if the tool allows the user to *point*
it at any package, and it randomly (from the user's POV) fails, then
the tool (and Python) will be blamed for the failure.
That blame will come because, even though the bootstrap tool is "not a
package manager", if it's close enough to look like "a simpler
easy_install", people will try to use it as one, and blog about how
bootstrap is broken and should support installation options, etc.

(I suppose at this point easy_install is something of a
counter-example to this worry; people can and do now give packagers
patches to make their setup scripts more compatible with easy_install,
in cases where the package does extensive distutils modification.
OTOH, easy_install is a de facto standard, where bootstrap will be de
jure. What does that mean in practice? Heck if I know. :) I guess
people will hate on you instead of me, then, so maybe I should view
that as an improvement. :)

(It also makes it easier to understand your reluctance to be in any
way associated with eggs, but there's a big difference between eggs
and easy_install, and IMO your approach leans more towards the
relative vices of easy_install than the relative virtues of eggs. But
oh well.))