At 02:01 PM 10/7/2005 +0200, M.-A. Lemburg wrote:
>Sorry, maybe I wasn't clear: a package builder needs
>to *build* a package (rpm, egg, .tar.gz drop in place
>archive, etc.) without the dependency checks.

bdist_egg simply builds an egg. Dependency checking is a function of
*installing* the egg, not building it.

>For the user to be able to turn off the dependency checks
>when installing an egg using an option is also an often
>needed feature.

Yes, and it has been on my to-do list for some time. However, the majority
of packages in eggs today don't have any dependencies declared anyway,
because they're not packages that use setuptools. So the option, if it
existed, wouldn't have been very useful until quite recently. In any case,
the main refactoring I needed to do before that option could be added is
done, so I'll probably add it in the next non-bugfix release.

> rpm often requires this when you want
>to install packages in different order, in automated
>installs or due to conflicts in the way different
>packages name the dependencies. I guess, eggs will
>exhibit the same problems over time.

I'm not sure I follow you here, but in any case there's nothing stopping
people from installing eggs by just dropping them in a directory on sys.path
without doing any installation steps at all. It's only if you want the egg
to be on sys.path at startup without manually munging PYTHONPATH or a .pth
file or calling require(), or if you want to install any scripts that you
need to run easy_install on the egg.

> > There is a simple trick that packagers can use to make their legacy
> > packages work as eggs: build .egg-info directories for them in the sys.path
> > directory where the package resides, so that the necessary metadata is
> > present. This does not require the use of .pth files, but it does slow
> > down the process of package discovery for things that do use pkg_resources
> > to locate their dependencies. It also still requires them to repackage
> > existing packages, but doesn't require changing the layout.
>Where would you have to put these directories and what
>do they contain ?

You put them in the directory where the unmanaged packages are installed.
At minimum, they contain a PKG-INFO file, and if the package ordinarily uses
setuptools, they should also contain whatever else the egg's EGG-INFO
directory contained. The directory name is ProjectName.egg-info, where
ProjectName is the project's name on PyPI, with non-alphanumerics condensed
by the pkg_resources.safe_name() function.

>I must admit that I haven't followed the discussions about
>these .egg-info directories. Is there a good reason not to
>use the already existing PKG-INFO files that distutils builds
>and which are used by PyPI (aka cheeseshop) ?

I don't know if there's such a reason or not, but in any case that's what we
use as part of the egg-info directories. However, we *also* allow for
unlimited metadata resources to be provided in egg-info, as this is what
allows us to carry things like plugin metadata and scripts in the egg. There
are other metadata files listing the C extensions in the package, the
"namespace packages" that the egg participates in, and so on.

>Hmm, you seem to be making things unnecessarily complicated.

That probably just means you're not familiar with the requirements. My first
post here about the issues was about this time last year, discussing
application plugins and their packaging. The use of eggs for general Python
libraries as well as plugins only came into play this January, at Bob
Ippolito's urging.

So, while there may potentially exist solutions that might be somewhat
simpler for certain kinds of Python library packaging, they don't even begin
to address the issues for application plugin packaging, which is the raison
d'etre of eggs.

Trac, for example, lets you simply drop eggs into a plugin directory in
order to use them. At some point, Chandler should be allowing this as well,
and maybe someday Zope will support it too. It's primarily for these use
cases that eggs exist; it just so happens that they make a fine way to
manage installed Python packages as well.

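
As a rough illustration of the "drop eggs into a plugin directory" idea, here
is a minimal sketch. It is not Trac's actual code: the helper name
add_plugin_eggs() is hypothetical, and it does none of the version or
dependency resolution that pkg_resources would normally perform. It only
shows that one listdir() plus a sys.path append is enough to make dropped-in
eggs importable.

```python
import os
import sys


def add_plugin_eggs(plugin_dir, path=None):
    """Append every *.egg entry found in plugin_dir to path.

    Hypothetical helper for illustration only; it performs no dependency
    or version checking. Returns the list of entries that were added.
    """
    if path is None:
        path = sys.path
    added = []
    # A single listdir() call is all the discovery this sketch needs.
    for name in sorted(os.listdir(plugin_dir)):
        if name.endswith(".egg"):
            full = os.path.join(plugin_dir, name)
            if full not in path:
                path.append(full)
                added.append(full)
    return added
```

An application would call add_plugin_eggs("/some/plugin/dir") once at
startup; the directory name here is only a placeholder.
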
>Why not just rely on the import mechanism and put all
>eggs into a common package, e.g. pythoneggs ?!
>Your EasyInstall script could then modify a file in that
>package called e.g. database.py which includes all the
>necessary information about all the installed packages
>in form of a dictionary.

You completely lost me. A major feature of eggs is that for an application
needing plugins, it can simply scan a directory of downloaded eggs and plug
them into itself. Having a required installation mechanism other than
"download the egg and put it here" breaks that.

What's more, putting them in a single package makes it impossible to have
eggs installed in more than one directory, since packages can't span
directories, at least not without using setuptools' namespace package
facility. And using that facility would mean the runtime would have to
always get imported whenever you used an egg - which is *not* required right
now unless you're using a zipped egg with a C extension in it. And even then
the runtime only gets imported if you actually try to import the C
extension. So, it seems to me your approach creates more I/O overhead for
using installed packages.

Finally, don't forget that eggs allow simultaneous installation of multiple
versions of a package. So, you'd *still* have to have sys.path manipulation.

>This would have the great advantage of allowing introspection
>without too much fuzz and reduces the need to search paths,
>directories and so-on which causes a lot of I/O overhead
>and slows down startup times for applications needing
>to check dependency requirements a lot.

And the disadvantage of absolutely requiring install/uninstall steps, which
is anathema.

Note that with the exception of .egg-info markers (which aren't really
intended for production use, anyway, they're a feature for deploying
packages under development without needing to build a "real" egg), eggs can
be fully introspected from their *filename* for dependency processing
purposes.

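
To make the filename point concrete, here is a minimal sketch of pulling the
project name, version, Python version and optional platform out of an egg
filename. The regular expression and the function name parse_egg_filename()
are assumptions written for this illustration, based on the usual
name-version-pyX.Y[-platform].egg layout; this is not the pattern that
pkg_resources itself uses.

```python
import re

# Assumed layout: name-version[-pyX.Y[-platform]].egg
# Name and version are assumed to contain no "-" characters.
EGG_FILENAME = re.compile(
    r"^(?P<name>[^-]+)-(?P<version>[^-]+)"
    r"(?:-py(?P<pyver>[^-]+)(?:-(?P<platform>.+))?)?\.egg$"
)


def parse_egg_filename(filename):
    """Return a dict of the filename's parts, or None if it is not an egg.

    Fields missing from the filename (for example the platform of a
    "pure" egg) are reported as None.
    """
    match = EGG_FILENAME.match(filename)
    if match is None:
        return None
    return match.groupdict()
```

For a hypothetical pure egg named FooBar-1.2-py2.4.egg the platform field
comes back as None, while FooBar-1.2-py2.4-win32.egg reports "win32". No
file inside the egg has to be opened to learn any of this.
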
So, if the needed eggs are all on sys.path already, no additional I/O gets
done. Identifying all the eggs available in a given directory is one
listdir() operation, but it only happens if a suitable package isn't already
on sys.path, and the listdir()s happen at most once during a given instance
of dependency processing.

> >>Please make sure that your eggs catch all possible
> >>Python binary build dimensions:
> >>
> >>* Python version
> >>* Python Unicode variant (UCS2, UCS4)
> >>* OS name
> >>* OS version
> >>* Platform architecture (e.g. 32-bit vs. 64-bit)
> >
> >
> > As far as I know, all of this except the Unicode variant is captured in
> > distutils' get_platform(). And if it's not, it should be, since it affects
> > any other kind of bdist mechanism.
>Agreed.
>
>So you use get_platform() for the egg names ?

Yes - except on Mac OS X, which has a changed platform string.

> >>and please also make this scheme extendable, so that
> >>it is easy to add more dimensions should they become
> >>necessary in the future.
> >
> > It's extensible by changing the get_platform() and compatible_platform()
> > functions in pkg_resources.
>Ah, that's monkey patching. Isn't there some better way ?

Well, my presumption here is that we're going to get the scheme right for
Python at large, and make it standard. Are you saying that some packages
should have their own scheme? That's not really workable since in order to
import the package and use its scheme, we would have to first know that the
package was compatible!

> > If you have suggestions, please make them known, and let's get them into
> > the distutils in general, not just our own offshoots thereof.
>This is what we use:
>
>def py_version(unicode_aware=1, include_patchlevel=0):
>
>[snip]
>The result is a build system that can be used to build
>all binaries for a single platform without getting
>conflicts and binaries that include a proper platform
>string, e.g.
>
>egenix-mxodbc-zopeda-1.0.9.darwin-8.2.0-Power_Macintosh-py2.3_ucs2.zip
>egenix-mxodbc-zopeda-1.0.9.linux-i686-py2.3_ucs2.zip
>egenix-mxodbc-zopeda-1.0.9.linux-i686-py2.3_ucs4.zip

eggs put the Python version before the platform, because "pure" eggs that
don't contain any C code don't include the platform string. We also don't
have a UCS flag, but if we did it should be part of the platform string
rather than the Python version, since "pure" eggs don't care about the UCS
mode, and even if they did, that'd be a requirement of the package rather
than the egg itself being platform specific.

> > A single .pth file is certainly an option, and it's what easy_install
> > itself uses.
>Fair enough.
>
>Could this be enforced and maybe also removed
>completely by telling people to add the egg directory to
>PYTHONPATH ?

If by "egg directory" you mean a single .egg directory (or zipfile) for a
particular package, then yes, for that particular package you could do that.
If you mean, can you just put the directory *containing* eggs on PYTHONPATH,
then the answer is no, if you want the package to be on sys.path without any
special action taken (like calling pkg_resources.require()).

>Note that the pythonegg package approach would pretty much
>remove the need for these .pth files.

Only in the sense that it would require reinventing them in a different
form. :)

_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
http://mail.python.org/mailman/listinfo/distutils-sig