Re: multiple pythons and the default
Bruce Sass wrote:

> /usr/bin/python provided by the python package. Right now it's 2.3.5. So it is arbitrary, as in there is no technical reason which makes 2.3.5 most suitable.

That impression is incorrect. There was a technical reason when the default was defined: it was the most recent version at that time. The next default will have the same property: it will be the most recent release. So the decision about which Python version is the default is *not* arbitrary.

> Therefore it should be possible to choose any Python as the default so long as the dependencies of any package depending on the official default Python can be satisfied, and any problem encountered in doing so would be problems with the implementation of a default.

That conclusion is false, or at least misleading. A package depending on the default version might not just depend on other packages that the default Python would have to provide - it also might depend on the specific behavior of the default Python version. IOW, when the default Python version changes, some applications may break, because they have not been ported to that other version.

To minimize the breakage, it is desirable that the default Python version changes rarely (so that Python applications don't need to get ported to a different version that often), and that the default only ever changes to newer versions (so that applications never need to get backported, only forward-ported).

> Debian's support for multiple interpreters should be more than a convenient apt-get install some other Python interpreter, it should be the infrastructure necessary to manage multiple Pythons. Consider that if the system is designed so that an admin can easily change the default Python, then Debian can also.

What system is designed so that an admin can easily change the default Python? An admin might break his installation by changing the default; his users will blame him for doing that. Debian shouldn't break the users' systems so lightly.
> If a package depends on Python-2.4 then it should actually depend on python2.4 and not some other package which just happens to pull in the necessary interpreter...

Why? This will give you many unnecessary hard-coded package dependencies. Packages that are reasonably expected to work with this current version and any future version should depend on the default Python.

Regards,
Martin

--
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
Re: when and why did python(-minimal) become essential?
Martin Michlmayr wrote:

> I definitely agree we should listen to the Python community,

Well, my *personal* view is this: I agree that it is highly desirable that the python package is the entire thing, with all batteries included.

I'm uncertain what to think about offering systems that only have a minimal python, which would have python not installed, yet /usr/bin/python present. On the one hand, I think it is fair to require people to install the python package if they want Python. OTOH, it is likely also confusing to tell people that they need to install python even though /usr/bin/python is already present. I cannot guess how many support requests we would get from people who fail to install the python package.

We surely get a lot of requests from people asking why some Python program fails, just because some Linux distributions manage to install an incomplete library even though the user requested the python package of that distribution. In that category, the most frequent issue is that people cannot run distutils applications, either because the entire distutils library is missing, or because the header files are missing. The next most frequent issue is that people complain they cannot run IDLE (because Tkinter was not installed).

Regards,
Martin
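The "incomplete library" reports described above can be checked for directly. What follows is only a minimal sketch in current Python 3 syntax, not something from the thread: the helper name missing_batteries and the default list of module names are assumptions chosen for illustration (on recent Python versions distutils is no longer part of the standard library at all, so it will be reported as missing unless something else provides it).

```python
import importlib.util


def missing_batteries(names=("distutils", "tkinter", "idlelib")):
    """Return the top-level modules from `names` that cannot be located.

    Hypothetical helper: find_spec() only locates a top-level module,
    it does not import it, so this is cheap and has no side effects.
    """
    missing = []
    for name in names:
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("missing:", missing_batteries())
```

A distribution could run a check like this after installing its python package to see whether users would hit the distutils or IDLE problems mentioned above.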
Re: [Distutils] formencode as .egg in Debian ??
As for terminology, you seem to suggest using distribution where Debian uses package. So Debian package would become Debian distribution. This does not sound right, because the Debian distribution is the entire collection of packages that is released e.g. on a DVD-ROM. I'll try to use project in your sense and package in the Python sense whenever I can.

Phillip J. Eby wrote:

> An egg is a distribution of a project that is importable and can carry both standardized and individualized metadata that can be read by the pkg_resources module. There are various distribution *formats* in which an egg may be physically manifested, but the egg itself is a logical concept, not a physical one. It is therefore, as I said, not merely a distribution format. Is that any clearer?

Yes. When I said an egg, I meant a zipfile with a .egg extension, or a directory with a .egg extension. In response to

    # [...] who will quite simply need eggs for many packages.
    # If Debian doesn't provide them, the users will be forced to obtain
    # them elsewhere.

I meant Debian should provide the distributions, but not as .egg files; it should provide the distribution as a deb file. So users are provided with the project, but in a form that is not one of the three forms an egg could have.

> The contradiction in terms was that I took your meaning of package to be the same as my term project - i.e., a functional collection of Python resources. Projects that *are* eggs can't be "provided but not as eggs". They *are* eggs, so not providing them as eggs means not providing them at all.

I would expect that you can unegg a project. You can distribute the project as a collection of Python modules, not as a collection of Python resources. The Debian developer could (and I was suggesting he should) just ignore the entire egg structure, and distribute the code of the library only.

> If so, Debian should not distribute them.
> This is what I don't understand, as it has nothing to do with whether or not it is a distribution format, at least not that I can see. My statement was that eggs are not merely a distribution format; they are a logical concept that can be physically packaged in various ways, and if it's necessary to invent yet another physical layout, well, we can do that too.

Yes, but this logical concept is in the way of Debian packages/distributions (at least if done naively by the Debian developer). This is what started the entire discussion: Matthias Urlichs complained that Bob Tanner included the egg structure in the formencode Debian package/distribution. The specific initial complaints were:

- you can't use it with a simple import formencode,
- pydoc does not work on eggs.

I would add the complaint:

- it increases sys.path for no good reason.

> Which would be the same as saying you wouldn't distribute, say, setuptools itself. Setuptools is an egg, and can't function except as an egg, because it is more than a Python package. Again, an egg is some specific release of a project and its introspectable metadata.

I could rewrite setuptools to function as a regular Python package. After a shallow inspection, there aren't many places where it really needs the pkg_resources functionality for itself - I could only identify the part that locates cli.exe. As this is used on Windows only, a Debian port of setuptools could simply ignore this code.

>> It is not a distutils setup because it does not invoke distutils.core.setup.
>
> Now I really don't understand you. Line 43 of setuptools/__init__.py reads:
>
>     setup = distutils.core.setup
>
> So, how is it not invoking distutils.core.setup?

Ah, I didn't look that far. I noticed that when I replace

    from setuptools import setup

with

    from distutils.core import setup

I get warnings about package_data and extras_require, and assumed this means setup was a different function; instead, it really is the import that plays tricks here.

>> Extending distutils is fine.
>> An extension is a feature that, if not invoked, has no effect. easy_setup changes install in a way that has an effect.
>
> So do all the packages that rework install_data to be more to their liking - and there are quite a lot of them, as I discovered when I began testing easy_install.

Right. It really isn't that much about what is and is not conforming; what matters more is the practical effect on the Debian developer. If setup.py install just puts some files into some locations, and the files don't conflict with files in other Debian packages/distributions, the developer can easily package the entire thing. If setup.py install does other things, like editing an existing file, it is not so easy anymore.

>> That is not true. Usability also suffers if sys.path becomes long.
>
> How? I don't understand this.

People will often inspect sys.path to understand where Python is looking for their code. They can do so manually if sys.path fits on one or two lines of terminal output. On my system, it is now four lines, primarily
Re: [Distutils] formencode as .egg in Debian ??
Phillip J. Eby wrote:

> I was referring to how the distribution is *installed*. You don't use things directly from a deb file, they have to be installed on the system. When you install an egg, you must use one of the three forms, or the system as a whole will not function.

That depends on whether the system (pkg_resources, I assume) is used at all. If the project is just a Python library, you can install it as a Python package in site-python, not as an egg.

> Eggs that depend on the egg will not be able to find it, nor use any plugins it contains.

Not sure what an egg plugin is, so I cannot comment on that. As for other eggs finding the one: In Debian, there normally shouldn't be any need to, since there will also be a Debian package providing the other project, and then a plain import will be sufficient to find the Python package.

Of course, any usage of the pkg_resources API would break. One way to deal with that is to encourage upstream authors to have a fallback mode where they can work without pkg_resources; another is to provide a fallback implementation of pkg_resources.

> So, when I say it is a contradiction in terms to install an egg in a non-egg form, I mean that it is nonsensical to say that you have installed it, because it will be unusable (by other eggs), nonfunctional (by itself), or both.

That makes me not like the egg infrastructure: too many subtle dependencies, and you are too much forced into using the structures that the setuptools authors came up with.

Of course, the pragmatic view is just to swallow the bitter pill (is this the idiom?) and find some strategy that makes pkg_resources work, without any of the drawbacks of setuptools.

>> I would expect that you can unegg a project.
>
> For projects that make use of eggs, you expect wrong. Try it with setuptools, and you will find that it is unable to even run its own tests, because the test command is registered via an entry point.

I would have to rewrite the code, of course.
I do all registration that needs to be done in __init__.py.

> Entry points are just one kind of project metadata that can be registered; other projects like Trac and SQLObject have their own kinds of metadata as well. None of this metadata is accessible without the EGG-INFO or .egg-info directory; removing it is like removing the JavaBean metadata or the deployment descriptors from Java jars, rendering the jar useless in many contexts, despite the fact that all the code remains.

Sure, *just* removing it would be wrong. I have to replace it with Python code.

> The only projects that can be unegged, then, are ones that no egg project depends on, and which do not themselves depend on any eggs. The number of projects that are not depended on by other projects will be smaller and smaller over time, as will the number that do not depend on other eggs.

Define depends on. If this is imports, I don't see a problem with unegging the package. If the dependent package is installed, the import statement will just succeed right away.

> In essence, trying to work around the absence of egg metadata is a bottomless pit, because over time there will be an ever-increasing amount of functionality in the field that is based on the use of metadata.

That is really sad.

>> I would add the complaint: - it increases sys.path for no good reason.
>
> It is only true that it increases the length in the case of the two .egg forms, not the .egg-info form.

Ok, then I think this is what Debian should use.

> The no good reason part is an interesting opinion, although in my view it is rather narrow-minded. Being able to support multi-version importing is a very good reason indeed, as is avoiding the need for a platform-specific package management tool in order to manage Python projects.

I don't see why multi-version support necessarily requires increasing sys.path.
In the case of eggs, version dependencies are expressed explicitly in the code (through require() calls), so they essentially replace the standard Python import search algorithm. Because of that, you could have a default version inside site-packages, and additional versions elsewhere, only found when require() is called.

Regards,
Martin
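The layout suggested above (a default version on the normal path, additional versions elsewhere that are only activated on request) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the pkg_resources implementation: the require() below is a hypothetical stand-in, and VERSIONS_ROOT, install_version and the eggdemo module are invented for the demo.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical layout: one directory per (project, version), kept OFF sys.path.
VERSIONS_ROOT = tempfile.mkdtemp()


def install_version(project, version):
    """Demo helper: write a tiny module for `project` at `version`."""
    target = os.path.join(VERSIONS_ROOT, f"{project}-{version}")
    os.makedirs(target)
    with open(os.path.join(target, f"{project}.py"), "w") as fh:
        fh.write(f"VERSION = {version!r}\n")
    return target


def require(project, version):
    """Hypothetical stand-in for require(): activate exactly one version."""
    target = os.path.join(VERSIONS_ROOT, f"{project}-{version}")
    if not os.path.isdir(target):
        raise LookupError(f"{project} {version} is not installed")
    if target not in sys.path:
        sys.path.insert(0, target)  # the path grows only when asked to
    importlib.invalidate_caches()


install_version("eggdemo", "1.0")
install_version("eggdemo", "2.0")

path_len_before = len(sys.path)
require("eggdemo", "2.0")
import eggdemo

print(eggdemo.VERSION)  # prints 2.0
```

With this arrangement, programs that never call require() see an unchanged sys.path, and only the one requested version directory is added for those that do.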
Re: [Distutils] formencode as .egg in Debian ??
Phillip J. Eby wrote:

> Yes, it's true, zipfile import processing is faster than normal import processing; it is in fact one of the reasons zipfile imports were added to Python, because the zip directories are cached. A zipfile import lookup is a single dictionary lookup, whereas a directory import lookup requires multiple stat() calls. For all practical purposes, zipfiles added to sys.path are free after the initial directory read operation.

OTOH, it does add an overhead on startup, as it will have to read the TOC of all zipfiles on sys.path, at least if the module you are looking for is in the last zipfile on the path. It then also adds memory overhead, as the TOC of all files is cached in memory.

> Note that the need for a .pth is a limitation caused by the requirement to have packages importable at startup. Packages installed in multi-version or deactivated mode are only added to sys.path upon request and have no impact on startup time. Relatively few eggs *need* to be installed with a .pth file; we are simply in a transitional period where people still expect installed packages to be importable without an additional require() operation.

People will reasonably have this expectation for a Debian package. If you install a Debian package with some library, you expect the library to be usable right away.

> Finally, I think it's important to note that what Debian should or should not use isn't really relevant to Debian's users, who will quite simply need eggs for many packages. If Debian doesn't provide them, the users will be forced to obtain them elsewhere.

Debian should provide the packages, but not as eggs. For a Debian user, eggs do not add advantages, and for a Debian Developer, they only add additional hassle.

> Over time, the number of packages that users need in egg form will continue to increase, and there will be an increasing number of users wanting to know why Debian can't provide them.
> It's perfectly reasonable not to redo existing Debian packages to use eggs, but for some packages, *not* using eggs is simply not an option.

Debian developers should work with upstream authors to keep a distutils-based setup.py operational.

Regards,
Martin
Re: [Distutils] formencode as .egg in Debian ??
Phillip J. Eby wrote:

>> If you have many zipfiles on sys.path, all applications will suffer from having to read the TOC of all those zipfiles, even if they need none of them. OTOH, if you had packages inside site-python, the contents of the unused packages is simply ignored.
>
> I'm sorry, but this is, shall we say, fact challenged? .pth files' contents are added to the *end* of sys.path. This means that stdlib imports and normal site-packages imports are satisfied *before* any hypothetical overhead from .pth entries, whether they're zipfiles or directories.

Correct. I was not talking about stdlib imports. I was talking about imports satisfied from the end of sys.path, or imports resulting in ImportErrors.

> If Python never reaches the .pth entries at runtime, it will not even read the zipfile TOCs, let alone attempting to stat() for contained packages.

Correct. However, a false proposition can imply anything: Python *always* reaches the .pth entries at least once, in a typical installation, while looking for sitecustomize. This will cause a load of all zipfiles on sys.path, before site.py is done.

> Please check your facts before spreading untruths like this

I did check: I have a file a.pth in site-packages, which refers to a.zip (in the same directory), and I have an empty Python file e.py. Running

    strace -o xxx python e.py

shows, among others

    open("/usr/lib/python2.3/site-packages/a.zip", O_RDONLY|O_LARGEFILE) = 5
    ...
    read(5, "PK\3\4\n\0\0\0\0\0\202\274v3\265\267\r\16\0\0\0\16\0\0"..., 132) = 132

So a.zip is read even though the program does not contain a single import statement. What is the untruth I'm spreading?

Regards,
Martin
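The strace experiment above can be approximated from inside the interpreter. The following is a minimal sketch for a current Python 3, not a reproduction of the original Python 2.3 run: it appends a throwaway zip archive to the end of sys.path (roughly what a line in a .pth file does), performs one failing import, and then looks in sys.path_importer_cache to see whether an importer was built for the archive. The file name a.zip mirrors the message; the module names are made up.

```python
import os
import sys
import tempfile
import zipfile

# Build a small archive to stand in for the a.zip named in a.pth.
tmp = tempfile.mkdtemp()
zip_path = os.path.join(tmp, "a.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("inside_zip.py", "VALUE = 1\n")

sys.path.append(zip_path)  # .pth entries also land at the end of sys.path
try:
    # A failing import has to walk every sys.path entry, as the
    # search for a missing sitecustomize does at startup.
    import no_such_module_for_demo  # hypothetical, deliberately missing
except ImportError:
    pass
finally:
    sys.path.remove(zip_path)

finder = sys.path_importer_cache.get(zip_path)
print(type(finder).__name__)  # prints zipimporter
```

The program never imports anything from the archive, yet an importer for it exists afterwards, which is the same observation as the open() and read() calls in the strace output.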
Re: [Distutils] formencode as .egg in Debian ??
Phillip J. Eby wrote:

> This is simply not true. If you don't believe PEP 302 and site.py, measure it for yourself. The *only* addition to startup is the time to actually read the .pth file and append the entries to the list.

I did. strace shows that all zip files are loaded.

> And how often do programs attempt to import non-existing modules along performance critical paths?

Every time. At least sitecustomize is imported in most programs (except those skipping site.py), and is not present in most installations. The standard library catches ImportError about 250 times, although fewer of those expect the failure in a typical installation.

Regards,
Martin
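The claim that a failing import consults every entry on sys.path can be observed with a recording path hook. This is a sketch for a current Python 3 import system, under the assumption that no other import machinery short-circuits the search; recording_hook and the module name are invented for the demo, and the hook always declines so the regular hooks keep working.

```python
import os
import sys

consulted = []  # sys.path entries the import system asked a finder for


def recording_hook(entry):
    """Note the entry, then decline so the regular hooks still handle it."""
    consulted.append(entry)
    raise ImportError(entry)


sys.path_hooks.insert(0, recording_hook)
sys.path_importer_cache.clear()  # forget cached finders so the hooks run again
try:
    # Stands in for a sitecustomize that is not installed.
    import module_that_is_not_installed_demo  # hypothetical name
except ImportError:
    pass
finally:
    sys.path_hooks.remove(recording_hook)

# An empty string on sys.path stands for the current directory.
expected = {os.getcwd() if p == "" else p for p in sys.path if isinstance(p, str)}
all_entries_consulted = expected <= set(consulted)
print(len(consulted), "entries consulted; all of sys.path:", all_entries_consulted)
```

Every additional directory or zipfile on sys.path therefore adds one more stop to each failing import, which is the cost being argued about in this thread.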
Re: [Distutils] formencode as .egg in Debian ??
Phillip J. Eby wrote:

>> Debian should provide the packages, but not as eggs.
>
> For packages that only operate as eggs, and/or require their dependencies as eggs, you are stating a contradiction in terms. Eggs are not merely a distribution format, any more than Java .jar files are.

So I should say Debian should not provide eggs, period, since what Debian provides are packages, and eggs are not?

>> Debian developers should work with upstream authors to keep a distutils-based setup.py operational.
>
> It's perfectly operational; clearly the entire egg system is *well* within the Python runtime's intended operating parameters, as it uses only well-defined and published aspects of the Python language, API, stdlib, and build process.

I didn't say the egg system is inoperational. I said that the distutils setup is not operational for, for example, FormEncode: this uses another packaging library in setup.py, not distutils setup.

> Perhaps you have some other definition of operational in mind?

I had *distutils-based* setup.py in mind.

> As I've already stated, applying this same policy to Java libraries would be to demand that all the .class files be extracted to the filesystem and any manifest files be deleted, before Debian would consent to package them. In other words, it would be silly and pointless, because the users would then ignore the packages in favor of actual jars, because then their applications would actually work.

This is not the same. A Java .jar file is deployed by putting it on disk. For an egg, an (apparently undocumented) number of additional steps is necessary, such as editing easy-install.pth.

In Java, the drawback of course is that each user has to edit CLASSPATH to include all the jar files desired. easy_setup makes this unnecessary, but in a way unfriendly to dpkg (and I assume other Linux package formats).

Regards,
Martin
Re: [Distutils] formencode as .egg in Debian ??
Phillip J. Eby wrote:

> The only thing that occurs to me as even a possibility would be some kind of frequently-used system administration utility, like if you were going to rewrite all the bash builtin commands as Python scripts.

This whole discussion is not about whether the start time actually matters - it is about whether it is a fact or not that eggs improve the startup. Some people said they do, others said they don't, and this is just the finding-of-facts phase.

> Anyway, I'm terribly curious what Python applications exist for whom:
>
> 1. Startup time is a consideration, that
> 2. Haven't already been refactored to a long-running process.

For this, CGI scripts come to mind. Many people use them, they are often short-running, and they often get invoked frequently.

> Then why was the python##.zip entry added to sys.path in Python 2.3? My understanding was that it was added to allow Python to start faster by cutting down on extraneous stat() calls.

PEP 273 doesn't give much rationale:

    Booting
    ...
    Just as there are default directories in sys.path, there must be one or more default zip archives too.

IIRC, it was to simplify deployment, having the entire library in a single file.

Regards,
Martin
Re: formencode as .egg in Debian ??
Bob Tanner wrote:

> Note also that in many cases, the package will be a single .egg *file*, (analogous to a Java .jar file) rather than a directory, and files are preferable to directories in most cases as they make Python import processing faster.

I don't think Debian should use the egg structure. It apparently relies on building a long sys.path (even though through only a single .pth file); this adds additional costs to all import statements on startup. It gets worse if these are zipfiles, because then each import statement will have to look into each zipfile (until the import is resolved).

If there is no way to install the package directly into site-packages using the provided setup.py, I think setup.py should be modified/ignored. In the specific case of formencode, replacing the first three lines of setup.py with

    from distutils.core import setup

seems to work (except for the warning that there are unsupported options).

Regards,
Martin
Re: Python policy proposed changes
> To what cost? How many gigabytes of mirror space and bandwidth are we wasting with python2.X-libprout stuff nobody ever uses?

I don't know. What is the answer to this question? I wouldn't expect it to be more than 1GiB per mirror, though, likely much less. On i386, for example, the useless python2.[124]- packages seem to add up to 59MiB, if I counted correctly.

> Even in a situation like the current one, when we're stuck with 2.3 as the default when there's 2.4 available, there are only a few python packages which actually need the 2.4 version.

What do you mean, actually need? Every python2.3-foo package actually needs a 2.4 version. If you have only python2.3-foo installed, the import fails under python2.4:

    ~$ python2.4
    Python 2.4.1 (#2, May  5 2005, 11:32:06)
    [GCC 3.3.5 (Debian 1:3.3.5-12)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import foo
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ImportError: No module named foo

This is because python2.3-foo installed into python2.3's site-packages, so it won't be available in python2.4. You really need a separate package for 2.4.

> In this case, the policy states they should be built as python2.4-foo, until python2.4 becomes the default. That's also why modules needed by a lot of binary packages should be built as multi-binary packages, as there is a probability a script requires both modules.

This I don't understand. You mean, a script might require both python2.3-foo and python2.4-foo if foo contains an extension module?

> But I'm not talking about python-gtk here, I'm talking about those hundreds of modules actually used by zero or one binary packages. Do we need multi-binary packages for them? Compared to the waste of human and computer resources this implies, I'm pretty sure it's not worth the deal.

It's a policy decision, obviously. I wonder how many users you have interviewed or what other criteria you have used to decide what is best for the users.
IOW, even if this policy is chosen, it lacks rationale, IMO.

>> Of course, supporting versions older than the default version is rarely needed, except when there are applications that require such older versions. So when 2.4 becomes the default, only 2.4 (and perhaps 2.5) packages should be built.
>
> Don't you understand that it's even more added work to remove the legacy python 2.1 to 2.3 packages in hundreds of packages?

It is more work, but I don't understand why it is significantly more work. Maintainers just have to remove all traces of 2.1, 2.2, and 2.3 from their debian directory, and rebuild, no?

Anyway, it's hardly hundreds. I counted 194 python2.3- packages, 82 python2.2- packages, and 46 python2.1- packages. There are also 125 python2.4- packages, so the majority of the packages have already prepared for the transition.

Regards,
Martin
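The reason python2.3-foo is invisible to python2.4 is simply that each interpreter version searches its own site-packages directory. The following is a minimal simulation in current Python 3 syntax, with made-up temporary directories standing in for /usr/lib/python2.3/site-packages and /usr/lib/python2.4/site-packages; the helper site_packages and the module name foo_demo are assumptions for the demo.

```python
import importlib
import importlib.machinery
import os
import tempfile

root = tempfile.mkdtemp()


def site_packages(version):
    """Hypothetical per-version directory, like /usr/lib/pythonX.Y/site-packages."""
    path = os.path.join(root, f"python{version}", "site-packages")
    os.makedirs(path, exist_ok=True)
    return path


sp23 = site_packages("2.3")
sp24 = site_packages("2.4")

# "python2.3-foo" ships its module only into the 2.3 directory.
with open(os.path.join(sp23, "foo_demo.py"), "w") as fh:
    fh.write("# module shipped by python2.3-foo\n")

importlib.invalidate_caches()
# Search each version's directory separately, as each interpreter would.
found_by_23 = importlib.machinery.PathFinder.find_spec("foo_demo", [sp23])
found_by_24 = importlib.machinery.PathFinder.find_spec("foo_demo", [sp24])

print(found_by_23 is not None, found_by_24 is not None)  # prints True False
```

A python2.4-foo package would be the one that places the module in the second directory, which is why a separate binary package per supported version is needed.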
Re: Python policy proposed changes
Josselin Mouette wrote:

> Apart from a typo and the FSF address, the changes are about which packaging variants are mandated, recommending to provide only one python-foo package for each module, except when depending applications mandate another python version. This way, we could enforce that policy during the transition, removing hundreds of cruft python2.X-foo packages.

I don't like this policy. The python2.X-foo packages are not at all cruft; they provide a useful feature. With the multi-version build, you can support Python versions more recent than the default python version, which is useful for people who would like to use more recent versions: they don't have to rebuild all their extension modules themselves.

It also simplifies the transition from one Python version to the next: people can build and test their packages against newer versions long before Debian updates the default. That way, when the default version changes, they just have to flip a switch in the default package. This reduces the amount of work necessary in the transition phase itself.

Of course, supporting versions older than the default version is rarely needed, except when there are applications that require such older versions. So when 2.4 becomes the default, only 2.4 (and perhaps 2.5) packages should be built.

Regards,
Martin
Re: Bug#229370: python2.3: Default site.py breaks stuff
Matthias Klose wrote:

> The default /etc/python2.3/site.py specifies ascii as a system encoding. This causes errors if non-ascii characters are fed to python programs unaware of i18n/l10n issues (e.g. libglade-convert script). Please make it utf-8 (which is backwards compatible but will not cause fatal errors) or enable locales in the default site.py.

I would strongly advise against making it locale-aware - this would mean that the locale is considered in strange places, causing mojibake, and the cause of the mojibake would be difficult to find. It also means that the same program may work for some users and fail for others.

Setting it to utf-8 would work, but it would mean that Debian deviates from all other Python installations in the world. Changing it locally is somewhat recommended; such changes should be carried out through sitecustomize.py, instead of editing site.py.

The real solution is to fix the buggy applications, i.e. libglade-convert in this case.

Regards,
Martin
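"Fixing the buggy application" usually means naming the encoding at the point where bytes become text, instead of relying on an interpreter-wide default. The sketch below uses current Python 3 syntax rather than the Python 2.3 of the bug report, and the sample data is made up; it shows an ascii decode failing on non-ascii input and an explicit utf-8 decode succeeding.

```python
# Bytes as they might arrive from a file, pipe, or widget description.
data = "café".encode("utf-8")

# Treating the input as ascii fails as soon as a non-ascii byte appears.
ascii_failed = False
try:
    data.decode("ascii")
except UnicodeDecodeError:
    ascii_failed = True

# The application-level fix: state the expected encoding explicitly.
text = data.decode("utf-8")

print(ascii_failed, text)  # prints True café
```

Because the encoding is stated in the program itself, the result no longer depends on site.py, sitecustomize.py, or the user's locale.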
Re: zip archive in python search path
Torsten Landschoff wrote:

> Today I got the attached two mails. I wonder how this happens and how to fix it. Is it correct that zip archives are supported in sys.path now?

Yes, see PEP 273.

> In that case probably python-gtk needs fixing. Otherwise something in python is wicked.

No, it is behaving according to the spec.

Regards,
Martin