Re: multiple pythons and the default

2006-05-07 Thread Martin v. Löwis
Bruce Sass wrote:
 /usr/bin/python provided by the python package. Right now it's 2.3.5.
 
 So it is arbitrary, as in there is no technical reason which makes 2.3.5
 most suitable.

That impression is incorrect. There was a technical reason when the
default was defined: it was the most recent version at that time.
The next default will have the same property: it will be the most recent
release. So the decision what Python version is the default is *not*
arbitrary.

 Therefore it should be possible to choose any Python as the
 default so long as the dependencies of any package depending on the official
 default Python can be satisfied, and any problem encountered in doing so
 would be problems with the implementation of a default.

That conclusion is false, or at least misleading. A package depending
on the default version might not just depend on other packages that the
default Python would have to provide - it also might depend on the
specific behavior of the default Python version.

IOW, when the default Python version changes, some applications may
break, because they have not been ported to that other version. To
minimize the breakage, it is desirable that the default Python version
changes rarely (so that Python applications don't need to get ported
to a different version that often), and that the default only ever
changes to newer versions (so that applications never need to get
backported, only forward).

 Debian's support for multiple interpreters should be more than a
 convenient apt-get install some other Python interpreter, it should be
 the infrastructure necessary to manage multiple Pythons. Consider that if
 the system is designed so that an admin can easily change the default
 Python, then Debian can also.

What system is designed so that an admin can easily change the default
Python?

An admin might break his installation by changing the default; his
users will blame him for doing that. Debian shouldn't break the users'
systems so lightly.

 If a package depends on Python-2.4 then it should actually depend on
 python2.4 and not some other package which just happens to pull in the
 necessary interpreter...

Why? This will give you many unnecessary hard-coded package
dependencies. Packages that are reasonably expected to work with this
current version and any future version should depend on the default
Python.

Regards,
Martin


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Re: when and why did python(-minimal) become essential?

2006-01-21 Thread Martin v. Löwis
Martin Michlmayr wrote:
 I definitely agree we should listen to the Python community,

Well, my *personal* view is this: I agree that it is highly
desirable that the python package is the entire thing, with
all batteries included. I'm uncertain what to think about
offering systems that only have a minimal python, which
would have python not installed, yet /usr/bin/python present.

On the one hand, I think it is fair to require people to install
the python package if they want Python. OTOH, it is likely
also confusing to tell people that they need to install
python even though /usr/bin/python is already present.

I cannot guess how many support requests we would get
from people who fail to install the python package.

We surely get a lot of requests from people asking
why some Python program fails, just because some Linux
distributions manage to install an incomplete library
even though the user requested the python package
of that distribution.

In that category, the most frequent issue is that
people cannot run distutils applications, either
because the entire distutils library is missing, or
because the header files are missing.

The next most frequent issue is that people complain
they cannot run IDLE (because Tkinter was not
installed).

Regards,
Martin





Re: [Distutils] formencode as .egg in Debian ??

2005-11-23 Thread Martin v. Löwis

As for terminology, you seem to suggest to use "distribution" where
Debian uses "package". So "Debian package" would become "Debian
distribution". This does not sound right, because "Debian distribution"
is the entire collection of packages that is released e.g. on a DVD-ROM.
I'll try to use "project" in your sense and "package" in the
Python sense whenever I can.

Phillip J. Eby wrote:
An egg is a distribution of a project that is importable and can 
carry both standardized and individualized metadata that can be read by 
the pkg_resources module.  There are various distribution *formats* in 
which an egg may be physically manifested, but the egg itself is a 
logical concept, not a physical one.  It is therefore, as I said, not 
merely a distribution format.  Is that any clearer?


Yes. When I said an egg, I meant a zipfile with a .egg extension,
or a directory with a .egg extension. In response to

# [...] who will quite simply need eggs for many packages.
# If Debian doesn't provide them, the users will be forced to obtain
# them elsewhere.

I meant

Debian should provide the distributions, but not as .egg files;
it should provide the distribution as a deb file. So users are provided
with the project, but in a form that is not one of the three forms
an egg could have.

The contradiction in terms was that I took your meaning of package 
to be the same as my term project - i.e., a functional collection of 
Python resources.  Projects that *are* eggs, can't be provided but not 
as eggs.  They *are* eggs, so not providing them as eggs means not 
providing them at all.


I would expect that you can unegg a project. You can distribute the
project as a collection of Python modules, not as a collection of
Python resources. The Debian developer could (and I was suggesting
he should) just ignore the entire egg structure, and distribute
the code of the library only.


 If so, Debian should not distribute them.



This is what I don't understand, as it has nothing to do with whether or
not an egg is a distribution format, at least not that I can see.  My statement was 
that eggs are not merely a distribution format; they are a logical 
concept that can be physically packaged in various ways, and if it's 
necessary to invent yet another physical layout, well, we can do that too.


Yes, but this logical concept is in the way of Debian
packages/distributions (at least if done naively by the Debian
developer). This is what started the entire discussion: Matthias
Urlichs complained that Bob Tanner included the egg structure
in the formencode Debian package/distribution.

The specific initial complaints were:
- you can't use it with a simple import formencode,
- pydoc does not work on eggs.

I would add the complaint:
- it increases sys.path for no good reason.

Which would be the same as saying you wouldn't distribute, say, 
setuptools itself.  Setuptools is an egg, and can't function except as 
an egg, because it is more than a Python package.  Again, an egg is 
some specific release of a project and its introspectable metadata.


I could rewrite setuptools to function as a regular Python package.
After a shallow inspection, there aren't many places where it really
needs the pkg_resources functionalities for itself - I could only
identify the part that locates cli.exe. As this is used on Windows
only, a Debian port of setuptools could simply ignore this code.


It is not a distutils setup because it does not invoke
distutils.core.setup.



Now I really don't understand you.  Line 43 of setuptools/__init__.py 
reads:


setup = distutils.core.setup

So, how is it not invoking distutils.core.setup?


Ah, I didn't look so far. I noticed that when I replace

from setuptools import setup

with

from distutils.core import setup

I get warnings about package_data and extras_require, and assumed this
means setup was a different function; instead, it really is the import
that plays tricks here.


Extending distutils is fine. An extension is a feature that, if not
invoked, has no effect. easy_setup changes install in a way that
has an effect.


So do all the packages that rework install_data to be more to their 
liking - and there are quite a lot of them, as I discovered when I began 
testing easy_install.


Right. It really isn't that much about what is and is not conforming;
it more matters what the practical effects on the Debian developer
are. If setup.py install just puts some files into some locations,
and the files don't conflict with files in other Debian
packages/distributions, the developer can easily package the entire
thing. If setup.py install does other things, like editing an
existing file, it is not so easy anymore.


That is not true. Usability also suffers if sys.path becomes long.



How?  I don't understand this.


People will often inspect sys.path to understand where Python
is looking for their code. They can do so manually if sys.path
fits on one or two lines of terminal output. On my system, it
is now four lines, primarily 

Re: [Distutils] formencode as .egg in Debian ??

2005-11-23 Thread Martin v. Löwis

Phillip J. Eby wrote:
I was referring to how the distribution is *installed*.  You don't use 
things directly from a deb file, they have to be installed on the 
system.  When you install an egg, you must use one of the three forms, 
or the system as a whole will not function.


That depends on whether the system (pkg_resources, I assume) is used
at all. If the project is just a Python library, you can install it
as a Python package in site-python, not as an egg.

Eggs that depend on the egg 
will not be able to find it, nor use any plugins it contains.


Not sure what an egg plugin is, so I cannot comment on that.
As for other eggs finding the one: In Debian, there normally shouldn't
be any need to, since there will also be a Debian package providing
the other project, and then a plain import will be sufficient to
find the Python package.

Of course, any usage of the pkg_resources API would break. One way
to deal with that is to encourage upstream authors to have a fallback
mode where they can work without pkg_resources; another is to provide
a fallback implementation of pkg_resources.
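
The first option, a fallback mode in the upstream code, could look
roughly like the sketch below. It is only an illustration: the helper
names plain_resource_string and load_resource, and the idea of passing
the package directory explicitly, are assumptions made for this example
and are not part of pkg_resources or of any project discussed here.

```python
import os

try:
    # Provided by setuptools when it is installed.
    from pkg_resources import resource_string as _egg_resource_string
except Exception:
    # Missing, or present but unusable on this interpreter.
    _egg_resource_string = None


def plain_resource_string(package_dir, resource_name):
    """Fallback: read a data file that sits next to the package's modules."""
    path = os.path.join(package_dir, *resource_name.split("/"))
    with open(path, "rb") as handle:
        return handle.read()


def load_resource(package_name, package_dir, resource_name):
    """Use pkg_resources when it works, otherwise read from the file system."""
    if _egg_resource_string is not None:
        try:
            return _egg_resource_string(package_name, resource_name)
        except Exception:
            # Assumed policy for this sketch: any lookup failure falls
            # through to the plain file system path below.
            pass
    return plain_resource_string(package_dir, resource_name)
```

A package would typically pass os.path.dirname(__file__) as package_dir,
so the same call works whether or not the egg metadata is installed.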

So, when I say it is a contradiction in terms to install an egg in a 
non-egg form, I mean that it is nonsensical to say that you have 
installed it, because it will be unusable (by other eggs), nonfunctional 
(by itself), or both.


That makes me not like the egg infrastructure: too many subtle
dependencies, and you are too much forced into using the structures
that the setuptools authors came up with.

Of course, the pragmatic view is just to bite the bitter pill (is
this the idiom?) and find some strategy that makes pkg_resources
work, without any of the drawbacks of setuptools.


I would expect that you can unegg a project.



For projects that make use of eggs, you expect wrong.  Try it with 
setuptools, and you will find that it is unable to even run its own 
tests, because the test command is registered via an entry point.  


I would have to rewrite the code, of course. I would do all the
registration that needs to be done in __init__.py.

Entry points are just one kind of project metadata that can be 
registered; other projects like Trac and SQLObject have their own kinds 
of metadata as well.  None of this metadata is accessible without the 
EGG-INFO or .egg-info directory; removing it is like removing the 
JavaBean metadata or the deployment descriptors from Java jars, 
rendering the jar useless in many contexts, despite the fact that all 
the code remains.


Sure, *just* removing it would be wrong. I have to replace it with
Python code.

The only projects that can be unegged, then, are ones that no egg 
project depends on, and which do not themselves depend on any eggs.  The 
number of projects that are not depended on by other projects will be 
smaller and smaller over time, as will the number that do not depend on 
other eggs.


Define depends on. If this is imports, I don't see a problem with
unegging the package. If the dependent package is installed, the import
statement will just succeed right away.

In essence, trying to work around the absence of egg metadata is a 
bottomless pit, because over time there will be an ever-increasing 
amount of functionality in the field that is based on the use of metadata.


That is really sad.


I would add the complaint:
- it increases sys.path for no good reason.



It is only true that it increases the length in the case of the two .egg 
forms, not the .egg-info form.


Ok, then I think this is what Debian should use.

The no good reason part is an interesting opinion, although in my view 
it is rather narrow-minded.  Being able to support multi-version 
importing is a very good reason indeed, as is avoiding the need for a 
platform-specific package management tool in order to manage Python 
projects.


I don't see why multi-version support necessarily requires
increasing sys.path. In the case of eggs, version dependencies are
expressed explicitly in the code (through require() calls), so
they essentially replace the standard Python import search algorithm.
Because of that, you could have a default version inside site-packages,
and additional versions elsewhere, only found when require() is
called.
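
To make the idea concrete, here is a minimal sketch of such an
on-demand lookup. It is not the pkg_resources require() function: the
signature, the versions_root argument, and the directory layout
<versions_root>/<project>-<version>/ are all invented for this example.

```python
import os
import sys


def require(project, version, versions_root):
    """Hypothetical stand-in for require(): activate one extra version.

    Assumes additional versions live in <versions_root>/<project>-<version>/,
    outside site-packages. The default version stays in site-packages and
    needs no sys.path entry of its own.
    """
    candidate = os.path.join(versions_root, "%s-%s" % (project, version))
    if not os.path.isdir(candidate):
        raise LookupError("%s %s is not installed" % (project, version))
    if candidate not in sys.path:
        # Put it first so it shadows the default copy in site-packages.
        sys.path.insert(0, candidate)
    return candidate
```

With this arrangement sys.path only grows in processes that actually
ask for a non-default version; everybody else pays nothing.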

Regards,
Martin





Re: [Distutils] formencode as .egg in Debian ??

2005-11-22 Thread Martin v. Löwis

Phillip J. Eby wrote:
Yes, it's true, zipfile import processing is faster than normal import 
processing; it is in fact one of the reasons zipfile imports were added 
to Python, because the zip directories are cached.  A zipfile import 
lookup is a single dictionary lookup, whereas a directory import lookup 
requires multiple stat() calls.  For all practical purposes, zipfiles 
added to sys.path are free after the initial directory read operation.


OTOH, it does add an overhead on startup, as it will have to read
the TOC of all zipfiles on sys.path, at least if the module you are
looking for is in the last zipfile on the path. It then also adds
memory overhead, as the TOC of all files is cached in memory.

Note that the need for a .pth is a limitation caused by the requirement 
to have packages importable at startup.  Packages installed in 
multi-version or deactivated mode are only added to sys.path upon 
request and have no impact on startup time.  Relatively few eggs *need* 
to be installed with a .pth file; we are simply in a transitional period 
where people still expect installed packages to be importable without 
an additional require() operation.


People will reasonably have this expectation for a Debian package. If
you install a Debian package with some library, you expect the library
to be usable right away.

Finally, I think it's important to note that what Debian should or 
should not use isn't really relevant to Debian's users, who will quite 
simply need eggs for many packages.  If Debian doesn't provide them, the 
users will be forced to obtain them elsewhere.


Debian should provide the packages, but not as eggs. For a Debian user,
eggs do not add advantages, and for a Debian Developer, they only add
additional hassle.

Over time, the number of 
packages that users need in egg form will continue to increase, and 
there will be an increasing number of users wanting to know why Debian 
can't provide them.  It's perfectly reasonable not to redo existing 
Debian packages to use eggs, but for some packages, *not* using eggs is 
simply not an option.


Debian developers should work with upstream authors to keep a
distutils-based setup.py operational.

Regards,
Martin





Re: [Distutils] formencode as .egg in Debian ??

2005-11-22 Thread Martin v. Löwis

Phillip J. Eby wrote:

If you have many zipfiles on sys.path, all applications will suffer
from having to read the TOC of all those zipfiles, even if they need
none of them. OTOH, if you had packages inside site-python, the
contents of the unused packages is simply ignored.



I'm sorry, but this is, shall we say, fact challenged?  .pth files' 
contents are added to the *end* of sys.path.  This means that stdlib 
imports and normal site-packages imports are satisfied *before* any 
hypothetical overhead from .pth entries, whether they're zipfiles or 
directories.


Correct. I was not talking about stdlib imports. I was talking about
imports satisfied from the end of sys.path, or imports resulting in
ImportErrors.

If Python never reaches the .pth entries at runtime, it 
will not even read the zipfile TOCs, let alone attempting to stat() for 
contained packages.


Correct. However, a false premise can imply anything: Python
*always* reaches the .pth entries at least once, in a typical
installation, while looking for sitecustomize. This will cause
a load of all zipfiles on sys.path, before site.py is done.


Please check your facts before spreading untruths like this


I did check: I have a file a.pth in site-packages, which refers to
a.zip (in the same directory), and I have an empty Python file e.py.
Running

strace -o xxx python e.py

shows, among others

open("/usr/lib/python2.3/site-packages/a.zip", O_RDONLY|O_LARGEFILE) = 5
...
read(5, "PK\3\4\n\0\0\0\0\0\202\274v3\265\267\r\16\0\0\0\16\0\0"..., 132) = 132


So a.zip is read even though the program does not contain
a single import statement.
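
The same effect can be approximated from inside Python on a current
interpreter. The sketch below is not the experiment above: it assumes
Python 3, puts a freshly created a.zip on sys.path directly instead of
through a .pth file, triggers the lookup with an import of a made-up
module name, and takes an entry in sys.path_importer_cache as evidence
that the archive was opened.

```python
import importlib
import os
import shutil
import sys
import tempfile
import zipfile
import zipimport


def zip_is_consulted_on_failed_import():
    """Return True if a zip on sys.path gets opened by an import that fails."""
    workdir = tempfile.mkdtemp()
    archive = os.path.join(workdir, "a.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        # The archive holds one module that the program never asks for.
        zf.writestr("unused_module.py", "VALUE = 1\n")

    sys.path.append(archive)
    try:
        try:
            # Made-up name; the search has to walk all of sys.path to fail.
            importlib.import_module("module_that_does_not_exist_anywhere")
        except ImportError:
            pass
        importer = sys.path_importer_cache.get(archive)
        return isinstance(importer, zipimport.zipimporter)
    finally:
        sys.path.remove(archive)
        sys.path_importer_cache.pop(archive, None)
        shutil.rmtree(workdir, ignore_errors=True)
```

If the function returns True, the archive was opened and its directory
read although nothing in it was ever imported.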

What is the untruth I'm spreading?

Regards,
Martin





Re: [Distutils] formencode as .egg in Debian ??

2005-11-22 Thread Martin v. Löwis

Phillip J. Eby wrote:
This is simply not true.  If you don't believe PEP 302 and site.py, 
measure it for yourself.  The *only* addition to startup is the time to 
actually read the .pth file and append the entries to the list.


I did. strace shows that all zip files are loaded.

And how often do programs attempt to import non-existing modules along 
performance critical paths?


Every time. At least sitecustomize is imported in most programs (except
those skipping site.py), and is not present in most installations.
The standard library catches ImportError about 250 times, although
fewer expect the failure in a typical installation.

Regards,
Martin


--
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Re: [Distutils] formencode as .egg in Debian ??

2005-11-22 Thread Martin v. Löwis

Phillip J. Eby wrote:

Debian should provide the packages, but not as eggs.



For packages that only operate as eggs, and/or require their 
dependencies as eggs, you are stating a contradiction in terms.  Eggs 
are not merely a distribution format, any more than Java .jar files are.


So I should say

Debian should not provide eggs, period, since what Debian provides
are packages, and eggs are not?


Debian developers should work with upstream authors to keep a
distutils-based setup.py operational.



It's perfectly operational; clearly the entire egg system is *well* 
within the Python runtime's intended operating parameters, as it uses 
only well-defined and published aspects of the Python language, API, 
stdlib, and build process.


I didn't say the egg system is inoperational. I said that distutils
setup is not operational for, for example, FormEncode: this uses
another packaging library in setup.py, not distutils setup.

Perhaps you have some other definition of operational in mind? 


I had *distutils-based* setup.py in mind.

As I've already stated, applying this same policy to Java libraries would
be demanding that all the .class files be extracted to the filesystem
and any manifest files be deleted, before Debian would consent to 
package them.  In other words, it would be silly and pointless, because 
the users would then ignore the packages in favor of actual jars, 
because then their applications would actually work.


This is not the same. A java .jar file is deployed by putting it on
disk. For an egg, an (apparently undocumented) number of additional
steps is necessary, such as editing easy-install.pth.

In Java, the drawback of course is that each user has to edit
CLASSPATH to include all the jar files desired. easy_setup
makes this unnecessary, but in a way unfriendly to dpkg (and
I assume other Linux package formats).

Regards,
Martin





Re: [Distutils] formencode as .egg in Debian ??

2005-11-22 Thread Martin v. Löwis

Phillip J. Eby wrote:
The only thing that occurs to me as even a possibility would be some 
kind of frequently-used system administration utility, like if you were 
going to rewrite all the bash builtin commands as Python scripts.


This whole discussion is not about whether the start time actually
matters - it is about whether it is a fact or not that eggs improve
the startup. Some people said it does, others said it doesn't, and this
is just the finding-of-facts phase.

Anyway,

 I'm terribly curious what Python applications exist for whom:
 1. Startup time is a consideration, that
 2. Haven't already been refactored to a long-running process.

For this, CGI scripts come to mind. Many people use them, and they
are often short-running, and they often get invoked frequently.

Then why was the python##.zip entry added to sys.path in Python 2.3?  My 
understanding was that it was added to allow Python to start faster by 
cutting down on extraneous stat() calls.


PEP 273 doesn't give much rationale:

Booting
...
Just as there are default directories in sys.path, there must be
one or more default zip archives too.

IIRC, it was to simplify deployment, having the entire library in
a single file.

Regards,
Martin





Re: formencode as .egg in Debian ??

2005-11-21 Thread Martin v. Löwis

Bob Tanner wrote:
Note also that in many cases, the package will be a single .egg *file*, 
(analogous to a Java .jar file) rather than a directory, and files are 
preferable to directories in most cases as they make Python import 
processing faster.


I don't think Debian should use the egg structure. It apparently relies
on building a long sys.path (even though through only a single .pth
file); this adds additional costs to all import statements on startup.
It gets worse if these are zipfiles, because then each import statement
will have to look into each zipfile (until the import is resolved).

If there is no way to install the package directly into site-packages
using the provided setup.py, I think setup.py should be
modified/ignored.

In the specific case of formencode, replacing the first three lines
of setup.py with

from distutils.core import setup

seems to work (except for the warning that there are unsupported
options).
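
The warning could be avoided by dropping the options that plain
distutils does not know before calling setup(). The sketch below shows
one way to do that; the helper name split_setup_kwargs is hypothetical,
and the keyword list is an assumption that is certainly not complete
(which options distutils accepts also depends on the Python version).

```python
# Keywords understood by setuptools but not by plain distutils.
# Assumed list for this sketch, not exhaustive.
SETUPTOOLS_ONLY = frozenset([
    "install_requires",
    "extras_require",
    "entry_points",
    "zip_safe",
    "namespace_packages",
    "include_package_data",
    "tests_require",
    "test_suite",
])


def split_setup_kwargs(kwargs):
    """Return (plain, dropped): options safe for distutils, and the rest."""
    plain = {k: v for k, v in kwargs.items() if k not in SETUPTOOLS_ONLY}
    dropped = {k: v for k, v in kwargs.items() if k in SETUPTOOLS_ONLY}
    return plain, dropped
```

A modified setup.py would then pass only the first dictionary to
distutils.core.setup and could log the second, so the packager sees
which egg features are being ignored.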

Regards,
Martin





Re: Python policy proposed changes

2005-10-18 Thread Martin v. Löwis

To what cost? How many gigabytes of mirror space and bandwidth are we
wasting with python2.X-libprout stuff nobody ever uses?


I don't know. What is the answer to this question? I wouldn't expect
it to be more than 1GiB per mirror, though, likely much less. On
i386, for example, the useless python2.[124]- packages
seem to add up to 59MiB, if I counted correctly.


Even in a situation like the current one, when we're stuck with 2.3 as
the default when there's 2.4 available, there are only a few python
packages which actually need the 2.4 version. 


What do you mean, actually need? Every python2.3-foo package actually
needs python2.4. If you have only python2.3-foo installed, and do

~$ python2.4
Python 2.4.1 (#2, May  5 2005, 11:32:06)
[GCC 3.3.5 (Debian 1:3.3.5-12)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import foo
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: No module named foo

This is because python2.3-foo installed into python2.3's site-packages,
so it won't be available in python2.4. You really need a separate
package for 2.4.
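
The same check can be scripted, which is handy when testing against
several interpreters. This is a rough sketch in present-day Python
(importlib.util and sysconfig did not exist in 2.3/2.4), and the two
helper names are made up for the example. Run it once per interpreter
to see which of them can find a given module.

```python
import importlib.util
import sys
import sysconfig


def describe_interpreter():
    """Return (version, site-packages dir) for the interpreter running this."""
    version = "%d.%d" % sys.version_info[:2]
    return version, sysconfig.get_paths()["purelib"]


def is_importable(module_name):
    """True if this interpreter can find a top-level module with that name."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    version, site_dir = describe_interpreter()
    print("python%s looks for pure modules in %s" % (version, site_dir))
    print("foo importable here:", is_importable("foo"))
```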


In this case, the policy
states they should be built as python2.4-foo, until python2.4 becomes
the default. That's also why modules needed by a lot of binary packages
should be built as multi-binary packages, as there is a probability a
script requires both modules.


This I don't understand. You mean, a script might require both
python2.3-foo and python2.4-foo if foo contains an extension module?


But I'm not talking about python-gtk here, I'm talking about those
hundreds of modules actually used by zero or one binary packages. Do we
need multi-binary packages for them? Compared to the waste of human and
computer resources this implies, I'm pretty sure it's not worth the
deal.


It's a policy decision, obviously. I wonder how many users you have
interviewed or what other criteria you have used to decide what is
best for the users. IOW, even if this policy is chosen, it lacks
rationale, IMO.


Of course, supporting versions older than the default version is rarely
needed, except when there are applications that require such older
versions. So when 2.4 becomes the default, only 2.4 (and perhaps 2.5)
packages should be built.



Don't you understand that it's even more added work to remove the legacy
python 2.1 to 2.3 packages in hundreds of packages ?


It is more work, but I don't understand why it is significantly more
work. Maintainers just have to remove all traces of 2.1, 2.2, and 2.3
from their debian directory, and rebuild, no?

Anyway, it's hardly hundreds of. I counted 194 python2.3- packages,
82 python2.2- packages, and 46 python2.1- packages. There are also
125 python2.4- packages, so the majority of the packages have already
been prepared for the transition.

Regards,
Martin





Re: Python policy proposed changes

2005-10-17 Thread Martin v. Löwis

Josselin Mouette wrote:

Apart from a typo and the FSF address, the changes are about which
packaging variants are mandated, recommending to provide only one
python-foo package for each module, except when depending applications
mandate another python version.

This way, we could enforce that policy during the transition, removing
hundreds of cruft python2.X-foo packages.


I don't like this policy. The python2.X-foo packages are not at all
cruft; they provide a useful feature.

With the multi-version build, you can support Python versions more
recent than the default python version, which is useful for people
who would like to use more recent versions: they don't have to rebuild
all their extension modules themselves.

It also simplifies the transition from one Python version to the next:
people can build and test their packages against newer versions long
before Debian updates the default. That way, when the default version
changes, they just have to turn a switch in the default package. This
reduces the amount of work necessary in the transition phase itself.

Of course, supporting versions older than the default version is rarely
needed, except when there are applications that require such older
versions. So when 2.4 becomes the default, only 2.4 (and perhaps 2.5)
packages should be built.

Regards,
Martin





Re: Bug#229370: python2.3: Default site.py breaks stuff

2004-01-25 Thread Martin v. Löwis
Matthias Klose wrote:
 The default /etc/python2.3/site.py specifies ascii as a system
 encoding. This causes errors if non-ascii characters are fed to
 python programs unaware of i18n/l10n issues (e.g. libglade-convert
 script). Please make utf-8 (which is backwards compatible but will not
 cause fatal errors) or enable locales in default site.py.

I would strongly advise against making it locale-aware - this would
mean that locale is considered in strange places, causing mojibake,
and the cause of the mojibake would be difficult to find. It also
means that the same program may work for some users and fail for
others.

Setting it to utf-8 would work, but it would mean that Debian
deviates from all other Python installations in the world.

Changing it locally is somewhat recommended; such changes should
be carried out through sitecustomize.py, instead of editing site.py.

The real solution is to fix the buggy applications, i.e.
libglade-convert in this case.
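
In general terms, fixing an application means decoding its input
explicitly instead of relying on the interpreter's default encoding.
The following is only a sketch of that idea in present-day Python; it
is not the actual libglade-convert fix, and the helper name to_text and
the choice of utf-8 as the default argument are assumptions.

```python
def to_text(raw, encoding="utf-8"):
    """Decode bytes with an explicitly named encoding; pass text through."""
    if isinstance(raw, bytes):
        # An undecodable byte raises UnicodeDecodeError here, at the place
        # where the input enters the program, not somewhere deep inside.
        return raw.decode(encoding)
    return raw
```

The caller names the encoding of each input (file, command line,
network), so the program behaves the same for every user and locale.
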
Regards,
Martin



Re: zip archive in python search path

2003-09-11 Thread Martin v. Löwis
Torsten Landschoff wrote:
 Today I got the attached two mails. I wonder how this happens and how to
 fix it. Is it correct that zip archives are supported in sys.path now?

Yes, see PEP 273.

 In that case probably python-gtk needs fixing. Otherwise something in
 python is wicked.

No, it is behaving according to the spec.

Regards,
Martin