Re: [Python-Dev] PyString - PyBytes C API renaming (Stabilizing the C API of 2.6 and 3.0)

2008-05-28 Thread M.-A. Lemburg

On 2008-05-28 14:02, Christian Heimes wrote:

M.-A. Lemburg wrote:

I have a feeling that we should be looking for better merge
tools, rather than implement code changes that cause more trouble
than do good, just because our existing tools aren't smart
enough.


We don't have better tools at our hands. I don't think we'll get any
tools in time or change the VCS right before a major release.


Wouldn't it be possible to have a 2to3.py converter
take the 2.x code (including the C code), convert it and then
apply any changes to the 3.x branch ?


Such a converter would be nice for 3rd party code but it's not an option
for the core. In the past few months I've merged a lot of code from
trunk to py3k. A 2to3 C converter doesn't help with merge conflicts.
Naming differences make any merge more painful.


I was suggesting to not use SVN to merge changes directly, but to
instead use an intermediate step in the process:

Init:

 1. grab the latest trunk

 2. apply a 2to3 converter to the Python code and the C code,
applying any renaming that may be necessary

 3. save this converted version in a separate branch merge-branch

Update:

 1. checkout the merge-branch,
    grab the latest trunk and 3.x branch

 2. apply a 2to3 converter to the Python code and the C code,
applying any renaming that may be necessary

 3. copy the files over your working copy of the merge-branch

 4. create a diff on the merge-branch

 5. apply the diffs to 3.x branch, resolving any conflicts
as necessary

This doesn't require new tools (except for some C renaming
support in the 2to3 tool). It only changes the procedure.
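
A minimal sketch of the C renaming step, assuming the rename can be
expressed as a plain prefix mapping. The mapping table and the helper
name rename_c_symbols are hypothetical and not part of the 2to3 tool;
the sketch is purely textual, so it would also rewrite matches inside
comments and string literals:

```python
import re

# Hypothetical prefix mapping; the thread only names the PyString -> PyBytes rename.
RENAMES = {"PyString_": "PyBytes_"}

# Match a known prefix only at the start of an identifier.
_PREFIX = re.compile(r"\b(" + "|".join(re.escape(p) for p in RENAMES) + r")(?=\w)")


def rename_c_symbols(source):
    """Return C source text with the old API prefixes replaced by the new ones."""
    return _PREFIX.sub(lambda match: RENAMES[match.group(1)], source)


if __name__ == "__main__":
    print(rename_c_symbols("PyObject *s = PyString_FromString(text);"))
```

Running the converted trunk through such a step before diffing is what
would keep pure naming differences out of the merge conflicts.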

We'd basically follow our own suggestions with regard to porting to 3.x,
which is to make changes in the 2.x code, apply 2to3 and then
apply remaining fixes there.

I'm suggesting this, since 3.x is likely to introduce more
Python stdlib and C API changes. The process would likely also
make a lot of other changes more easily manageable and reduce
the overall merge conflicts.


I find the approach less confusing than your suggestion and my initial
idea.

I disagree on that.

Renaming old APIs to use the new names by adding a header file with
#define oldname newname is standard practice.

Renaming the old APIs in the source code and undoing the renaming
with a header file is not.


I wasn't talking about standard practice here. I talked about less
confusion for core developers. My approach doesn't split our internal
API in two.


No, but it does apply a well hidden renaming which will cause
confusion when using a debugger to trace calls in C code.

If you use PyBytes APIs, you expect to find PyBytes functions in
the libs and also set breakpoints on these.

With the renaming we don't have two sets of APIs (old and new) exposed
in the lib, like what we normally do when applying changes to API names.


And by the way it *is* a standard approach for Python. Guido told me
that the same approach was used during the 1.x to 2.0 migration.


There was no API change between 1.6 and 2.0.

You are probably talking about the great renaming between 1.4 and 1.5.
That was different, since it changed almost all C APIs in Python.
And it used the standard practice... from rename2.h in Python 1.5:

/* This file contains a bunch of #defines that make it possible to use
   old style names (e.g. object) with the new style Python source
   distribution. */

#define True Py_True
#define False Py_False
#define None Py_None

ie. #define oldname newname
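
The direction of the mapping is the whole point here. As a small
illustration, a rename header of this kind could be generated from a
table of (oldname, newname) pairs. The function name and the table are
assumptions made for the example; the three entries are the ones from
the rename2.h excerpt:

```python
def make_rename_header(renames):
    """Build '#define oldname newname' lines from (oldname, newname) pairs."""
    return "".join("#define %s %s\n" % (old, new) for old, new in renames)


# Entries taken from the rename2.h excerpt quoted in this message.
RENAMES = [("True", "Py_True"), ("False", "Py_False"), ("None", "Py_None")]

if __name__ == "__main__":
    print(make_rename_header(RENAMES), end="")
```

With this direction, old code keeps compiling against the old names,
while the library and the debugger only ever see the new ones.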


And all this, just because Subversion can't handle merging of
symbol renaming.


As I said earlier we don't have better tools at our disposal. We have to
make some compromises. Sometimes practicality beats purity.


See above.


Please discuss any changes of the 2.x code base on python-dev.

Such major changes do need more discussion and possibly a PEP as well.


In the last few months I started at least three topics about the C API
renaming. It's in the thread "2.6 and 3.0 tasks":
http://permalink.gmane.org/gmane.comp.python.devel/93016


Thanks. I stopped reading that thread after Guido's reply in

http://comments.gmane.org/gmane.comp.python.devel/92541

It would really help if subject lines were more specific.

This thread also uses a much too general subject line (which is
why I changed it).

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 28 2008)
 Python/Zope Consulting and Support ...http://www.egenix.com/
 mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/

2008-07-07: EuroPython 2008, Vilnius, Lithuania, 39 days to go

 Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! 


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht

Re: [Python-Dev] Importing bsddb 4.6.21; with or without AES encryption?

2008-05-23 Thread M.-A. Lemburg

On 2008-05-23 01:15, Bill Janssen wrote:

That's all fine, but then I'm missing the OpenSSL license and
attribution notice somewhere in the installer, the README of the
installation or elsewhere.


Good point.  We need this for both the ssl module and the hashlib
module.


FYI: I've opened ticket #2949 to track this.

--
Marc-Andre Lemburg
eGenix.com

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] buffer interface for C extensions

2008-05-22 Thread M.-A. Lemburg

On 2008-05-19 00:59, Dan Lenski wrote:

Hi all,

I've written a small C extension to submit commands to SCSI devices via 
Linux's sg_io driver (for a camera hacking project).  The extension is 
just a wrapper around a couple ioctl()'s with Pythonic exception handling 
thrown in.  One of my extension methods is called like this from python:


sg.write(fd, command[, data, timeout])

Both command and data are binary strings.  I would like to be able to use 
either a regular Python string or an array('B', ...) for these read-only 
arguments.  So I tried to use the t# argument specifier to indicate that 
these arguments could be either strings or objects that implement the
read-only buffer interface:

if (!PyArg_ParseTuple(args, "it#|t#i:write", &sg_fd, &cmd,
                      &cmdLen, &buf, &bufLen, &timeout))
    return NULL;

Now, this works fine with strings, but when I call it with an array I get 
a TypeError:


TypeError: write() argument 2 must be string or read-only character 
buffer, not array.array


So, I then tried changing t# to w# to indicate that the arguments must 
implement the /read-write/ buffer interface.  Now the array objects work, 
but when I try a string argument, I naturally get this error:


TypeError: Cannot use string as modifiable buffer

So here's what I don't understand.  Why doesn't the t# argument 
specifier support read-write buffers as well as read-only buffers?  Aren't 
read-write buffers a *superset* of read-only buffers??  Is there something 
I'm doing wrong or a quick fix to get this to work appropriately?


You should probably ask such questions on the capi-sig list.

To answer your question:

t# requires support for the read-only 8-bit character buffer interface
s# can use the read buffer interface
w# requires support for the write buffer interface

Those are two different buffer interface slots, so whether a
particular object works with t# or w# depends on whether it
implements the slots in question.

You should probably try s#, as this will also work with objects
that implement the getreadbuffer slot.

The details can be found in Python/getargs.c
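
The read-only versus writable distinction can also be observed from
Python code. The sketch below uses memoryview from Python 3, which is
not the t#/s#/w# machinery discussed here; describe_buffer is a made-up
helper name used only for this illustration:

```python
from array import array


def describe_buffer(obj):
    """Return (size in bytes, read-only flag) for an object exporting a buffer."""
    with memoryview(obj) as view:
        return view.nbytes, view.readonly


if __name__ == "__main__":
    print(describe_buffer(b"\x12\x00\x00"))           # bytes: immutable data
    print(describe_buffer(array("B", [0x12, 0, 0])))  # array: writable data
```

A function that only reads its arguments can accept both kinds of
object, which is the behaviour s# gives in the C code.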

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Importing bsddb 4.6.21; with or without AES encryption?

2008-05-22 Thread M.-A. Lemburg

On 2008-05-20 00:46, Jesus Cea wrote:

Trent Nelson wrote:
| I downloaded the source that includes AES encryption, for no reason
| other than it was first on the list.  I'm now wondering if we should
| only be importing the 'NC' source that doesn't contain any
| encryption?  Jesus, does pybsddb use any of the Berkeley DB
| encryption facilities?  Would anything break if we built the
| bsddb module without encryption?

Yes, pybsddb3 4.6.4 supports cryptography if the underlying Berkeley DB
library is crypto enabled.

In principle, you can compile BDB without crypto, and pybsddb3 should
work, but you would lose ability to open any DB formerly created using
page encryption or page checksum.

Export laws aside, we better compile with crypto :).


I hope you're only talking about the Windows build...

In any case, if you do include crypto code in the Windows installer,
please make sure that the PSF is informed, so that the proper
reporting procedure can be put in place (whatever it is nowadays
in the US).

The installer already includes the ssl module, so it's not a problem to
include crypto code in general.

BTW: AFAIK the _ssl module is built against OpenSSL. Since I couldn't
find any OpenSSL DLLs in my Python install dir and due to the size
of the _ssl.pyd, I assume that it is statically linked against OpenSSL.
That's all fine, but then I'm missing the OpenSSL license and
attribution notice somewhere in the installer, the README of the
installation or elsewhere.

Thanks,
--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Proposal: new environment variable PYTHONSTDOUTENCODING

2008-05-20 Thread M.-A. Lemburg

On 2008-05-20 10:22, Martin v. Löwis wrote:

I'd like to propose a new environment variable PYTHONSTDOUTENCODING.
This is meant to solve various problems that people had with Python
not detecting their terminal encoding correctly; it would override
any detection that Python would use for determining the encoding of
stdout (and stdin - but that's less relevant in 2.x).


How is this relevant for 2.x ?

In 2.x, stdin and stdout are just files without any io wrappers
around them.

Writing Unicode to stdout will still use the default encoding
ASCII to convert it to an 8-bit string. All other 8-bit strings
will be passed to stdout as-is.

For 3.x, I'd like to see a PYTHONSTDINENCODING, because the current
way of relying on the terminal encoding does not work well... it then
falls back to ASCII, which prevents entering e.g. German Umlauts.


In particular, setting this environment variable would also disable
the detection of whether stdout is a terminal. This is desirable
for cases as the pydev eclipse plugin, where Python currently
fails to detect that the output is a terminal (and technically,
what Eclipse provides is not a terminal, but just a pipe, as you
can't do pseudoterms in Java).

This would have the additional effect that the encoding also gets
in effect when redirecting stdout to a file. Whether or not this
is a good thing might be debatable; giving the user the control over
it (to set or clear that variable) is a good thing, IMO.

Naming contest: it probably would be the longest of the PYTHON*
variables. I would not want to call it PYTHONENCODING, or
PYTHONSTDENCODING, though, because people might infer that it
affects sys.getdefaultencoding(), which it shouldn't.
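
To make the proposed precedence concrete, here is a minimal sketch of
how the encoding for stdout might be chosen. Only the variable name
PYTHONSTDOUTENCODING comes from the proposal; the function, its
parameters and the ASCII fallback for non-terminals are assumptions for
illustration, not the actual interpreter logic:

```python
def choose_stdout_encoding(environ, is_terminal, detected, fallback="ascii"):
    """Pick an encoding for stdout.

    environ     -- mapping of environment variables
    is_terminal -- whether stdout looks like a terminal
    detected    -- encoding found by terminal/locale detection, or None
    """
    override = environ.get("PYTHONSTDOUTENCODING")
    if override:
        # The variable wins and skips the terminal check entirely,
        # so it also applies to pipes and redirected files.
        return override
    if is_terminal and detected:
        return detected
    return fallback


if __name__ == "__main__":
    # A pipe (as with the Eclipse plugin) with the override set.
    print(choose_stdout_encoding({"PYTHONSTDOUTENCODING": "utf-8"}, False, None))
```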


--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Proposal: new environment variable PYTHONSTDOUTENCODING

2008-05-20 Thread M.-A. Lemburg

On 2008-05-20 12:16, Thomas Wouters wrote:

On Tue, May 20, 2008 at 10:41 AM, M.-A. Lemburg wrote:


On 2008-05-20 10:22, Martin v. Löwis wrote:


I'd like to propose a new environment variable PYTHONSTDOUTENCODING.
This is meant to solve various problems that people had with Python
not detecting their terminal encoding correctly; it would override
any detection that Python would use for determining the encoding of
stdout (and stdin - but that's less relevant in 2.x).


How is this relevant for 2.x ?

In 2.x, stdin and stdout are just files without any io wrappers
around them.

Writing Unicode to stdout will still use the default encoding
ASCII to convert it to an 8-bit string. All other 8-bit strings
will be passed to stdout as-is.



You're forgetting about print; in Python 2.x, when stdout is connected to a
terminal, the locale settings (typically the LANG, LC_ALL and LC_CTYPE
environment variables) are taken into account when 'print' writes to
sys.stdout.


Thanks for reminding me. I had forgotten about that special case.

So sys.stdout.write(unicode) will always use the default encoding,
whereas print unicode uses the sys.stdout.encoding, correct ?

Hmm, wouldn't it be better to always use .encoding and also make
it adjustable from Python (it is adjustable from C) ?!

PYTHONSTDOUTENCODING could then provide the default to
sys.stdout.encoding.

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Proposal: new environment variable PYTHONSTDOUTENCODING

2008-05-20 Thread M.-A. Lemburg

On 2008-05-20 20:23, Martin v. Löwis wrote:

Writing Unicode to stdout will still use the default encoding
ASCII to convert it to an 8-bit string.


That's not true.


Are you sure ?

% setenv LC_ALL de_DE.utf8
% python2.5
Python 2.5 (r25:51908, May  9 2007, 00:53:06)
>>> import sys
>>> u = u'äöü'
>>> sys.stdout.write(u)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2:
ordinal not in range(128)
>>> print u
äöü


Only print will set the Py_PRINT_RAW flag to trigger the conversion from
Unicode to 8-bit strings using .encoding in PyFile_WriteObject().

If not set, the default encoding is used.

I'm not exactly sure why, since using .encoding would be useful
in all cases.
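
The two code paths can be modelled in a few lines. This is a sketch of
the behaviour described above, not the code in PyFile_WriteObject();
encode_for_stdout and its parameters are invented for the example:

```python
def encode_for_stdout(text, stream_encoding, raw, default_encoding="ascii"):
    """Encode text the way the two 2.x paths are described in this thread.

    raw=True models print (Py_PRINT_RAW set), which uses the stream's .encoding;
    raw=False models sys.stdout.write(), which uses the default encoding.
    """
    if raw and stream_encoding:
        return text.encode(stream_encoding)
    return text.encode(default_encoding)


if __name__ == "__main__":
    umlauts = u"\u00e4\u00f6\u00fc"
    print(repr(encode_for_stdout(umlauts, "utf-8", raw=True)))
    try:
        encode_for_stdout(umlauts, "utf-8", raw=False)
    except UnicodeEncodeError as exc:
        print("write() path fails: %s" % exc)
```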

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Module renaming and pickle mechanisms

2008-05-19 Thread M.-A. Lemburg

On 2008-05-18 22:24, Brett Cannon wrote:

On Sun, May 18, 2008 at 6:14 AM, Nick Coghlan wrote:

M.-A. Lemburg wrote:

Perhaps I have a misunderstanding of the reasoning behind
doing the renaming in the 2.x branch, but it appears that
the only reason is to get used to the new names. That's a
rather low priority argument in comparison to the breakage
the renaming will cause in the 2.x branch.

I think this is the key point here. The possibility of breaking pickling
compatibility never came up during the PEP 3108 discussions, so wasn't taken
into account in deciding whether or not backporting the name changes was a
good idea.

I think it's pretty clear that the code needs to be moved back into the
modules with the old names for 2.6. The only question is whether or not we
put any effort into making the new stdlib organisation usable in 2.x, or
just rely on 2to3 to fix it (note that the "increasing the common subset"
argument doesn't really apply, since you can catch the import errors in
order to try both names).


Problem with this is it makes forward-porting revisions to 3.0 a PITA.
By keeping the module names consistent between the versions merging a
revision is just a matter of ``svnmerge merge`` with the usual
3.0-specific changes. Reverting the modules back to the old name will
make forward-porting much more difficult as I don't think svn keeps
rename information around (and thus map the old name to the new name
in terms of diffs).


svnmerge is written in Python, so wouldn't it be possible to add
support for maintaining such renaming to that tool ?

I don't think that an administrative problem such as forward-
porting patches to 3.x warrants breakage in the 2.x branch.

After all, the renaming was approached for Python 3.0 and not
2.6 *because* it introduces major breakage.

AFAIR, the discussion on the stdlib-sig also didn't include the
plan to backport such changes to 2.6. Otherwise, we would have
hashed them out there.


Alexandre's idea of teaching pickle the mapping of old names to new
might be the best solution. We could have a flag to pickle that
deactivates the renaming. Otherwise we could bump the pickle version
number so that the new number doesn't do the mapping while the old
versions do the implicit module mapping.

And as Greg and Glyph have pointed out, this is a problem that might
need to be addressed in the future with some changes to our
serialization method (I have no clue how since I don't deal with
pickle very much).


It is possible to make pickle aware of the module renames, but
that doesn't solve problems with other forms of serialization
or use of the .__module__ attribute in general.
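
For pickle itself, the mapping idea can be sketched with an Unpickler
subclass (Python 3 syntax). The mapping table and the names
RenamingUnpickler and loads_with_renames are assumptions for the
example; fix_imports is switched off so that only the explicit table is
applied:

```python
import io
import pickle

# Illustrative table: module name found in the stream -> name to import instead.
MODULE_RENAMES = {"Queue": "queue"}


class RenamingUnpickler(pickle.Unpickler):
    """Unpickler that translates module names before looking up classes."""

    def find_class(self, module, name):
        return super().find_class(MODULE_RENAMES.get(module, module), name)


def loads_with_renames(data):
    """Like pickle.loads(), but applying MODULE_RENAMES to class references."""
    return RenamingUnpickler(io.BytesIO(data), fix_imports=False).load()


if __name__ == "__main__":
    # Hand-written protocol 0 stream that refers to the class Queue.Queue.
    print(loads_with_renames(b"cQueue\nQueue\n."))
```

This only covers pickle; any other code that resolves the .__module__
attribute by itself still needs the module to exist under that name.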

Why can't we just provide a from __future__ import renamed_modules
which then provides all the new name to old name mappings in
some form (e.g. module proxies or whatever) and leave the
existing modules in 2.x untouched ?

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Module renaming and pickle mechanisms

2008-05-18 Thread M.-A. Lemburg

On 2008-05-17 16:59, Alexandre Vassalotti wrote:

On Sat, May 17, 2008 at 5:05 AM, M.-A. Lemburg wrote:

I'd like to bring a potential problem to attention that is caused
by the recent module renaming approach:

Object serialization protocols like e.g. pickle usually store the
complete module path to the object class together with the object.


Thanks for bringing this up. I was aware of the problem myself, but I
hadn't yet worked out a good solution to it.



It can also happen in storage setups where Python
objects are stored using e.g. pickle, ZODB being a prominent
example. As soon as a Python 2.6 application starts writing to
such storages, Python 2.5 and lower versions will no longer be
able to read back all the data.



The opposite problem exists for Python 3.0, too. Pickle streams
written by Python 2.x applications will not be readable by Python 3.0.
And, one solution to this is to use Python 2.6 to regenerate pickle
stream.

Another solution would be to write a 2to3 pickle converter using the
pickletools module. It is surely not the most elegant or robust
solution, but it could work.


I'm not really worried much about going from 2.x to 3.x.
Breakage is allowed for that transition.

However, the case is different for going from 2.5 to 2.6. Breakage
should be avoided if at all possible.


Now, I think there's a way to solve this puzzle:

Instead of renaming the modules (e.g. Queue -> queue), we leave
the code in the existing modules and packages and instead add
the new module names and package structure with pointers and
redirects to the existing 2.5 modules.


This would certainly work for simple modules, but what about packages?
For packages, you can't use the ``sys.modules[__name__] = Queue`` to
preserve module identity. Therefore, pickle will use the new package
name when writing its streams. So, we are back to the same problem
again.

A possible solution could be writing a compatibility layer for the
Pickler class, which would map new module names to their old ones at
runtime. Again, this is neither an elegant, nor robust, solution, but
it should work in most cases.


While it's possible to fix pickle (at least the Python version),
this would not help with other serialization formats that rely
on the .__module__ attribute mapping to an existing module.

It's better to address the problem at the module level.

Perhaps I have a misunderstanding of the reasoning behind
doing the renaming in the 2.x branch, but it appears that
the only reason is to get used to the new names. That's a
rather low priority argument in comparison to the breakage
the renaming will cause in the 2.x branch.

I think it's much better to have 2to3.py do the renaming
and only add warnings to the renamed modules in 2.x
(without actually applying any renaming).

It would also be possible to seed sys.modules with module
proxy objects (see e.g. mx.Misc.LazyModule from egenix-mx-base)
which only turn into real module object if the module is
referenced.

This would allow adding a from __future__ import new_module_names
which then results in loading proxies for all renamed modules
(without actually loading the modules until they are used under
their new names).
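
A rough sketch of such a proxy, written for Python 3 and not modelled
on mx.Misc.LazyModule; the class name, the alias name new_style_queue
and the use of importlib are assumptions for illustration:

```python
import importlib
import sys
import types


class LazyModuleProxy(types.ModuleType):
    """Stand-in for an alias name that loads the real module on first use."""

    def __init__(self, alias, target):
        super().__init__(alias)
        self.__dict__["_target"] = target

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails, i.e. on first real use.
        module = importlib.import_module(self.__dict__["_target"])
        # Replace the proxy so later imports get the real module directly.
        sys.modules[self.__name__] = module
        return getattr(module, name)


# Register a hypothetical alias; the target module is not loaded by this line.
sys.modules["new_style_queue"] = LazyModuleProxy("new_style_queue", "queue")

if __name__ == "__main__":
    import new_style_queue
    print(new_style_queue.Queue.__module__)
```

Because the classes still live in the original module, their
.__module__ attribute keeps the existing name and serialized data stays
readable by older versions.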

--
Marc-Andre Lemburg
eGenix.com



[Python-Dev] Module renaming and pickle mechanisms

2008-05-17 Thread M.-A. Lemburg

I'd like to bring a potential problem to attention that is caused
by the recent module renaming approach:

Object serialization protocols like e.g. pickle usually store the
complete module path to the object class together with the object.

They access this module path by looking at the __module__ attribute
of the object classes.

With the renaming, all objects which use classes from the renamed
modules will now refer to the renamed modules in their serialized
form, e.g. queue.Queue instead of Queue.Queue (just to name one
example).
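
The module path stored in a pickle can be inspected directly. A small
sketch using pickletools (Python 3 syntax; the helper name
referenced_globals is made up, and fix_imports=False keeps pickle from
translating module names on its own):

```python
import pickle
import pickletools
import queue


def referenced_globals(data):
    """Return the (module, name) pairs that a pickle stream refers to by name."""
    found = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            module, name = arg.split(" ", 1)
            found.append((module, name))
    return found


if __name__ == "__main__":
    # Protocol 2 stores class references as plain-text GLOBAL opcodes.
    data = pickle.dumps(queue.Queue, protocol=2, fix_imports=False)
    print(referenced_globals(data))
```

Whatever module name shows up here has to be importable on the reading
side, which is where the renaming hurts older Python versions.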

While this is nice for forward compatibility, it causes rather serious
problems for making object serialization backwards compatible, since
the older Python versions can no longer unserialize objects due
to missing modules.

This can happen in client-server setups where e.g. the server
uses Python 2.6 and the clients some other Python version (e.g.
Python 2.5).

It can also happen in storage setups where Python
objects are stored using e.g. pickle, ZODB being a prominent
example. As soon as a Python 2.6 application starts writing to
such storages, Python 2.5 and lower versions will no longer be
able to read back all the data.

Now, I think there's a way to solve this puzzle:

Instead of renaming the modules (e.g. Queue -> queue), we leave
the code in the existing modules and packages and instead add
the new module names and package structure with pointers and
redirects to the existing 2.5 modules.

Code can (and probably should) still be changed to try to import
the new module name. In cases where backwards compatibility is
needed, this can also be done using

try:
    import newname
except ImportError:
    import oldname

Later on, when porting applications to 3.0, the 2to3 script can
then apply the final renaming in the source code.

Example:

queue.py:
---------
import sys, Queue
sys.modules[__name__] = Queue

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] How best to handle the docs for a renamed module?

2008-05-16 Thread M.-A. Lemburg

On 2008-05-12 04:34, Brett Cannon wrote:

For the sake of argument, let's consider the Queue module. It is now
named queue. For 2.6 I plan on having both Queue and queue listed in
the index, with Queue deprecated with instructions to use the new
name.

But what to do about all the references. Should we leave them pointing
at Queue to lessen confusion for people who read about some module on
some other site that isn't using the new name, or update everything in
2.6 to use the new name?


How hard would it be to add redirects from the old pages to the
new ones ?

mod_rewrite does wonders - well, provided you find the right patterns...

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Distutils configparser rename

2008-05-16 Thread M.-A. Lemburg

On 2008-05-15 22:33, A.M. Kuchling wrote:

Python 2.6 renames the ConfigParser module to be configparser.

Distutils imports ConfigParser in various places.  I just made a
commit updating the import in one place, and then noticed that part
of commit r63248, which made the same change, was reverted in order to
preserve backward-compatibility.  Instead, the default path will
include lib-old again to keep the old module name available.

I suggest dropping that goal, though.  We've preserved compatibility
but I'm not aware that anyone uses the Python 2.x Distutils with
earlier versions of Python.  In particular:

* There's no standalone distutils package on PyPI, nor can I find
  such a package with a general web search.  Am I missing it?

* I do not see users advising other users to use a later version of 
  Distutils to fix their problems.


Is anyone actually benefiting from the effort of maintaining backward
compatibility?


Yes: all the folks who want to create distutils packages for more than
just the current Python version.

I've argued for this a couple of times in the past. Some background:

In order to build a Python package for a previous Python version,
you have to run distutils using that older Python version.

Now, as distutils evolves, new features are added, bugs are fixed,
etc. so as packager you always want to use the latest distutils
version available - even with older Python releases. In some cases,
e.g. PyPI registration, this may even be necessary, since the
new versions of those commands need to be kept in sync with the
PyPI repository.

Another aspect is keeping package setup.py files working.

If you need to support multiple Python versions, then your
setup.py will have to work with multiple different versions
of distutils.

Since performance doesn't really matter for distutils, it is well
possible and easy to keep compatibility with a few releases back.

This has worked great in the past and I don't see why we should
break this, as recent distutils checkins have done.

Note that Python doesn't exactly make it easy to ship Python
packages. You have several different dimensions to take into
consideration:

 * Python version
 * UCS2/UCS4
 * Platform and processor type
 * 32/64-bit

So there already is a lot of porting effort needed to support
a reasonable number of targets.

I don't think it takes a lot of effort to keep distutils
running with Python 2.3 and 2.4.

In the past I've usually rewritten parts of distutils that
were modified in incompatible ways. I haven't been able to
do that for the recent checkins that broke distutils even on
Python 2.4.

Thanks,
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 16 2008)
 Python/Zope Consulting and Support ...http://www.egenix.com/
 mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


 Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! 


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Symbolic errno values in error messages

2008-05-16 Thread M.-A. Lemburg

On 2008-05-16 16:15, Nick Coghlan wrote:

Alexander Belopolsky wrote:

Yannick Gingras ygingras at ygingras.net writes:

2) Where can I find the symbolic name in C?


Use standard C library char* strerror(int errnum) function.   You can see
an example usage in Modules/posixmodule.c (posix_strerror).


I don't believe that would provide adequate Windows support.


Well, there's still the idea of a winerror module:

http://bugs.python.org/issue1505257

Perhaps someone can pick it up and turn it into a (generated) C
module ?!
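
For reference, on the Python side the symbolic name can be recovered from the numeric code with the standard errno module, and the message text with os.strerror(). A minimal sketch; the helper name describe_errno is made up for illustration, and this does not cover the Windows error codes a winerror module would address:

```python
import errno
import os

def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno value."""
    # errno.errorcode maps numbers to names such as 'EISDIR';
    # unknown codes yield None.
    name = errno.errorcode.get(code)
    # os.strerror() wraps the C library's strerror().
    return name, os.strerror(code)

name, message = describe_errno(errno.EISDIR)
print("[%s] %s" % (name, message))
```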

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Symbolic errno values in error messages

2008-05-16 Thread M.-A. Lemburg

On 2008-05-16 17:02, Alexander Belopolsky wrote:

On Fri, May 16, 2008 at 10:52 AM, Yannick Gingras [EMAIL PROTECTED] wrote:


print e

[Errno 21] Is a directory

So now I am not sure what OP is proposing.  Do you want to replace 21
with EISDIR in the above?

Yes, that's what I had in mind.



In this case, I have a more drastic proposal.  Let's change
EnvironmentError errno attribute (myerrno in C) to string.  


-1

You never want to change an integer field to a string.


'EXYZ'
strings can be interned, which will make them more efficient than
integers for lookups and comparisons (to literals).  A half-way and
backward compatible solution would be to stick 'EXYZ' code at the end
of the args tuple and add an errnosym attribute.


Actually, you don't have to put it into any tuple. Just add it
to the error object as attribute.
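
A minimal sketch of that idea in current Python, where EnvironmentError is an alias of OSError. The attribute name errnosym is taken from the proposal above; the helper add_errno_symbol is hypothetical. The integer errno field and the args tuple stay exactly as they are:

```python
import errno
import os

def add_errno_symbol(exc):
    """Attach a symbolic errnosym attribute to an OSError instance."""
    # The integer errno attribute is left untouched, so existing
    # code that compares it to errno constants keeps working.
    exc.errnosym = errno.errorcode.get(exc.errno)
    return exc

err = add_errno_symbol(OSError(errno.EISDIR, os.strerror(errno.EISDIR)))
print(err.errno == errno.EISDIR, err.errnosym)
```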

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Addition of pyprocessing module to standard lib.

2008-05-14 Thread M.-A. Lemburg

On 2008-05-14 14:15, Jesse Noller wrote:

On Wed, May 14, 2008 at 5:45 AM, Christian Heimes [EMAIL PROTECTED] wrote:

Martin v. Löwis wrote:


I'm worried whether it's stable, what user base it has, whether users

  (other than the authors) are lobbying for inclusion. Statistically,
  it seems to be not ready yet: it is not even a year old, and has not
  reached version 1.0 yet.

 I'm on Martin's side here. Although I'd like to see some sort of multi
 processing mechanism in Python, because I need it for lots of projects, I'm
 against the inclusion of pyprocessing in 2.6 and 3.0. The project isn't
 old and mature enough and it has some competitors like pp (parallel
 processing).

 On the one hand the inclusion of a package gives it an unfair advantage
 over similar packages. On the other hand it slows down future
 development because a new feature release must be synced with Python
 releases about every 1.5 years.

 -0.5 from me

 Christian



I said this in reply to Martin - but the competitors (in my mind) are
not as compelling due to the alternative paradigm for application
construction they propose. The processing module is an easy win for
us if included.

Personally - I don't see how inclusion in the stdlib would slow down
development - yes, you have to stick with the same release cycle as
python-core, but if the module is feature complete and provides a
stable API as it stands I don't see following python-core timelines as
overly onerous.

The module itself doesn't change that frequently - the last release in
April was a bugfix release and API consistency change (the API would
have to be locked for inclusion obviously - targeting a 2.7/3.1
release may be advantageous to achieve this).


Why don't you start a parallel-sig and then hash this out with other
distributed computing users ?

You could then reach a decision by the time 2.7 is scheduled for release
and then add the chosen module to the stdlib.

The API of the processing module does look simple and nice, but
parallel processing is a minefield - esp. when it comes to handling
error situations (e.g. a worker failing, network going down, fail-over,
etc.).

What I'm missing with the processing module is a way to spawn processes
on clusters (rather than just on a single machine).

In the scientific world, MPI is the standard API of choice for doing
parallel processing, so if we're after standards, supporting MPI
would seem to be more attractive than the processing module.

http://pypi.python.org/pypi/mpi4py

In the enterprise world, you often find CORBA based solutions.

http://omniorb.sourceforge.net/

And then, of course, you have a gazillion specialized solutions
such as PyRO:

http://pyro.sourceforge.net/

OTOH, perhaps the stdlib should just include entry-level support
for some form of parallel processing, in which case processing
does look attractive.

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Tool for converting %-formatting to .format()ing ?

2008-05-10 Thread M.-A. Lemburg

On 2008-05-10 01:18, Martin v. Löwis wrote:

Is there a tool available that can convert 2.x code automagically
to the .format() method syntax ?

Just did a quick grep of our code base and it has some 2000 lines of code
that would need to be changed.


Why do you think this code needs to change?

I'd leave all the code as-is, and might not start using .format before
Python 3.2, unless some coding convention says I have to.


True, just wanted to know whether there is such a tool.

I personally like the %-notation a lot, mainly because it's more
or less the same as in C.

%i, %s and %r are by far the most used format characters in our code base.
Determining the position index and writing {0!s} or {0!r} instead
(which requires quite a finger dance on a German keyboard) doesn't
make .format() really attractive, IMHO.
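
To make the comparison concrete, here is the same output produced both ways. The variable names and values are only illustrative:

```python
name, count, items = "spam", 3, ["a", "b", "c"]

# %-notation, close to C's printf.
old_style = "%s has %i items: %r" % (name, count, items)

# The same output with explicit position indexes and conversions.
new_style = "{0!s} has {1:d} items: {2!r}".format(name, count, items)

print(old_style)
print(old_style == new_style)
```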

Perhaps you're right and it's better to wait a few rounds of
refinements of .format() before jumping on that train :-)

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] [Python-checkins] r62848 - python/trunk/Objects/setobject.c

2008-05-09 Thread M.-A. Lemburg

On 2008-05-08 13:59, Barry Warsaw wrote:

On May 8, 2008, at 7:54 AM, Benjamin Peterson wrote:


On Thu, May 8, 2008 at 6:32 AM, Barry Warsaw [EMAIL PROTECTED] wrote:

Since the trunk buildbots appear to be mostly happy (well those that are
connected anyway), and because I couldn't get the releases out last night,
I'll let this one slide.  I'd like to find a way to more forcefully enforce
commit freezes for the betas though.



I wonder if you couldn't alter the server side commit hook to reject
everything with the message "Sorry, we're in a freeze." (You'd have to
make an exception for yourself.)


This is exactly what I'm thinking about!


+1, that's easy to do with Subversion and doesn't hurt anyone.

Please also use a term like "freeze" or "frozen" in the subject line
of the announcement - perhaps even in capital letters.
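
A rough sketch of such a pre-commit hook, written in Python. Subversion passes the repository path and the transaction name to the hook, and "svnlook author -t TXN REPOS" reports who is committing. How the freeze is switched on (the frozen flag here) and the exempt account name are assumptions made for illustration:

```python
import subprocess
import sys

FREEZE_MESSAGE = "Sorry, we're in a freeze."

def check_commit(author, frozen, exempt=("release-manager",)):
    """Return (exit_code, message) for a commit attempted by author."""
    # "release-manager" is a placeholder account name.
    if frozen and author not in exempt:
        return 1, FREEZE_MESSAGE
    return 0, ""

def main(argv, frozen=True):
    # Subversion calls pre-commit with REPOS-PATH and TXN-NAME.
    repos, txn = argv[1], argv[2]
    author = subprocess.check_output(
        ["svnlook", "author", "-t", txn, repos], text=True
    ).strip()
    code, message = check_commit(author, frozen)
    if message:
        # Text written to stderr is relayed to the committing client.
        sys.stderr.write(message + "\n")
    return code
```

Installed as hooks/pre-commit, the script would end with sys.exit(main(sys.argv)); a non-zero exit status makes Subversion reject the commit.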

Thanks,
--
Marc-Andre Lemburg
eGenix.com



[Python-Dev] Tool for converting %-formatting to .format()ing ?

2008-05-09 Thread M.-A. Lemburg

Is there a tool available that can convert 2.x code automagically
to the .format() method syntax ?

Just did a quick grep of our code base and it has some 2000 lines of code
that would need to be changed.
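
To give an idea of what the core of such a tool would have to do, here is a naive sketch that rewrites a single template string. It is not an existing tool: the function name percent_to_format is made up, only %s, %r, %d, %i and %% are handled, and anything else (widths, precisions, %(name)s mappings) raises an error. Finding the templates in real source code is the harder part and is not attempted here.

```python
import re

_TOKEN = re.compile(r"%(.)|([{}])", re.DOTALL)
_CONVERSIONS = {"s": "!s", "r": "!r", "d": ":d", "i": ":d"}

def percent_to_format(template):
    """Convert a %-style template using only %s, %r, %d, %i and %%."""
    index = 0

    def replace(match):
        nonlocal index
        code, brace = match.group(1), match.group(2)
        if brace:
            return brace * 2      # literal braces must be doubled
        if code == "%":
            return "%"            # %% becomes a plain percent sign
        if code not in _CONVERSIONS:
            raise ValueError("unsupported conversion: %" + code)
        field = "{%d%s}" % (index, _CONVERSIONS[code])
        index += 1
        return field

    return _TOKEN.sub(replace, template)

print(percent_to_format("%s has %i items: %r"))
```

One known difference: "%i" accepts a float and truncates it, while "{0:d}" raises ValueError for floats, so converted code is not guaranteed to behave identically.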

Thanks,
--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Tool for converting %-formatting to .format()ing ?

2008-05-09 Thread M.-A. Lemburg

On 2008-05-09 15:29, [EMAIL PROTECTED] wrote:

mal Is there a tool available that can convert 2.x code automagically
mal to the .format() method syntax ?

mal Just did a quick grep of our code base and it has some 2000 lines
mal of code that would need to be changed.

I suggested a 2to3 fixer for this but was shot down.


Well, ideally such a tool should address 2to2 :-)

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] Reminder: last alphas next Wednesday 07-May-2008

2008-05-04 Thread M.-A. Lemburg

On 2008-05-04 18:14, Christian Heimes wrote:

First, Skip, I *only* care about the default behavior.  There's already
a way to do it differently: PYTHONPATH.  So, Fred, I think what you're
arguing for is to drop this feature entirely.  Or is there some other
use for a new way to allow users to explicitly add something to
sys.path, aside from PYTHONPATH?  It seems that it would add more
complexity and I can't see what the value would be.


PYTHONPATH is lacking one feature which is important for lots of
packages and setuptools. The directories in PYTHONPATH are just added to
sys.path. But setuptools require a site package directory. Maybe a new
env var PYTHONSITEPATH could solve the problem.


We don't need another setup variable for this. Just place a
well-known module into the site-packages/ directory and then
query its __file__ attribute, e.g.

site-packages/site_packages.py

The module could even include a few helpers to query various
settings which apply to the site packages directory, e.g.

site_packages.get_dir()
site_packages.list_packages()
site_packages.list_modules()
etc.
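
A minimal sketch of what such a site_packages.py could look like. The module does not exist; the three function names are the ones suggested above, and their behaviour (what counts as a package or a module) is an assumption:

```python
import os

def get_dir():
    """Directory this module was installed into."""
    return os.path.dirname(os.path.abspath(__file__))

def list_packages(directory=None):
    """Names of subdirectories that contain an __init__.py file."""
    directory = directory or get_dir()
    return sorted(
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name, "__init__.py"))
    )

def list_modules(directory=None):
    """Names of top-level .py modules, without the extension."""
    directory = directory or get_dir()
    return sorted(
        name[:-3] for name in os.listdir(directory)
        if name.endswith(".py")
        and os.path.isfile(os.path.join(directory, name))
    )
```

The optional directory argument is only there so the helpers can be tried against an arbitrary directory; in the scheme described above they would always inspect the directory the module lives in.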


As I've said a dozen times in this thread already, the feature I'd like
to get from a per-user installation location is that 'setup.py install',
or at least some completely canonical distutils incantation, should
work, by default, for non-root users; ideally non-administrators on
windows as well as non-root users on unixish platforms.


The implementation of my PEP provides a new option for install:

$ python setup.py install --user

Is it sufficient for you?


Just in case you don't know...

python setup.py install --home=~

will install to ~/lib/python

The problem is not getting the packages installed in a non-admin
location. It's about Python looking in a non-admin location per
default (as well as in the site-packages location).

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] Reminder: last alphas next Wednesday 07-May-2008

2008-05-04 Thread M.-A. Lemburg

On 2008-05-04 21:57, Christian Heimes wrote:

M.-A. Lemburg wrote:

PYTHONPATH is lacking one feature which is important for lots of
packages and setuptools. The directories in PYTHONPATH are just added to
sys.path. But setuptools require a site package directory. Maybe a new
env var PYTHONSITEPATH could solve the problem.

We don't need another setup variable for this. Just place a
well-known module into the site-packages/ directory and then
query its __file__ attribute, e.g.

site-packages/site_packages.py

The module could even include a few helpers to query various
settings which apply to the site packages directory, e.g.

site_packages.get_dir()
site_packages.list_packages()
site_packages.list_modules()
etc.


I don't see how it is going to solve the use case "Add another site
package directory when I don't have write access to the global site
package directory and I don't want to modify my apps."


No, but it's going to solve the issue which of the sys.path directories
is to be considered the site packages directory. I was under the
impression that this is what you were after.


Just in case you don't know...

python setup.py install --home=~

will install to ~/lib/python

The problem is not getting the packages installed in a non-admin
location. It's about Python looking in a non-admin location per
default (as well as in the site-packages location).


I know the --home option. For one, the --home option is Unix only and not
supported on Windows. Also, the --user option takes all options of my PEP
370 user site directory into account, including the PYTHONUSERBASE env var.


Ok. Just wanted to mention that there is a precedent in distutils
for doing user home directory installations.
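
For readers on a current Python: the per-user locations that PEP 370 introduced can be inspected through the site module. This sketch assumes the interface as it exists in today's standard library (site.getuserbase(), site.getusersitepackages(), site.ENABLE_USER_SITE), not necessarily the state of the patch at the time of this thread; the helper name user_site_info is made up:

```python
import site

def user_site_info():
    """Report the per-user locations computed by the site module."""
    return {
        "user_base": site.getuserbase(),          # honours PYTHONUSERBASE
        "user_site": site.getusersitepackages(),  # per-user site-packages
        "enabled": site.ENABLE_USER_SITE,         # False or None if disabled
    }

for key, value in user_site_info().items():
    print("%-10s %s" % (key, value))
```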

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Encoding detection in the standard library?

2008-04-23 Thread M.-A. Lemburg

On 2008-04-23 07:26, Terry Reedy wrote:
Martin v. Löwis [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]

| I certainly agree that if the target set of documents is small enough it
|
| Ok. What advantage would you (or somebody working on a similar project)
| gain if chardet was part of the standard library? What if it was not
| chardet, but some other algorithm?

It seems to me that since there is not a 'correct' algorithm but only 
competing heuristics, encoding detection modules should be made available 
via PyPI and only be considered for stdlib after a best of breed emerges 
with community support. 


+1

Though in practice, determining the best of breed often becomes a
problem (see e.g. the JSON implementation discussion).

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Encoding detection in the standard library?

2008-04-22 Thread M.-A. Lemburg

On 2008-04-21 23:31, Martin v. Löwis wrote:
This is useful when you get a hunk of data which _should_ be some  
sort of intelligible text from the Big Scary Internet (say, a posted  
web form or email message), and you want to do something useful with  
it (say, search the content).


I don't think that should be part of the standard library. People
will mistake what it tells them for certain.


+1

I also think that it's better to educate people to add (correct)
encoding information to their text data, rather than give them a
guess mechanism...

http://chardet.feedparser.org/docs/faq.html#faq.yippie

chardet is based on the Mozilla algorithm and at least in
my experience that algorithm doesn't work too well.

The Mozilla algorithm may work for Asian encodings due to the fact
that those encodings are usually also bound to a specific language
(and you can then use character and word frequency analysis), but
for encodings which can encode far more than just a single language
(e.g. UTF-8 or Latin-1), the correct detection rate is rather low.

The problem becomes even more difficult when leaving
the normal text domain or when mixing languages in the same
text, e.g. when trying to detect source code with comments using
a non-ASCII encoding.

The trick to just pass the text through a codec and see whether
it roundtrips also doesn't necessarily help: Latin-1, for example,
will always round-trip, since Latin-1 is a subset of Unicode.
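
The round-trip point is easy to check: every one of the 256 byte values decodes under Latin-1, so the test can never reject anything. A small demonstration (the sample string is arbitrary):

```python
all_bytes = bytes(range(256))

# Every byte value maps to a code point, so decoding never fails,
text = all_bytes.decode("latin-1")
# and encoding the result returns the original bytes unchanged.
round_trips = text.encode("latin-1") == all_bytes

# UTF-8 data therefore "passes" the Latin-1 test too, as mojibake.
utf8_data = "Grüße".encode("utf-8")
misread = utf8_data.decode("latin-1")

print(round_trips, repr(misread))
```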

IMHO, more research has to be done into this area before a
standard module can be added to Python's stdlib... and
who knows, perhaps we'll be lucky and by then everyone will be
using UTF-8 anyway :-)

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Encoding detection in the standard library?

2008-04-22 Thread M.-A. Lemburg

[CCing python-dev again]

On 2008-04-22 12:38, Greg Wilson wrote:

I don't think that should be part of the standard library. People
will mistake what it tells them for certain.
[etc]


These are all good arguments, but the fact remains that we can't control 
our inputs (e.g., we're archiving mail messages sent to lists managed by 
DrProject), and some of those inputs *don't* tell us how they're encoded.

Under those circumstances, what would you recommend?


I haven't done much research into this, but in general, I think it's
better to:

 * first try to look at other characteristics of a text
   message, e.g. language, origin, topic, etc.,

 * then narrow down the number of encodings which could apply,

 * rank them to try to avoid ambiguities and

 * then try to see what percentage of the text you can decode using
   each of the encodings in reverse ranking order (ie. more specialized
   encodings should be tested first, latin-1 last).
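
The last step of the list above could be sketched roughly as follows. This is only one possible reading: "percentage you can decode" is approximated by counting U+FFFD replacement characters, the function names are made up, and narrowing down and ordering the candidates (the first three steps) is left to the caller:

```python
def decodable_fraction(data, encoding):
    """Fraction of decoded characters that are not U+FFFD replacements."""
    text = data.decode(encoding, errors="replace")
    if not text:
        return 1.0
    return 1.0 - text.count("\ufffd") / len(text)

def best_encoding(data, candidates):
    """Pick the best-scoring candidate; ties go to the earlier entry."""
    # candidates must be ordered from most specialized to most
    # permissive, with latin-1 last, because latin-1 always scores 1.0.
    return max(candidates, key=lambda enc: decodable_fraction(data, enc))

candidates = ["ascii", "utf-8", "latin-1"]
print(best_encoding("Grüße".encode("utf-8"), candidates))
print(best_encoding("Grüße".encode("latin-1"), candidates))
```

A perfect score only means the bytes are valid in that encoding, not that the guess is right, which is why the ordering of the candidates matters so much.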

--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Encoding detection in the standard library?

2008-04-22 Thread M.-A. Lemburg

On 2008-04-22 18:33, Bill Janssen wrote:

The 2002 paper "A language and character set determination method
based on N-gram statistics" by Izumi Suzuki and Yoshiki Mikami and
Ario Ohsato and Yoshihide Chubachi seems to me a pretty good way to go
about this.


Thanks for the reference.

Looks like the existing research on this just hasn't made it into the
mainstream yet.

Here's their current project: http://www.language-observatory.org/
Looks like they are focusing more on language detection.

Another interesting paper using n-grams:
Language Identification in Web Pages by Bruno Martins and Mário J. Silva
http://xldb.fc.ul.pt/data/Publications_attach/ngram-article.pdf

And one using compression:
Text Categorization Using Compression Models by 
Eibe Frank, Chang Chui, Ian H. Witten
http://portal.acm.org/citation.cfm?id=789742


They're looking at LSEs, language-script-encoding
triples; a script is a way of using a particular character set to
write in a particular language.

Their system has these requirements:

R1. the response must be either "correct answer" or "unable to detect"
where "unable to detect" includes "other than registered" [the
registered set of LSEs];

R2. Applicable to multi-LSE texts;

R3. never accept a wrong answer, even when the program does not have
enough data on an LSE; and

R4. applicable to any LSE text.

So, no wrong answers.

The biggest disadvantage would seem to be that the registration data
for a particular LSE is kind of bulky; on the order of 10,000
shift-codons, each of three bytes, about 30K uncompressed.

http://portal.acm.org/ft_gateway.cfm?id=772759type=pdf


For a server based application that doesn't sound too large.

Unless you're using a very broad scope, I don't think that
you'd need more than a few hundred LSEs for a typical
application - nothing you'd want to put in the Python stdlib,
though.
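
To illustrate the general shape of an n-gram detector that prefers "unable to detect" over a guess, here is a toy sketch. It is not the method from the paper: it uses plain byte trigrams instead of shift-codons, the profiles are built from one short made-up sentence, and a fixed threshold only approximates requirement R1. All names and labels are hypothetical.

```python
from collections import Counter

def trigrams(data):
    """Count overlapping 3-byte sequences in a bytes object."""
    return Counter(data[i:i + 3] for i in range(len(data) - 2))

def similarity(profile, sample):
    """Share of the sample's trigrams that also occur in the profile."""
    grams = trigrams(sample)
    total = sum(grams.values())
    if total == 0:
        return 0.0
    known = sum(n for gram, n in grams.items() if gram in profile)
    return known / total

def detect(sample, profiles, threshold=0.5):
    """Return the best-matching label, or None for 'unable to detect'."""
    best_label, best_score = None, 0.0
    for label, profile in profiles.items():
        score = similarity(profile, sample)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None

training = "viele schöne Grüße aus Köln"
profiles = {
    "de/utf-8": trigrams(training.encode("utf-8")),
    "de/latin-1": trigrams(training.encode("latin-1")),
}
print(detect("schöne Grüße".encode("utf-8"), profiles))
print(detect(b"\x00\x01\x02\x03", profiles))
```

Real registration data would be built from large corpora per language-script-encoding triple, which is where the storage cost discussed above comes from.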


Bill


IMHO, more research has to be done into this area before a
standard module can be added to Python's stdlib... and
who knows, perhaps we'll be lucky and by then everyone will be
using UTF-8 anyway :-)

I walked over to our computational linguistics group and asked.  This
is often combined with language guessing (which uses a similar
approach, but using characters instead of bytes), and apparently can
usually be done with high confidence.  Of course, they're usually
looking at clean texts, not random stuff.  I'll see if I can get
some references and report back -- most of the research on this was
done in the 90's.

Bill


--
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Python 32- and 64-bit living together

2008-04-11 Thread M.-A. Lemburg
On 2008-04-11 19:10, Sérgio Durigan Júnior wrote:
 Hi all,
 
 My question is simple: is there any problem when installing/using both
 32- and 64-bit Python's on the same machine? I'm more concerned about
 header files (those installed under /usr/include/python-2.x), because as
 far as I could see there's nothing similar to a #ifdef USE_64BIT or
 something on them.

The include files are all static and can be used on both 32-bit and
64-bit platforms or installations.

Only the /usr/lib/python2.x files differ between 32-bit and 64-bit
(the configuration files are in /usr/lib/python2.x/config).
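
When two builds share a machine, it helps to ask each interpreter what it is and where it looks. A small sketch using the struct and sysconfig modules of current Python versions (in 2008 the equivalent information lived in distutils.sysconfig); the helper name interpreter_bits is made up:

```python
import struct
import sysconfig

def interpreter_bits():
    """Pointer size of the running interpreter, in bits."""
    return struct.calcsize("P") * 8

print("bits:    ", interpreter_bits())
print("headers: ", sysconfig.get_path("include"))
print("stdlib:  ", sysconfig.get_path("stdlib"))
print("platlib: ", sysconfig.get_path("platlib"))
```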

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Python 32- and 64-bit living together

2008-04-11 Thread M.-A. Lemburg
On 2008-04-11 20:21, Sérgio Durigan Júnior wrote:
 Hi Lemburg,
 
 On Fri, 2008-04-11 at 19:38 +0200, M.-A. Lemburg wrote:
 On 2008-04-11 19:10, Sérgio Durigan Júnior wrote:
 Hi all,

 My question is simple: is there any problem when installing/using both
 32- and 64-bit Python's on the same machine? I'm more concerned about
 header files (those installed under /usr/include/python-2.x), because as
 far as I could see there's nothing similar to a #ifdef USE_64BIT or
 something on them.
 The include files are all static and can be used on both 32-bit and
 64-bit platforms or installations.
 
 Thanks :-).
 
 Only the /usr/lib/python2.x files differ between 32-bit and 64-bit
 (the configuration files are in /usr/lib/python2.x/config).
 
 Hmm, right. I tried to modify the installation path (using --libdir
 in ./configure) to /usr/lib64, but some *.pyo objects still are
 installed under /usr/lib. AFAIK, these objects are bitness-dependent
 (i.e., if they were generated by a 32-bit Python, they can only be
 executed by a 32-bit Python - and vice-versa), right?

Right.

 Is there any way to separate these arch-dependent files in /usr/lib
 and /usr/lib64 depending on their bitness?

There's no need for that. Only the config/ dir which is included
in the Python lib dir is dependent on the Python configuration.

 Thanks,
 
 P.S.: I think this misbehaviour of --libdir is a bug. IMHO, it should
 put every arch-dependent file in the path that the user provided.

You should probably have a look at how RedHat or openSUSE solve
these problems. Some of them have patched Python to fit their
needs. You may have to do that as well.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Python 32- and 64-bit living together

2008-04-11 Thread M.-A. Lemburg
On 2008-04-11 22:25, Sérgio Durigan Júnior wrote:
 On Fri, 2008-04-11 at 22:06 +0200, M.-A. Lemburg wrote:
  
 Hmm, right. I tried to modify the installation path (using --libdir
 in ./configure) to /usr/lib64, but some *.pyo objects still are
 installed under /usr/lib. AFAIK, these objects are bitness-dependent
 (i.e., if they were generated by a 32-bit Python, they can only be
 executed by a 32-bit Python - and vice-versa), right?
 Right.

Sorry, I misread your question. PYO and PYC files are *not* dependent
on 32/64 bit sizes.
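
To illustrate the point: byte-code files are tagged with a magic number
that is tied to the interpreter version, not to its pointer width. The
sketch below prints both values for the running interpreter. It assumes
a modern Python 3 and uses importlib.util.MAGIC_NUMBER, which postdates
this thread (in 2.x the equivalent call was imp.get_magic()); the helper
names are made up for the example.

```python
import importlib.util
import struct
import sys


def interpreter_bitness():
    """Return the pointer width of the running interpreter in bits."""
    return struct.calcsize("P") * 8


def bytecode_magic():
    """Return the 4-byte tag written at the start of every .pyc file."""
    return importlib.util.MAGIC_NUMBER


if __name__ == "__main__":
    print("interpreter:", interpreter_bitness(), "bit")
    print("bytecode magic:", bytecode_magic().hex())
    print("version:", sys.version_info[:2])
```

Running this under a 32-bit and a 64-bit build of the same Python
version should show different bitness values but the same magic.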

 Is there any way to separate these arch-dependent files in /usr/lib
 and /usr/lib64 depending on their bitness?
 There's no need for that. Only the config/ dir which is included
 in the Python lib dir is dependent on the Python configuration.
 
 
 I'm afraid I still don't understand your point. I mean, if the *.pyo
 file *is* dependent on the bitness of the Python interpreter (as you
 confirmed in my first question), therefore when I decide to have both
 32- and 64-bit Python on my system I *must* have two versions of
 every .pyo file: one for 32- and another for 64-bit Python. What have
 I missed?

Sorry for the confusion.

 You should probably have a look at how RedHat or openSUSE solve
 these problems. Some of them have patched Python to fit their
 needs. You may have to do that as well.
 
 I'll sure take a look at them. Thanks!

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] fixing broken build

2008-03-27 Thread M.-A. Lemburg
On 2008-03-27 09:20, Christian Heimes wrote:
 Neal Norwitz wrote:
 Christian,

 Please fix the build on the various buildbots that are failing or
 revert your changes for unicode literals.  The build failures started
 to occur at r61953.  There were several more (~5) follow up checkins.

 You can find all the failures here:  http://www.python.org/dev/buildbot/all/

 There seem to be at least two variations for how setup.py is failing.
 See below.
 
 I've already fixed the problem in r61956. I didn't notice the issue
 with an uninitialized variable until I compiled Python without pydebug. In
 order to fix the problem on the build bots one has to remove all pyc and
 pyo files.

I'm not sure why that's necessary, but whenever you change something
in the compiler, please remember to update the PYC magic.
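
For anyone who has to clean a build tree by hand in the meantime, the
removal step Christian mentions can be sketched as follows. This is only
an illustration: the function name is made up, and it assumes the 2.x
layout where .pyc and .pyo files sit next to their sources.

```python
import os


def remove_bytecode(root):
    """Delete every .pyc/.pyo file below root and return the removed paths."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith((".pyc", ".pyo")):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return sorted(removed)
```

Bumping the PYC magic makes this unnecessary, because the interpreter
then ignores byte-code files written with the old magic and recompiles
them.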

I'd also suggest that you test any checkins with a non-debug build
of Python before committing them. Debug builds change the way the
code is built in several respects.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Decimal(unicode)

2008-03-26 Thread M.-A. Lemburg
On 2008-03-26 07:11, Martin v. Löwis wrote:
 For binary representations, we already have the struct module to handle 
 the parsing, but for byte sequences with embedded ASCII digits it's 
 reasonably common practice to use strings along with the respective type 
 constructors.
 
 Sure, but why can't you write
 
  foo = int(bar[start:stop].decode("ascii"))
 
 then? Explicit is better than implicit.

Agreed.

The whole purpose of Unicode is to store text. Data from a file
isn't text per-se. You have to tell Python that a particular set of
bytes is to be interpreted as text and that only works by explicitly
converting the bytes to text.

Numbers or digits aren't any different in this context.
b"1234" is just a sequence of bytes and could well represent
the binary encoding of an integer, the start of a base64 encoded
image, an SSH key or an audio file.

Don't get fooled by the looks of b"1234". It's really just a
shorter way of writing 0x31 0x32 0x33 0x34.
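
A short sketch of the ambiguity, assuming Python 3 semantics: the same
four bytes yield two different numbers depending on which interpretation
the program asks for. The variable names are only for the example.

```python
data = b"1234"

# The four bytes behind the literal: 0x31 0x32 0x33 0x34.
byte_values = [hex(b) for b in data]

# Interpretation 1: ASCII digits, decoded explicitly to text first.
as_decimal_text = int(data.decode("ascii"))

# Interpretation 2: a big-endian binary integer.
as_binary_int = int.from_bytes(data, "big")

print(byte_values)      # ['0x31', '0x32', '0x33', '0x34']
print(as_decimal_text)  # 1234
print(as_binary_int)    # 825373492
```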

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Proposal: from __future__ import unicode_string_literals

2008-03-25 Thread M.-A. Lemburg
On 2008-03-24 09:22, Lennart Regebro wrote:
 I think 2to3 is a procedure that will work well for library type
 projects with a reasonably small set of developers that make regular
 releases. There you can release both a python 2 and a python 3 version
 of the module, for example.
 ...
 So, in short: Large projects with interconnected modules where the
 developers and users of module are the same people will have big
 difficulties with the 2to3 approach and would be the people who are
 most likely to not be able to in practice go forward to Python 3
 unless they have some sort of smooth path forward.

I don't think there's a lot to worry about:

Companies using Python for applications typically have a completely
different life-cycle of releases and applications compared to the
Python release schedule, i.e. they often still run Python 2.3 or
2.4 and wait for major releases to settle before deciding to
port to them.

Every now and then, they make the decision to port to the next
release (for the next version of their software) and this change is
then managed accordingly - sometimes skipping a complete major release
of Python.

In such projects, 2to3 will get applied to the sources once and then
all development continues on the Python 3.0 version of the code.


In reality, I don't think that 2to3 will get used for continuous
porting between a 2.x code base and a 3.0 one all that much.

The transition from 2.x to 3.0 will happen during a longer period of
time (probably a few years) and depend a lot on the release cycle of
the applications using Python, whether or not the 3.0 version provides
better features, more performance,  etc. and whether the 2.x branches
of Python and the used 3rd party modules are still supported or not.

New applications will likely choose 3.0 right away - provided that
the needed 3rd party modules are available and stable enough.


In summary: 2to3 is a very useful tool to have. Whether or not
it is used for continuous porting between the two worlds is
really secondary.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] How we can get rid of eggs for 2.6 and beyond

2008-03-21 Thread M.-A. Lemburg
On 2008-03-21 14:47, Phillip J. Eby wrote:
 So, to accomplish this, we (for some value of we) need to:
 
 1. Hash out consensus around what changes or enhancements are needed 
 to PEP 262, to resolve the previously-listed open issues, those that 
 have come up since (namespace packages, dependency specifications, 
 canonical name/version forms), and anything else that comes up.
 
 2. Update or replace the implementation as appropriate, and modify 
 the distutils to support it in Python 2.6 and beyond.  And "support 
 it" means, ensure that 'install' and *all* bdist commands update the 
 database.  The bdist_rpm, bdist_wininst, and bdist_msi commands, 
 even bdist_dumb.  (This should probably also include the add/remove 
 programs stuff in the Windows case.)

The bdist commands don't need to touch that database in any way,
since they don't install anything, nor do they upload things
anywhere. They simply package code and put the result into
the dist/ subdir. That's all.

What you probably mean is that the installers, pre/post-scripts,
etc. run when installing one of those packages should update
the database of installed packages.

Note that there are several package formats which do not execute
any code when installing them - the user simply unzips them in
some directory. These packages won't be able to register themselves
with a database.

I guess the only way to support all of these variants is
to use a filesystem based approach, e.g. by placing a file
with a special extension into some dir on sys.path.
The database logic could then scan sys.path for these
files, read the data and provide an interface to it.

All bdist formats would then have to include these files.
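
The scanning part of such a database could be as small as the sketch
below. It is an illustration only, not an existing distutils API: the
function name is invented, and the ".egg-info" suffix is simply the one
distutils writes today.

```python
import os
import sys


def find_metadata_files(search_path=None, suffix=".egg-info"):
    """Return sorted paths of entries ending in suffix on the search path."""
    if search_path is None:
        search_path = sys.path
    found = []
    for entry in search_path:
        if not os.path.isdir(entry):
            continue  # skip ZIP files, missing dirs and the like
        for name in os.listdir(entry):
            if name.endswith(suffix):
                found.append(os.path.join(entry, name))
    return sorted(found)
```

Calling find_metadata_files() without arguments scans the running
interpreter's sys.path; passing a list of directories restricts the
scan, which is handy for testing.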

distutils already writes .egg-info files when running
python setup.py install, so perhaps that's a start (though
I'd prefer a three letter extension such as .pkg).

.egg-info files currently only include the package meta-data
(the PKG-INFO section from PEP 262).

We'd have to add a list of files making up the package (FILES
section in PEP 262) and also some extra information about any
extra files the package creates that can safely be removed in
the uninstall process (e.g. .pyo and .pyc files, temporary files,
database files, configuration data, registry entries, etc.) -
this is currently not covered in PEP 262.

I don't think the REQUIRES and PROVIDES sections from the
PEP 262 are needed. That info can easily go into the PKG-INFO
section.

A separate FILES section also doesn't seem to be necessary -
we could just add one or more entries of the format:

CreatesDir abc/
CreatesFile abc/xyz1.py
CreatesDir abc/def/
CreatesFile abc/def/xyz2.py
CreatesFile abc/def/xyz3.py
CreatesFile abc/def/xyz4.ini

(BTW: wininst writes such a file for the uninstall process)

So to keep things simple, the rfc822 approach defined in
PEP 241 would easily cover everything needed and we could
trim down the PEP 262 format to a simple rfc822 header
list.
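
As a rough sketch of how little code a reader for such a file needs:
the standard email parser already handles rfc822-style header lists,
including repeated headers. The example assumes the entries are written
as "Name: value" headers (the colon is an assumption), and the sample
data and function name are made up.

```python
from email.parser import Parser

SAMPLE = """\
Metadata-Version: 1.0
Name: mypkg
Version: 1.0
CreatesDir: abc/
CreatesFile: abc/xyz1.py
CreatesDir: abc/def/
CreatesFile: abc/def/xyz2.py
"""


def read_install_record(text):
    """Parse an rfc822-style header list into name, version, dirs, files."""
    msg = Parser().parsestr(text, headersonly=True)
    return {
        "name": msg["Name"],
        "version": msg["Version"],
        # get_all() returns every occurrence of a repeated header.
        "dirs": msg.get_all("CreatesDir", []),
        "files": msg.get_all("CreatesFile", []),
    }
```

read_install_record(SAMPLE) returns a dict whose "files" entry lists
the two CreatesFile values in the order they appear.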

In other words: the .egg-info files already provide the basis
and only need to be extended with a list of created files,
directories (and possibly other resources) as well as a list
of resources which may be removed even if not installed
explicitly such as byte-code files, etc.

 3. Create a document for system packagers referencing the PEP and 
 introducing them to what/why/how of the standard, in case they 
 weren't one of the original participants in creating this.

This should probably be a new PEP defining all the bits and
pieces making up the installation database.

 It will probably take some non-trivial work to do all this for Python 
 2.6, but it's probably possible, if we start now.  I don't think it's 
 critical to have an uninstall tool distributed with 2.6, as long as 
 there's a reasonable way to bootstrap its installation later.

BTW: There's a simple uninstall command in mxSetup.py that we
could contribute to distutils. It works much in the same
way as the install command... except that it removes all the
files it would have installed.

Using pre-built packages, this works without having to rebuild
the package just to be able to determine the list of things
that need to be removed.
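
For comparison, an uninstall driven by a recorded list of files and
directories (as proposed above) might look like the sketch below. This
is not the mxSetup.py code, which works from what the install command
would have installed; the function name and arguments are hypothetical.

```python
import os


def uninstall(prefix, files, dirs):
    """Remove recorded files, their byte-code siblings, then empty dirs."""
    removed = []
    for rel in files:
        path = os.path.normpath(os.path.join(prefix, rel))
        candidates = [path]
        if path.endswith(".py"):
            # Byte-code is never listed explicitly but is safe to delete.
            candidates += [path + "c", path + "o"]
        for candidate in candidates:
            if os.path.isfile(candidate):
                os.remove(candidate)
                removed.append(candidate)
    # Longest paths first, so children are handled before their parents.
    for rel in sorted(dirs, key=len, reverse=True):
        path = os.path.normpath(os.path.join(prefix, rel))
        if os.path.isdir(path) and not os.listdir(path):
            os.rmdir(path)
            removed.append(path)
    return removed
```

Directories that still contain files from other packages are left
alone, which keeps the uninstall from breaking anything else.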

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Proposal: from __future__ import unicode_string_literals

2008-03-21 Thread M.-A. Lemburg
On 2008-03-21 22:32, Martin v. Löwis wrote:
 It's not implementable because the work has to occur in ast.c (see 
 Py_UnicodeFlag).  It can't occur later, because you need to skip the 
 encoding being done in parsestr().  But the __future__ import can only 
 be interpreted after the AST is built, at which time the encoding has 
 already been applied.  
 
 I think it would be possible to check for future statements on the
 basis of nodes already. Take a look at how Python 2.3 implemented
 future statements (why was that rewritten to use the AST, anyway?).
 
 As for it not making sense, this is really in the realm of 2to3.  I'm 
 beginning to really believe this statement in PEP 3000:
 
 There is still the original use case of people who don't want to run
 2to3 (for whatever reasons - mostly probably subjective ones), and
 who would rather run a single code base unmodified. They don't care
 that documentation tells them this is impossible, when they feel they
 are so close to making it possible.

Could we point them to a special byte-code compiler such as Andrew
Dalke's python4ply:

http://dalkescientific.com/Python/python4ply.html

That approach appears to be a lot easier to implement than trying
to tweak the C implementation of the Python parser.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] How we can get rid of eggs for 2.6 and beyond

2008-03-21 Thread M.-A. Lemburg
On 2008-03-21 22:21, Phillip J. Eby wrote:
 At 08:06 PM 3/21/2008 +0100, M.-A. Lemburg wrote:
 I guess the only way to support all of these variants is
 to use a filesystem based approach, e.g. by placing a file
 with a special extension into some dir on sys.path.
 The database logic could then scan sys.path for these
 files, read the data and provide an interface to it.

 All bdist formats would then have to include these files.
 
 That's the idea behind the current version of PEP 262, yes, and I think 
 it should be kept.
 
 A separate FILES section also doesn't seem to be necessary -
 we could just add one or more entries of the format:

 CreatesDir abc/
 CreatesFile abc/xyz1.py
 CreatesDir abc/def/
 CreatesFile abc/def/xyz2.py
 CreatesFile abc/def/xyz3.py
 CreatesFile abc/def/xyz4.ini
 
 I actually think the size and hash information is good, in order to be 
 able to tell if you're looking at an original file.  I'm not sure how 
 useful the permissions and uid/gid info is.  I'm hoping we'll hear from 
 anybody who has a use case for that.

You're heading off in the wrong direction: we should not be trying
to rewrite RPM or InnoSetup in Python.

Anything more complicated should be left to tools which are
specifically written to manage complex software setups.

I honestly believe that most people would be happy if we just
provide these two things (and no more):

  * install a package from a local archive, a URL or PyPI

  * uninstall a package in a way that doesn't break other
installed packages

and whatever the mechanism, avoid making any undercover
changes to the Python installation such as adding
.pth files, overriding site.py, etc. - these are
not needed if the tool keeps to the simple task of
installing and uninstalling Python packages.

Examples:

python pypi.py install mypkg-1.0.tgz
python pypi.py install http://www.example.com/mypkg-1.0.tgz
python pypi.py install mypkg-1.0

python pypi.py uninstall mypkg

If there's a dependency problem, the tool should print the
list of other packages it needs. It should not try to install
things automagically.

If a package needs other modules as well, the package docs
can point the user to use e.g.

python pypi.py install mydep1-1.3 mydep2-2.3 mydep4-0.3 mypkg-1.0

instead.
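
The "report, don't resolve" behaviour can be sketched in a few lines.
Everything here is hypothetical (pypi.py does not exist, and neither do
these function names); requirements are plain "name-version" strings as
in the examples above, and the set of installed packages is passed in
by the caller.

```python
def missing_requirements(required, installed):
    """Return the required 'name-version' strings that are not installed."""
    return [req for req in required if req not in installed]


def install(package, required, installed):
    """Install package only if everything it requires is already present."""
    missing = missing_requirements(required, installed)
    if missing:
        # Report and stop: no automatic downloads or installs.
        return "cannot install %s; install these first: %s" % (
            package, " ".join(missing))
    installed.add(package)
    return "installed %s" % package
```

The user then decides whether and how to install the listed packages,
for example by naming them all on one command line as shown above.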

Anything more complicated should be left to specialized
tools such as RPM, apt, MSI or the other such tools out
there - after all the tool should be about Python *package*
installation, not application installation.

We *don't* need the tool to:

  * support multiple versions of a package (that's just bound
to cause problems with pickle, isinstance() etc.)

  * provide namespace hacking (is a completely separate issue
and can be handled by the packages rather than the install
tool)

  * support all kinds of funky version numbers (if a package
wants to participate in the system, the author better
make sure that the version string fits the standard format)

  * provide some form of intra-package bus interface (ie.
entry points as you call them)

  * provide support for keeping whole packages in ZIP files
(doesn't play well with C extensions, clutters up the
sys.path, is read-only, needs special importers, etc. etc. )

  * try automatic version matching for required packages

  * download things from SourceForge or other sites with special
download mechanisms

  * scan websites for links

  * make coffee, clean the house, send the kids to school :-)

 And of course, there are still some issues to be resolved regarding 
 requirements, package name/version stuff, etc.  But we can hash those 
 out once we reach a quorum on the Distutils-SIG.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Consistent platform name for 64bit windows (was: distutils.util.get_platform() for Windows)

2008-03-20 Thread M.-A. Lemburg
On 2008-03-18 18:05, [EMAIL PROTECTED] wrote:
 I'm reviving a very old thread based on discussions with Martin at pycon.
 
 Sent: Monday, 23 July 2007 5:12 PM
 Subject: Re: [Distutils] distutils.util.get_platform() for Windows
 
 Rather than forcing everyone to read the context, allow me to summarize:
 On 64bit Windows versions, we need a string that identifies the
 platform, and this string should ideally be used consistently.  This
 original thread related to the files created by distutils (eg,
 pywin32-210.win???64??-py2.6.exe) but it seems obvious that we should be
 consistent wherever Python wants to display the platform (eg, in the
 startup banner, in platform.py, etc).
 
 In the old thread, there was a semi-consensus that 'x86_64' be used by
 distutils (and indeed, Lib/distutils/util.py in get_platform() has been
 changed, by me, to use this string), but the Python 'banner', for example,
 reports AMD64.  Platform.py doesn't report much at all in this area, at
 least when pywin32 isn't installed, but it arguably should.
 
 Both Martin and I prefer AMD64 as the string, for various reasons. 
 Firstly, it is less ugly than 'x86_64', and doesn't include an '_'/'-'
 which might tend to confuse parsing by humans or computers.  Martin also
 made the point that AMD invented the architecture and AMD64 is their
 preferred name, so we should respect that.
 
 So, at the risk of painting a bike-shed, I'd like to propose that we adopt
 'AMD64' in distutils (needs a change), platform.py (needs a change to use
 sys.getwindowsversion() in preference to pywin32, if possible, anyway),
 and the Python banner (which already uses AMD64).
 
 Any objections?  Any strong feelings that using 'AMD' will confuse people
 with Intel processors?  Strong feelings about the parsability of the name
 (PJE? <wink>)?  Strong feelings about the color <wink>?

Not really an objection, but Microsoft itself uses the term x64 for
the 64-bit variants of their OS, e.g.

http://www.microsoft.com/windowsxp/64bit/default.mspx

Since the platform name is targeting Windows, I think we should
avoid confusing Windows users more than Intel users ;-)

About the platform.py changes: if someone could provide the return
values of sys.getwindowsversion() for 64bit versions of Windows
XP and Vista, I could add support for it (don't have a 64bit version
of Windows available to check myself).

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Consistent platform name for 64bit windows (was: distutils.util.get_platform() for Windows)

2008-03-20 Thread M.-A. Lemburg
On 2008-03-20 13:42, Thomas Heller wrote:
 M.-A. Lemburg wrote:
 About the platform.py changes: if someone could provide the return
 values of sys.getwindowsversion() for 64bit versions of Windows
 XP and Vista, I could add support for it (don't have a 64bit version
 of Windows available to check myself).
 
 This is the output of a 32-bit Python running on Windows XP Professional
 x64 Edition, Version 2003, Service Pack 2:
 
 C:\Python24>ver
 
 Microsoft Windows [Version 5.2.3790]
 
 C:\Python24>python
 Python 2.4.4 (#71, Oct 18 2006, 08:34:43) [MSC v.1310 32 bit (Intel)] on win32
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import sys
 >>> sys.getwindowsversion()
 (5, 2, 3790, 2, 'Service Pack 2')

Thank you !

Anyone with a 64bit Vista ?

Or even better: a page documenting what to expect from the system call
behind the sys.getwindowsversion() API ?

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Consistent platform name for 64bit windows (was: distutils.util.get_platform() for Windows)

2008-03-20 Thread M.-A. Lemburg
On 2008-03-20 13:55, M.-A. Lemburg wrote:
 On 2008-03-20 13:42, Thomas Heller wrote:
 M.-A. Lemburg wrote:
 About the platform.py changes: if someone could provide the return
 values of sys.getwindowsversion() for 64bit versions of Windows
 XP and Vista, I could add support for it (don't have a 64bit version
 of Windows available to check myself).
 This is the output of a 32-bit Python running on Windows XP Professional
 x64 Edition, Version 2003, Service Pack 2:

 C:\Python24>ver

 Microsoft Windows [Version 5.2.3790]

 C:\Python24>python
 Python 2.4.4 (#71, Oct 18 2006, 08:34:43) [MSC v.1310 32 bit (Intel)] on win32
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import sys
 >>> sys.getwindowsversion()
 (5, 2, 3790, 2, 'Service Pack 2')
 
 Thank you !
 
 Anyone with a 64bit Vista ?
 
 Or even better: a page documenting what to expect from the system call
 behind the sys.getwindowsversion() API ?

FYI: I added winreg and sys.getwindowsversion() support in r61674.

platform.machine() and .processor() will now use the environment
variables PROCESSOR_ARCHITECTURE and PROCESSOR_IDENTIFIER where
available (should work on Windows XP and later).

According to http://support.microsoft.com/kb/888731 platform.machine()
will return AMD64, so I guess the x64 string is just a marketing
name for 64-bit platforms on Windows and the underlying system uses
AMD64 as machine type name.

For x86 processors, you'll now get x86 on Windows XP and later.

For Itanium processors, you should get IA64 according to this
WOW64 page:

http://msdn2.microsoft.com/en-us/library/aa384274(VS.85).aspx
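
The lookup described above amounts to something like the following
sketch. It is not the code checked in as r61674; the function name is
made up, and the environment is passed in as a mapping so the logic can
be tried on any platform.

```python
import os
import platform


def windows_machine_info(environ=None):
    """Return (machine, processor) from the Windows environment variables,
    falling back to the platform module where a variable is not set."""
    if environ is None:
        environ = os.environ
    machine = environ.get("PROCESSOR_ARCHITECTURE") or platform.machine()
    processor = environ.get("PROCESSOR_IDENTIFIER") or platform.processor()
    return machine, processor
```

Note that a 32-bit Python running on 64-bit Windows may well see x86
in PROCESSOR_ARCHITECTURE, since the value describes the environment of
the running process.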

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP 365 (Adding the pkg_resources module)

2008-03-20 Thread M.-A. Lemburg
On 2008-03-20 21:34, Paul Moore wrote:
  Also, setuptools-based packages *can* build bdist_wininst
  installers.  (In fact, if memory serves, I added that feature at your 
 request.)
 
 I know. python setup.py bdist_wininst. And thank you for adding it.
 But again you miss my point. People are starting to omit distributing
 bdist_wininst installers in favour of eggs only. And you cannot (to my
 knowledge) convert an egg into a bdist_wininst installer, and if you
 can't compile from source (a C extension with complex dependencies,
 for example) you're stuck (in the sense that you're forced to use eggs
 without add/remove programs support).

You might want to look at the eGenix pre-built packages as an
alternative: they include all the information necessary to let
standard distutils continue its work *after* the build step.

It's basically a distribution of the package as it looks after
the build step has run, but before the package is wrapped up
using a packager like bdist_wininst or bdist_msi or installed
into the system.

You can download the pre-built package and create e.g. an
MSI installer or a wininst EXE without needing a compiler -
in addition to providing all the options of the standard distutils
install command (which makes repackaging them as part of
larger applications easy as well).

All the logic for this is included in mxSetup.py which ships with
the pre-built packages.

http://www.egenix.com/products/python/mxBase/#Download
http://www.egenix.com/products/python/mxBase/#Installation

The current version we have is not yet perfect. The next
iteration will also play nice with distutils extensions.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] C-API status of Python 3?

2008-03-02 Thread M.-A. Lemburg
On 2008-03-02 14:47, Christian Heimes wrote:
 Alex Martelli wrote:
 Yep, but please do keep the PyUnicode for str and PyString for bytes
 (as macros/synonyms of PyStr and PyBytes if you want!-) to help the
 task of porting existing extensions... the bytearray functions should
 no doubt be PyBytearray, though.
 
 Yeah, we've already planned to keep PyUnicode as prefix for str type
 functions. It makes perfect sense, not only from the historical point
 of view.
 
 But for PyString I planned to rename the prefix to PyBytes. In my opinion
 we are going to regret it, when we keep too many legacy names from 2.x.
 In order to make the migration process easier I can add a header file
 that provides PyString_* functions as aliases for PyBytes_*
 
 Comments?

+1

Why not also make unicode() the default type constructor and only
keep str() as alias to simplify porting (perhaps with a warning) ?

The term "string" is just too overloaded with all kinds of
misinterpretations. The term "string" just refers to a string of
bytes - a variable length array so to speak. However, depending
on the application space, "string" is used as synonym for
"text string" just as well as "data string".

Removing the term "string" altogether would make it easier for
people to understand that Py3k only has unicode (for text data)
and bytes (for binary data).
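
A minimal sketch of that separation, assuming the semantics Python 3
ended up with (where the text type is spelled str): text and binary
data are distinct types, conversions between them are explicit, and
mixing them is refused rather than silently coerced.

```python
text = "Gr\u00fc\u00dfe"      # text: a sequence of Unicode code points
data = text.encode("utf-8")  # binary: a sequence of bytes

print(len(text))  # 5
print(len(data))  # 7

try:
    text + data
except TypeError:
    # Mixing the two types is an error, not an implicit conversion.
    mixed = "refused"
print(mixed)  # refused
```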

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] C-API status of Python 3?

2008-03-02 Thread M.-A. Lemburg
On 2008-03-02 20:39, Bill Janssen wrote:
 Why not also make unicode() the default type constructor and only
 keep str() as alias to simplify porting (perhaps with a warning) ?

 The term "string" is just too overloaded with all kinds of
 misinterpretations. The term "string" just refers to a string of
 bytes - a variable length array, so to speak. However, depending
 on the application space, "string" is used as a synonym for
 "text string" just as well as for "data string".

 Removing the term string altogether would make it easier for
 people to understand that Py3k only has unicode (for text data)
 and bytes (for binary data).
 
 I agree that string is very overloaded, but calling it unicode is
 sort of like calling integers int32 -- that is, you're talking about
 the implementation rather than the type. 

Hmm in that case, we'd have to call it ucs2 or ucs4 depending
on how Python was compiled ;-)

 In most programming
 languages that aren't at the machine level (like C is), string
 really is a sequence of text characters, not a string of bytes, and
 that's probably the term that should be used for Python going forward,
 despite the legacy issues it involves.

I'm not bound to "unicode" at all; I just don't think using "string"
for text data will make people think twice often enough. You then
end up having binary data in a string again - with the only
difference that it's now using the Unicode type internally.

My personal favorite is "text" for text data.

 Personally, I feel that string (for text) and bytes (for binary
 data represented as a sequence of bytes) are appropriate terms for
 Python.  Keep unicode for a release or two as an alias for string.
 But isn't all this in a PEP somewhere already?

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] C-API status of Python 3?

2008-03-02 Thread M.-A. Lemburg
On 2008-03-02 23:11, Greg Ewing wrote:
 M.-A. Lemburg wrote:
 Why not also make unicode() the default type constructor and only
 keep str() as alias to simplify porting (perhaps with a warning) ?
 
 -1 on making us type 7 characters instead of
 3 all over the place.

Oh well... how about text() ?

 The term string is just too overloaded with all kinds of
 misinterpretations. The term string just refers to a string of
 bytes - a variable length array so to speak.
 
 I disagree -- string has come to mean string of
 characters unless otherwise qualified. Using one
 to hold non-characters is just an aberration that
 was necessary in Python 2 because there wasn't much
 alternative.

Buffer objects have been around for years and for exactly
this purpose.

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Unicode -- UTF-8 in CPython extension modules

2008-02-22 Thread M.-A. Lemburg
On 2008-02-23 00:46, Colin Walters wrote:
 On Fri, Feb 22, 2008 at 4:23 PM, John Dennis [EMAIL PROTECTED] wrote:
 
  Python programs which use Unicode string objects for their i18n, and
  which link to C libraries expecting UTF-8 through a CPython binding
  that only uses the 's' or 's#' formats, often seem to fail with
  encoding errors.
 
 One thing to be aware of is that PyGTK+ actually sets the Python
 Unicode object encoding to UTF-8.
 
 http://bugzilla.gnome.org/show_bug.cgi?id=132040
 
 I mention this because PyGTK is a very popular library related to
 Python and Linux.  So currently if you import gtk, then libraries
 which are using UTF-8 (as you say, the vast majority) will work with
 Python unicode objects unmodified.

Are you suggesting that John should rely on a bug in some 3rd party
extension instead of fixing the Python extension to use es# where
needed ?

There's a good reason why we don't allow setting the default
encoding outside site.py.

Trying to play tricks to change the default encoding later on
will only cause problems, e.g. the cached default encoded versions
of Unicode objects will then use different encodings - the one set
in site.py and later the ones with the new encoding. As a result,
all kinds of weird things can happen.

Using the Python Unicode C API really isn't all that hard and it's
well documented too, so please use it instead of trying to design
software based on workarounds.
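
As a Python-level analogue of what the "es#" format does in C when given
"utf-8" as the encoding name, the following sketch encodes explicitly
before handing data to a UTF-8 consumer. The helper name to_c_utf8 is
hypothetical and only serves the illustration.

```python
import sys

def to_c_utf8(value):
    """Return UTF-8 bytes for a text or bytes value (hypothetical helper)."""
    if isinstance(value, bytes):
        return value               # assumption: caller already encoded it
    return value.encode("utf-8")   # explicit, independent of any default

label = "Grüße"
payload = to_c_utf8(label)

# The result does not depend on the interpreter's default encoding.
print(sys.getdefaultencoding(), payload)
```

Because the encoding is named at the call site, nothing changes if some
other module tampers with the process-wide default.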

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] int/float freelists vs pymalloc

2008-02-13 Thread M.-A. Lemburg
On 2008-02-13 08:02, Andrew MacIntyre wrote:
 Christian Heimes wrote:
 Andrew MacIntyre wrote:
 I tried a LIFO stack implementation (though I won't claim to have done it
 well), and found it slightly slower than no freelist at all. The
 advantage of such an approach is that the known size of the stack makes
 deallocating excess objects easy (and thus no need for
 sys.compact_free_list() ).
 I've tried a single linked free list myself. I used the ob_type field to
 daisy chain the int and float objects. Although the code was fairly
 short it was slightly slower than an attempt without a free list at all.
 pymalloc is fast. It's very hard to beat it though.
 
 I'm speculating that CPU cache effects can make these differences.  The
 performance of the current trunk float freelist is depressing, given that
 the same strategy works so well for ints.
 
 I seem to recall Tim Peters paying a lot of attention to cache effects
 when he went over the PyMalloc code before the 2.3 release, which would
 contribute to its performance.
 
 A fixed size LIFO array like PyFloatObject
 *free_list[PyFloat_MAXFREELIST] increased the speed slightly. IMHO a
 value of about 80-200 floats and ints is realistic for most apps. More
 objects in the free lists could keep too many pymalloced areas occupied.
 
 I tested the updated patch you added to issue 2039.  With the int
 freelist set to 500 and the float freelist set to 100, it's about the same
 as the no-freelist version for my tests, but PyBench shows the simple
 float arithmetic to be about 10% better.
 
 I'm inclined to set the int LIFO a bit larger than you suggest, simply as
 ints are so commonly used - hence the value of 500 I used.  Floats are
 much less common by comparison.  Even an int LIFO of 500 is only going to
 tie up ~8kB on a 32bit box (~16kB on 64bit), which is insignificant
 enough that I can't see a need for a compaction routine.
 
 A 200 entry float LIFO would only account for ~4kB on 32bit (~8kB on
 64bit).

It is difficult to tell what good limits for free lists should
be. This depends a lot on the application focus, e.g. a financial
application is going to need lots of floats, while a word
processor or parser will need more integers.

I think the main difference between the current free list
implementation and Christian's patches is that the current
implementation bypasses pymalloc altogether and allocates
the objects directly using the system malloc().

The objects in the free list then cannot artificially keep
pymalloc pools alive.

Furthermore, the current free list implementation works
by allocating 1k chunks of memory for more than just one
object whenever it finds that the free list is empty.

Christian's patches and your free list removal patch cause
all allocations to be done via pymalloc. Christian's free
list can also cause nearly empty pymalloc pools to stay
alive due to the use of a linked list rather than an
array of objects.

Finally (and I don't know if you've missed that), the integer
implementation uses sharing for small integers. In the current
implementation all integers from -5 up to (but not including) 257
are only ever allocated once and then reused whenever an integer
in this range is needed. The shared integers are not subject to any
of the extra free list handling or pymalloc overhead.

This results in a significant boost, since integers in this
range are *very* common and also causes the comparison between
integers and floats to become biased - floats don't have
this optimization.
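
The sharing can be observed from Python. This is a sketch that relies on
a CPython implementation detail (it is not a language guarantee), and the
helper name is_shared is made up for the illustration.

```python
def is_shared(n):
    """True if two independently built ints equal to n are one object."""
    a = int(str(n))   # built at run time, so no constant folding is involved
    b = int(str(n))
    return a is b

# Values inside the shared range come back as the very same object.
for n in (-5, 0, 256):
    print(n, is_shared(n))
```

For values outside the shared range the same check typically reports two
distinct objects, which is where the free list or pymalloc comes into play.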

I still think that dropping the free lists can be worthwhile,
but pymalloc would need to get further optimizations to give
better performance for often requested size classes (the 16 byte
class on 32bit architectures, the 24 byte class on 64bit
architectures).

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] int/float freelists vs pymalloc

2008-02-13 Thread M.-A. Lemburg
On 2008-02-13 12:56, Andrew MacIntyre wrote:
 I'm not that interested in debating the detail of exactly how big the
 prospective LIFO freelists are - I just want to see the situation
 resolved with maximum utilisation of memory for minimum performance 
 penalty.  To that end, +1 from me for accepting your revised patch 
 against issue 2039.  In addition, unless there are other reasons to
 retain it, I would be suggesting that the freelist compaction
 infrastructure you introduced in r60567 be removed for lack of practical 
 utility (assuming acceptance of your patch).

If we're down to voting, here's my vote:

+1 on removing the freelists from ints and floats, but not the
   small int sharing optimization

+1 on focusing on improving pymalloc to handle int and float
   object allocations even better

-1 on changing the freelist implementations to use pymalloc for
   allocation of the freelist members instead of malloc, since
   this would potentially lead to pools (and arenas) being held alive
   by just a few objects - in the worst case a whole arena (256kB)
   for just one int object (12 bytes on 32bit platforms).

Eventually, all freelists should be removed, unless there's a
significant performance loss.
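
The worst case in the -1 vote can be put into numbers. The sketch below
assumes the 256kB arena and 4096 byte pool sizes quoted in this thread
and a 12 byte int object on 32bit platforms (three 4 byte fields); the
constant names are made up for the illustration.

```python
ARENA_SIZE = 256 * 1024   # bytes, arena size quoted in the thread
POOL_SIZE = 4096          # bytes, pool size quoted in the thread
INT_SIZE_32BIT = 12       # bytes, assumed: type pointer + refcount + value

pools_per_arena = ARENA_SIZE // POOL_SIZE
overhead_factor = ARENA_SIZE / INT_SIZE_32BIT

print(pools_per_arena)         # pools that one arena is carved into
print(round(overhead_factor))  # bytes held per byte of live object data
```

Under these assumptions a single surviving int pins 64 pools, i.e. memory
on the order of twenty thousand times its own size.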

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] int/float freelists vs pymalloc

2008-02-08 Thread M.-A. Lemburg
On 2008-02-08 08:21, Martin v. Löwis wrote:
 One of the hopes of having a custom allocator for Python was to be
 able to get rid of all free lists. For some reason that never happened.
 Not sure why. People were probably too busy with adding new
 features to the language at the time ;-)
 
 Probably not. It's more that the free lists still outperformed pymalloc.
 
 Something you could try to make PyMalloc perform better for the builtin
 types is to check the actual size of the allocated PyObjects and then
 make sure that PyMalloc uses arenas large enough to hold a good quantity
 of them, e.g. it's possible that the float types fall into the same
 arena as some other type and thus don't have enough room to use
 as free list.
 
 I don't think any improvements can be gained here. PyMalloc carves
 out pools of 4096 bytes from an arena when it runs out of blocks
 for a certain size class, and then keeps a linked list of pools of
 the same size class. So when many float objects get allocated,
 you'll have a lot of pools of the float type's size class.
 IOW, PyMalloc has always enough room.

Well, yes, it doesn't run out of memory, but if pymalloc needs
to allocate lots of objects of the same size, then performance
degrades due to the management overhead involved for checking
the free pools as well as creating new arenas as needed.

To reduce this overhead, it may be a good idea to preallocate
pools for common sizes and make sure they don't drop under a
certain threshold.

Here's a list of a few object sizes in bytes for Python 2.5
on an AMD64 machine:

>>> import mx.Tools
>>> mx.Tools.sizeof(int(0))
24
>>> mx.Tools.sizeof(float(0))
24

8-bit strings are var objects:

>>> mx.Tools.sizeof(str(''))
40
>>> mx.Tools.sizeof(str('a'))
41

Unicode objects use an external buffer:

>>> mx.Tools.sizeof(unicode(''))
48
>>> mx.Tools.sizeof(unicode('a'))
48

Lists do as well:

>>> mx.Tools.sizeof(list())
40
>>> mx.Tools.sizeof(list([1,2,3]))
40

Tuples are var objects:

>>> mx.Tools.sizeof(tuple())
24
>>> mx.Tools.sizeof(tuple([1,2,3]))
48

Old style classes:

>>> class C: pass
...
>>> mx.Tools.sizeof(C)
64

New style classes are a lot heavier:

>>> class D(object): pass
...
>>> mx.Tools.sizeof(D)
848

>>> mx.Tools.sizeof(type(2))
848


As you can see, integers and floats fall into the same pymalloc size
class. What's strange in Andrew's result is that both integers
and floats use the same free list technique and fall into the same
pymalloc size class, yet the results are different.

The only difference that's apparent is that small integers are
shared, so depending on the data set used for the test, fewer
calls to pymalloc or the free list are made.
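
How the sizes listed above map to pymalloc size classes can be sketched
as follows. The sketch assumes 8 byte alignment and a 256 byte small
request threshold (the values I believe pymalloc used at the time) and
ignores any GC header in front of the object; size_class is a made-up
helper name.

```python
ALIGNMENT = 8                   # assumed pymalloc block alignment
SMALL_REQUEST_THRESHOLD = 256   # assumed: larger requests bypass pymalloc

def size_class(nbytes):
    """Return the block size serving nbytes, or None if it is too large."""
    if nbytes > SMALL_REQUEST_THRESHOLD:
        return None
    return (nbytes + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT

# Object sizes taken from the AMD64 measurements listed in this message.
sizes = {"int": 24, "float": 24, "str('a')": 41, "list": 40, "type": 848}
for name, nbytes in sizes.items():
    print(name, nbytes, size_class(nbytes))
```

Ints and floats both land in the 24 byte class, while the 848 byte type
objects would not be served from pymalloc pools at all under these
assumptions.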

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] int/float freelists vs pymalloc

2008-02-08 Thread M.-A. Lemburg
On 2008-02-08 19:28, Christian Heimes wrote:
 In addition to the pure performance aspect, there is the issue of memory
 utilisation.  The current trunk code running the int test case in my
 original post peaks at 151MB according to top on my FreeBSD box, dropping
 back to about 62MB after the dict is destroyed (without a compaction).
 The same script running on the no-freelist build of the interpreter peaks
 at 119MB, with a minimum of around 57MB.
 
 I wonder why the free list has such a huge impact on memory usage. Int
 objects are small (4 byte pointer to type, 4 byte Py_ssize_t and 4 byte
 value). A thousand int objects should consume less than 20kB including
 overhead and padding.

The free lists keep parts of the pymalloc pools alive.
Since these are only returned to the OS if the whole pool is
unused, a single object could keep 4k of memory associated
with the process.

I suppose that the remaining few MBs shown by the OS are not
really used by the process, but simply kept associated with
the process by the OS in case it quickly needs more memory.

In order to be sure about the true memory usage, you'd have
to force the OS to grab all available memory, e.g. by running
a huge process right next to the one you're testing.

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] int/float freelists vs pymalloc

2008-02-07 Thread M.-A. Lemburg
On 2008-02-07 14:09, Andrew MacIntyre wrote:
 Probably in response to the same stimulus as Christian it occurred to me
 that the freelist approach had been adopted long before PyMalloc was
 enabled as standard (in 2.3), and that much of the performance gains
 between 2.2 and 2.3 were in fact due to PyMalloc.

One of the hopes of having a custom allocator for Python was to be
able to get rid of all free lists. For some reason that never happened.
Not sure why. People were probably too busy with adding new
features to the language at the time ;-)

Something you could try to make PyMalloc perform better for the builtin
types is to check the actual size of the allocated PyObjects and then
make sure that PyMalloc uses arenas large enough to hold a good quantity
of them, e.g. it's possible that the float types fall into the same
arena as some other type and thus don't have enough room to use
as free list.

-- 
Marc-Andre Lemburg
eGenix.com


[Python-Dev] Limit free list of method and builtin function objects (was: [Python-checkins] r60614 - in python/trunk: Misc/NEWS Objects/classobject.c Objects/methodobject.c)

2008-02-06 Thread M.-A. Lemburg
Hi Christian,

could you explain how you came up with the 256 entry limit ?
It appears to be rather low and somewhat arbitrary.

I understand that some limit is required, but since these
objects get created a lot (e.g. for bound methods), setting the
limit too low will significantly slow down the interpreter.

BTW: What does pybench have to say about this patch ?

To get an idea of how many objects are typically part of the
free list, I'd suggest running an application such as Zope for
a while and then checking the maximum numfree value.
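
The kind of measurement meant here can be modelled in a few lines of
Python. This is a toy model, not the C code from the checkin: the class
and attribute names are made up, and the workload is an arbitrary burst
of 1000 objects.

```python
class BoundedFreeList:
    """Toy model of a capped LIFO free list (names are illustrative)."""

    def __init__(self, maxfree):
        self.maxfree = maxfree
        self.items = []
        self.max_seen = 0     # high-water mark of the list length
        self.dropped = 0      # objects released while the list was full

    def release(self, obj):
        if len(self.items) < self.maxfree:
            self.items.append(obj)
            self.max_seen = max(self.max_seen, len(self.items))
        else:
            self.dropped += 1  # the C code would free the memory here

    def acquire(self):
        # Reuse a cached object if there is one, else "allocate" a new one.
        return self.items.pop() if self.items else object()

fl = BoundedFreeList(maxfree=256)
live = [fl.acquire() for _ in range(1000)]   # burst of allocations
for obj in live:
    fl.release(obj)                          # burst of deallocations
print(fl.max_seen, fl.dropped)
```

With a cap in place the high-water mark saturates at the cap, so the
count of dropped objects is the better hint that the limit is too low
for a given workload.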

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com


On 2008-02-06 13:44, christian.heimes wrote:
 Author: christian.heimes
 Date: Wed Feb  6 13:44:34 2008
 New Revision: 60614
 
 Modified:
    python/trunk/Misc/NEWS
    python/trunk/Objects/classobject.c
    python/trunk/Objects/methodobject.c
 Log:
 Limit free list of method and builtin function objects to 256 entries each.
 
 Modified: python/trunk/Misc/NEWS
 ==============================================================================
 --- python/trunk/Misc/NEWS   (original)
 +++ python/trunk/Misc/NEWS   Wed Feb  6 13:44:34 2008
 @@ -12,6 +12,9 @@
  Core and builtins
  -----------------
  
 +- Limit free list of method and builtin function objects to 256 entries
 +  each.
 +
  - Patch #1953: Added ``sys._compact_freelists()`` and the C API functions
    ``PyInt_CompactFreeList`` and ``PyFloat_CompactFreeList``
    to compact the internal free lists of pre-allocted ints and floats.
 
 Modified: python/trunk/Objects/classobject.c
 ==============================================================================
 --- python/trunk/Objects/classobject.c   (original)
 +++ python/trunk/Objects/classobject.c   Wed Feb  6 13:44:34 2008
 @@ -4,10 +4,16 @@
  #include "Python.h"
  #include "structmember.h"
  
 +/* Free list for method objects to safe malloc/free overhead
 + * The im_self element is used to chain the elements.
 + */
 +static PyMethodObject *free_list;
 +static int numfree = 0;
 +#define MAXFREELIST 256
 +
  #define TP_DESCR_GET(t) \
      (PyType_HasFeature(t, Py_TPFLAGS_HAVE_CLASS) ? (t)->tp_descr_get : NULL)
  
 -
  /* Forward */
  static PyObject *class_lookup(PyClassObject *, PyObject *,
                                PyClassObject **);
 @@ -2193,8 +2199,6 @@
     In case (b), im_self is NULL
  */
  
 -static PyMethodObject *free_list;
 -
  PyObject *
  PyMethod_New(PyObject *func, PyObject *self, PyObject *klass)
  {
 @@ -2207,6 +2211,7 @@
          if (im != NULL) {
                  free_list = (PyMethodObject *)(im->im_self);
                  PyObject_INIT(im, &PyMethod_Type);
 +                numfree--;
          }
          else {
                  im = PyObject_GC_New(PyMethodObject, &PyMethod_Type);
 @@ -2332,8 +2337,14 @@
          Py_DECREF(im->im_func);
          Py_XDECREF(im->im_self);
          Py_XDECREF(im->im_class);
 -        im->im_self = (PyObject *)free_list;
 -        free_list = im;
 +        if (numfree < MAXFREELIST) {
 +                im->im_self = (PyObject *)free_list;
 +                free_list = im;
 +                numfree++;
 +        }
 +        else {
 +                PyObject_GC_Del(im);
 +        }
  }
  
  static int
 @@ -2620,5 +2631,7 @@
                  PyMethodObject *im = free_list;
                  free_list = (PyMethodObject *)(im->im_self);
                  PyObject_GC_Del(im);
 +                numfree--;
          }
 +        assert(numfree == 0);
  }
 
 Modified: python/trunk/Objects/methodobject.c
 ==============================================================================
 --- python/trunk/Objects/methodobject.c   (original)
 +++ python/trunk/Objects/methodobject.c   Wed Feb  6 13:44:34 2008
 @@ -4,7 +4,12 @@
  #include "Python.h"
  #include "structmember.h"
  
 +/* Free list for method objects to safe malloc/free overhead
 + * The m_self element is used to chain the objects.
 + */
  static PyCFunctionObject *free_list = NULL;
 +static int numfree = 0;
 +#define MAXFREELIST 256
  
  PyObject *
  PyCFunction_NewEx(PyMethodDef *ml, PyObject *self, PyObject *module)
 @@ -14,6 +19,7 @@
          if (op != NULL) {
                  free_list = (PyCFunctionObject *)(op->m_self);
                  PyObject_INIT(op, &PyCFunction_Type);
 +                numfree--;
          }
          else {
                  op = PyObject_GC_New(PyCFunctionObject, &PyCFunction_Type);
 @@ -125,8 +131,14 @@
          _PyObject_GC_UNTRACK(m);
          Py_XDECREF(m->m_self);
   

Re: [Python-Dev] trunc()

2008-01-28 Thread M.-A. Lemburg
On 2008-01-27 08:14, Raymond Hettinger wrote:
 . You may disagree, but that doesn't make it nuts.
 
 Too many thoughts compressed into one adjective ;-)
 
 Deprecating int(float)-->int may not be nuts, but it is disruptive.
 
 Having both trunc() and int() in Py2.6 may not be nuts, but it is
 duplicative and confusing.
 
 The original impetus for facilitating a new Real type being able to
 trunc() into a new Integral type may not be nuts, but the use case
 seems far fetched (we've never had a feature request for it -- the
 notion was born entirely out of numeric tower considerations).
 
 The idea that programmers are confused by int(3.7)-->3 may not be nuts,
 but it doesn't match any experience I've had with any programmer, ever.
 
 The idea that trunc() is beneficial may not be nuts, but it is certainly
 questionable.
 
 In short, the idea may not be nuts, but I think it is legitimate to
 suggest that it is unnecessary and that it will do more harm than good.

All this reminds me a lot of discussions we've had when we
needed a new way to spell out string.join().

In the end, we ended up adding a method to strings (thanks to
Tim Peters, IIRC) instead of adding a builtin join().

Since all of the suggested builtins are only meant to work on
floats, why not simply add methods for them to the float object ?!

E.g.

x = 3.141
print x.trunc(), x.floor(), x.ceil()

etc.

This approach also makes it possible to write types or classes
that expose the same API without having to resort to new special
methods (we have too many of those already).

Please consider that type constructors have a different scope
than helper functions. Helper functions should only be made builtins
if they are really, really useful and often needed. If they don't
meet these criteria, they are better off in a separate module.
I don't see any of the suggested helper functions meeting these
criteria, and we already have math.floor() and math.ceil().
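
For reference, here is how the operations under discussion differ on
negative values, written against the math module (math.trunc() is what
2.6 ended up shipping). The Real class is a hypothetical sketch of the
method-based spelling suggested above, not an existing type.

```python
import math

class Real(float):
    """Hypothetical float subclass exposing the proposed methods."""

    def trunc(self):
        return math.trunc(self)   # round toward zero

    def floor(self):
        return math.floor(self)   # round toward negative infinity

    def ceil(self):
        return math.ceil(self)    # round toward positive infinity

for x in (Real(3.141), Real(-3.141)):
    # int() agrees with trunc(); floor() and ceil() differ for negatives.
    print(x.trunc(), x.floor(), x.ceil(), int(x))
```

Any other numeric class could offer the same three methods without new
special methods being added to the language.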

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] trunc()

2008-01-25 Thread M.-A. Lemburg
On 2008-01-25 21:26, Steve Holden wrote:
 Antoine Pitrou wrote:
 Raymond Hettinger python at rcn.com writes:
 Go ask a dozen people if they are surprised that int(3.7) returns 3.
 No one will be surprised (even folks who just use Excel or VB). It
 is foolhardy to be a purist and rage against the existing art:

 Well, for what it's worth, here are MySQL's own two cents:

 mysql> create table t (a int);
 Query OK, 0 rows affected (0.00 sec)

 mysql> insert t (a) values (1.4), (1.6), (-1.6), (-1.4);
 Query OK, 4 rows affected (0.00 sec)
 Records: 4  Duplicates: 0  Warnings: 0

 mysql> select * from t;
 +------+
 | a    |
 +------+
 |    1 |
 |    2 |
 |   -2 |
 |   -1 |
 +------+
 4 rows in set (0.00 sec)

 Two points. Firstly, regarding MySQL as authoritative from a standards 
 point of view is bound to lead to trouble, since they have always played 
 fast and loose with the standard for reasons (I suspect) of 
 implementation convenience.
 
 Second, that example isn't making use of the INT() function. I was going 
 to show you result of taking the INT() of a float column containing your 
 test values. That was when I found out that MySQL (5.0.41, anyway) 
 doesn't implement the INT() function. What was I saying about standards?

FWIW, here's what IBM has to say about this:

http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.admin.doc/doc/r814.htm

"If the argument is a numeric-expression, the result is the same number
that would occur if the argument were assigned to a large integer column
or variable. If the whole part of the argument is not within the range
of integers, an error occurs. The decimal part of the argument is
truncated if present."

AFAIK, the INTEGER() function is not part of the SQL standard, at
least not of SQL92:

http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt

The way to convert a value to an integer is by casting it to
one, e.g. CAST (X AS INTEGER). The INT() function is basically
a short-cut for this.
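
The CAST behaviour is easy to try from Python. Note the assumption: the
sketch uses SQLite through the standard sqlite3 module, simply because
it ships with Python; it is neither DB2 nor MySQL, and other databases
may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
values = (1.4, 1.6, -1.6, -1.4)

# CAST each REAL value to INTEGER inside the database engine.
rows = [
    conn.execute("SELECT CAST(? AS INTEGER)", (v,)).fetchone()[0]
    for v in values
]
conn.close()

print(rows)                        # CAST results from SQLite
print([round(v) for v in values])  # rounding to nearest, for comparison
```

In SQLite the cast truncates toward zero, whereas the second line rounds
to the nearest integer, as the implicit conversion on insert did in the
MySQL session quoted above.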

Regards,
-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] PEP: per user site-packages directory

2008-01-22 Thread M.-A. Lemburg
I don't really understand what all this has to do with per user
site-packages.

Note that the motivation for having per user site-packages
was to:

 * address a common request by Python extension package users,

 * get rid of the hackery done by setuptools in order
   to provide this.

As such the PEP can also be seen as an effort to enable code
cleanup *before* adding e.g. pkg_resources to the stdlib.
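
For readers who want to see what the feature looks like from Python, the
site module exposes the per user locations as the feature eventually
shipped (PEP 370). This is a minimal sketch; the paths printed depend on
the platform and on how the interpreter was started.

```python
import site

# Whether the per user directory is being added to sys.path.
print(site.ENABLE_USER_SITE)

# Base directory for per user installs and the site-packages below it.
print(site.getuserbase())
print(site.getusersitepackages())
```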

Cheers,
-- 
Marc-Andre Lemburg
eGenix.com


On 2008-01-21 16:06, Nick Coghlan wrote:
 Steve Holden wrote:
 Christian Heimes wrote:
 Steve Holden wrote:
 Maybe once we get easy_install as a part of the core (so there's no need
 to find and run ez_setup.py to start with) things will start to improve.
 This is an issue the whole developer community needs to take seriously
 if we are interested in increasing take-up.
 setuptools and easy_install won't be included in Python 2.6 and 3.0:
 http://www.python.org/dev/peps/pep-0365/

 Yes, and yet another release (two releases) will go out without easy 
 access to the functionality in Pypi. PEP 365 is a good start, but Pypi 
 loses much of its point until new Python users get access to it out of 
 the box. I also appreciate that resource limitations are standing in 
 the way of setuptools' inclusion (is there something I can do about 
 that?) Just to hammer the point home, however ...
 
 Have another look at the rationale given in PEP 365 - it isn't the 
 resourcing to do the work that's a problem, but the relatively slow 
 release cycle of the core.
 
 By including pkg_resources in the core (with the addition of access to 
 pure Python modules and packages on PyPI), we would get a simple, stable 
 base for Python packaging to work from, and put users a single standard 
 command away from the more advanced (but also more volatile) features of 
 easy_install and friends.
 
 Cheers,
 Nick.
 


Re: [Python-Dev] #! magic

2008-01-22 Thread M.-A. Lemburg
On 2008-01-20 19:30, Christian Heimes wrote:
 Yet another python executable could solve the issue, named pythons as
 python secure.
 
 /*
gcc -DNDEBUG -g -O2 -Wall -Wstrict-prototypes -IInclude -I. -pthread
-Xlinker -lpthread -ldl  -lutil -lm -export-dynamic -o pythons2.6
 
pythons.c libpython2.6.a
  */
 
 #include "Python.h"
 
 int main(int argc, char **argv) {
 /* disable some possible harmful features */
 Py_IgnoreEnvironmentFlag++;
 Py_NoUserSiteDirectory++;
 Py_InteractiveFlag -= INT_MAX;
 Py_InspectFlag -= INT_MAX;
 
 return Py_Main(argc, argv);
 }
 
 $ ./pythons2.6
 Python 2.6a0 (:59956M, Jan 14 2008, 22:09:17)
 [GCC 4.2.1 (Ubuntu 4.2.1-5ubuntu4)] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import sys
 >>> sys.flags
 sys.flags(debug=0, py3k_warning=0, division_warning=0, division_new=0,
 inspect=-2147483647, interactive=-2147483647, optimize=0,
 dont_write_bytecode=0, no_user_site=1, no_site=0, ingnore_environment=1,

Is "ingnore_environment" above a copy-paste error or a typo in the code?

 tabcheck=0, verbose=0, unicode=0)

To make this even more secure, you'd have to package this up
together with a copy of the stdlib, much like mxCGIPython does
(or did... I have to revive that project at some point :-):

http://www.egenix.com/www2002/python/mxCGIPython.html

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP: per user site-packages directory

2008-01-14 Thread M.-A. Lemburg
On 2008-01-14 22:23, Christian Heimes wrote:
 The PEP is now available at http://www.python.org/dev/peps/pep-0370/.
 The reference implementation is in svn, too:
 svn+ssh://[EMAIL PROTECTED]/sandbox/trunk/pep370

Thanks for the effort, Christian. Much appreciated.

Regarding the recent ~/bin vs. ~/.local/bin discussion:

I usually maintain my ~/bin directories by hand and wouldn't want
any application to install things in there automatically (and so far
I haven't been using any application that does), so I'd be
in favor of the ~/.local/bin dir.

Note that users typically don't know which scripts are made
available by a Python application and it's not always clear
what functionality they provide, whether they can be trusted,
include bugs, need to be run with extra care, etc, so IMHO
making it a little harder to run them by accident is well
warranted.

Thanks again,
-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Backporting PEP 3101 to 2.6

2008-01-10 Thread M.-A. Lemburg
On 2008-01-10 14:31, Eric Smith wrote:
 (I'm posting to python-dev, because this isn't strictly 3.0 related.
 Hopefully most people read it in addition to python-3000).
 
 I'm working on backporting the changes I made for PEP 3101 (Advanced
 String Formatting) to the trunk, in order to meet the pre-PyCon release
 date for 2.6a1.
 
 I have a few questions about how I should handle str/unicode.  3.0 was
 pretty easy, because everything was unicode.

Since this is a new feature, why bother with strings at all
(even in 2.6) ?

Use Unicode throughout and be done with it.

 1: How should the builtin format() work?  It takes 2 parameters, an
 object o and a string s, and returns o.__format__(s).  If s is None, it
 returns o.__format__(empty_string).  In 3.0, the empty string is of
 course unicode.  For 2.6, should I use u'' or ''?
 
 
 2: In 3.0, object.__format__() is essentially this:
 
 class object:
     def __format__(self, format_spec):
         return format(str(self), format_spec)
 
 In 2.6, I assume it should be the equivalent of:
 
 class object:
     def __format__(self, format_spec):
         if isinstance(format_spec, str):
             return format(str(self), format_spec)
         elif isinstance(format_spec, unicode):
             return format(unicode(self), format_spec)
         else:
             raise TypeError("format_spec must be str or unicode")
 
  Does that seem right?
 
 
 3: Every overridden __format__() method is going to have to check for
 string or unicode, just like object.__format__() does, and return either a
 string or unicode object, appropriately.  I don't see any way around
 this, but I'd like to hear any thoughts.  I guess there aren't all that
 many __format__ methods that will be implemented, so this might not be a
 big burden.  I'll of course implement the built in ones.
 
 Thanks in advance for any insights.
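
For illustration, here is a minimal sketch of an overridden __format__()
when only one text type is in play, as in 3.0 (or in 2.6 if Unicode is
used throughout). The Money class and its currency suffix are made up for
this example; the only thing taken from the discussion above is that
format(o, s) ends up calling o.__format__(s).

```python
class Money:
    """Toy value type with its own __format__ (hypothetical example)."""

    def __init__(self, amount, currency="EUR"):
        self.amount = amount
        self.currency = currency

    def __str__(self):
        return "%s %s" % (self.amount, self.currency)

    def __format__(self, format_spec):
        # Empty spec: fall back to str(self), as object.__format__ does.
        if not format_spec:
            return str(self)
        # Otherwise hand the spec to the number's own formatter and
        # append the currency; no str/unicode type check is needed.
        return format(self.amount, format_spec) + " " + self.currency
```

With this, format(Money(15.0), ".2f") gives '15.00 EUR' and
format(Money(15.0)) gives '15.0 EUR'.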

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] pkgutil, pkg_resource and Python 3.0 name space packages

2008-01-07 Thread M.-A. Lemburg
On 2008-01-07 14:57, Fred Drake wrote:
 On Jan 7, 2008, at 7:48 AM, M.-A. Lemburg wrote:
 Next, we add a per-user site-packages directory to the standard
 sys.path, and then we could get rid of most of the setuptools
 import and sys.path hackery, making it a lot cleaner.
 
 
 PYTHONPATH already provides this functionality.  I see no need to
 duplicate that.

Agreed, but one of the main arguments for all the .pth file hackery in
setuptools is that having to change PYTHONPATH in order to enable
user installations of packages is too hard for the typical user.

We could easily resolve that issue, if we add a per-user site-packages
dir to sys.path in site.py (this is already done for Macs).
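
A minimal sketch of the idea, assuming a Python whose site module already
provides site.getusersitepackages() (older versions would have to compute
the directory themselves). The function name enable_user_site is made up
for this example.

```python
import os
import site
import sys


def enable_user_site(user_site=None):
    """Append a per-user site-packages directory to sys.path.

    site.addsitedir() also processes any .pth files found in the
    directory, so it is treated like the system site-packages.
    """
    if user_site is None:
        # Assumption: the running Python knows its per-user directory.
        user_site = site.getusersitepackages()
    site.addsitedir(user_site)
    return os.path.abspath(user_site)
```

The user then only has to drop packages into that directory; no change to
PYTHONPATH is needed.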

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] pkgutil, pkg_resource and Python 3.0 name space packages

2008-01-07 Thread M.-A. Lemburg
On 2008-01-07 17:24, Barry Warsaw wrote:
 On Jan 7, 2008, at 10:12 AM, Guido van Rossum wrote:
 
 On Jan 7, 2008 6:32 AM, Barry Warsaw [EMAIL PROTECTED] wrote:
 On Jan 7, 2008, at 9:01 AM, M.-A. Lemburg wrote:
 We could easily resolve that issue, if we add a per-user site-packages
 dir to sys.path in site.py (this is already done for Macs).

 +1.  I've advocated that for years.
 
 I'm not sure what this buys given that you can do this using
 PYTHONPATH anyway, but because of that I also can't be against it. +0
 from me. Patches for 2.6 gratefully accepted.
 
 I think it's PEP-worthy too, just so that the semantics get nailed
 down.  Here's a strawman proto-quasi-pre-PEP.
 
 Python automatically adds ~/.python/site-packages to sys.path; this is
 added /before/ the system site-packages file.  An open question is
 whether it needs to go at the front of the list.  It should definitely
 be searched before the system site-packages.
 
 Python treats ~/.python/site-packages the same as the system
 site-packages, w.r.t. .pth files, etc.
 
 Open question: should we add yet another environment variable to control
 this?  It's pretty typical for apps to expose such a thing so that the
 base directory (e.g. ~/.python) can be moved.

I'd suggest to make the ~/.python part configurable by an
env var, e.g. PYTHONRESOURCES.

Perhaps we could use that directory for other Python-related
resources as well, e.g. an optional sys.path lookup cache (pickled
dictionary of known package/module file locations to reduce Python
startup time).
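
To make the cache idea a bit more concrete, here is a rough sketch under
stated assumptions: it only indexes top-level .py modules and packages
with an __init__.py, and the function names and the cache file layout are
invented for this example. Nothing like this exists in the interpreter.

```python
import os
import pickle


def build_module_index(paths):
    """Map top-level module/package names to the file or dir providing them."""
    index = {}
    for entry in paths:
        if not os.path.isdir(entry):
            continue
        for name in sorted(os.listdir(entry)):
            full = os.path.join(entry, name)
            if name.endswith(".py") and os.path.isfile(full):
                modname = name[:-3]
            elif os.path.isfile(os.path.join(full, "__init__.py")):
                modname = name
            else:
                continue
            # Earlier path entries win, mirroring sys.path search order.
            index.setdefault(modname, full)
    return index


def save_index(index, filename):
    """Pickle the index, e.g. into the per-user resource directory."""
    with open(filename, "wb") as f:
        pickle.dump(index, f, pickle.HIGHEST_PROTOCOL)


def load_index(filename):
    """Load a previously pickled index."""
    with open(filename, "rb") as f:
        return pickle.load(f)
```

A real implementation would also have to invalidate the cache when a
directory changes, which this sketch ignores.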

 I think that's all that's needed.  It would make playing with
 easy_install/setuptools nicer to have this.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Memory benchmarking?

2007-11-29 Thread M.-A. Lemburg
On 2007-11-29 11:52, Titus Brown wrote:
 Hi all,
 
 is there a good, or standard memory benchmarking system for Python?
 pybench doesn't return significantly different results when Python 2.6
 is compiled with pymalloc and without pymalloc.  Thinking on it, I'm not
 too surprised -- pybench probably benchmarks a lot of stuff -- but some
 guidance on how/whether to benchmark different memory allocation schemes
 would be welcome.

pybench focuses on runtime performance, not memory usage. Its
way of creating and deleting objects is also highly non-standard
when compared to typical use of Python in real life applications.

It's also rather difficult to benchmark memory allocation, since
most implementations work with some sort of pre-allocation,
buffer pools or free lists.

If you want to use a similar approach as pybench does, ie. benchmark
small parts of the interpreter instead of generating some grand
total, then you'd probably have to do this by spawning a separate
process per test.
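
A minimal sketch of the process-per-test approach, assuming a Python that
ships the tracemalloc module. It reports the peak number of bytes
allocated through Python's allocators, not the process size as seen by
the OS, so pre-allocation and free lists at the C level stay invisible.
The names CHILD and peak_memory are made up for this example.

```python
import subprocess
import sys

# Code run in each child process: execute the snippet under tracemalloc
# and print the peak number of traced bytes as the last output line.
CHILD = """
import sys, tracemalloc
tracemalloc.start()
exec(sys.argv[1], {})
print(tracemalloc.get_traced_memory()[1])
"""


def peak_memory(snippet):
    """Run `snippet` in a fresh interpreter and return its peak traced bytes."""
    out = subprocess.check_output([sys.executable, "-c", CHILD, snippet])
    return int(out.decode("ascii").strip().splitlines()[-1])
```

Each test starts from a clean interpreter, so buffer pools filled by one
test cannot distort the numbers of the next one.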

 refs:
 
 http://code.google.com/p/google-highly-open-participation-psf/issues/detail?id=105&colspec=ID%20Status%20Summary
 
 http://evanjones.ca/memoryallocator/
 
 http://www.advogato.org/person/wingo/diary/225.html

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Build Notes for building trunk with Visual Studio 2008 Express Edition

2007-11-24 Thread M.-A. Lemburg
On 2007-11-23 23:12, Paul Moore wrote:
 On 23/11/2007, Christian Heimes [EMAIL PROTECTED] wrote:
 bsddb is automatically built by a build step. But you have to convert
 the project files in build_win32 to VS 2008 first. Simply open the
 solution file and let VS convert the projects.
 
 VS 2008 Express doesn't have a devenv command, so the pre-link step
 doesn't work. You need to open the bsddb project file, and build
 db_static by hand. For a debug Python, you need the Debug
 configuration, for a release Python you need the Release
 configuration. Beware - the default config is Debug_ASCII which is not
 checked by the pre-link step.
 
 So, from a checkout of Python, plus the various svn externals:
 
 - download nasm, install it somewhere on your PATH, and copy nasm.exe
 to nasmw.exe (Why did you use nasmw.exe rather than nasm.exe? Is there
 a difference in the version you have?)

The OpenSSL build process still uses the old nasmw.exe name
(the build instructions there are for the old NASM version,
but it also works with the latest NASM release).

The NASM project has recently changed the name of the executable
to nasm.exe.

 - Open the bsddb solution file, and build debug and release versions
 of db_static
 - Open the Python pcbuild solution file, and build the solution.
 
 You'll get a total of 2 failures and 18 successes. Of the failures,
 one (_sqlite3) is not actually fatal (the pre-link step fails, and
 that only the first time), and the module is actually built correctly.
 The other is _tkinter, which isn't sorted out yet.
 
 You can then run the tests with rt.bat. If you have an openssl.exe on
 your path, test_socket_ssl may hang. Otherwise, everything should
 pass, apart from test_tcl. (Actually, there's a failure in
 test_doctest right now, seems to have come in with r59137, but I don't
 have time to diagnose right now).
 
 This is the case for both trunk and py3k (ignoring genuine test failures).

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Build Notes for building trunk with Visual Studio 2008 Express Edition

2007-11-23 Thread M.-A. Lemburg
On 2007-11-23 16:59, Christian Heimes wrote:
 Paul Moore wrote:
 _ssl
 

 Christian has been making changes to allow this to build without Perl,
 so I gave it a try. I used openssl 0.9.8g, which I extracted to the
 build directory (I noticed afterwards that this is the same version as
 in Python svn, so I could have used the svn external!)

 I needed to download nasm (nasm.sf.net) version 2.00rc1, and rename
 nasm.exe to nasmw.exe and put it on my PATH.

 Build succeeded, no issues.
 
 You still need Perl if you are using an official download of openssl.
 I've added the pre-build assembly and makefiles in the svn external at
 svn.python.org

Why not include the prebuilt libraries of all external libs in SVN
as well ?

BTW: Are you including the patented algorithms in the standard
OpenSSL build or excluding them ?

The patented ones are RC5, IDEA and MDC2:

http://svn.python.org/view/external/openssl-0.9.8g/README

Here's a previous discussion:

http://mail.python.org/pipermail/python-dev/2006-August/068055.html

Here's what MediaCrypt has to say about requiring a license
for IDEA:

http://www.mediacrypt.com/_contents/20_support/204010_faq_bus.asp

Note that in the case of IDEA, any commercial use will require
getting a license to the patented algorithm first (costs start
at EUR 15 for a single use license).

I'd opt for not including these algorithms, as it's just
too easy for the user to overlook this license requirement.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] XML codec?

2007-11-12 Thread M.-A. Lemburg
On 2007-11-11 23:22, Martin v. Löwis wrote:
 First, XML-RPC is not the only mechanism using XML over a network
 connection. Second, you don't want to do this if you're dealing
 with several 100 MB of data just because you want to figure
 out the encoding.
 That's my original claim/question: what SPECIFIC application do
 you have in mind that transfers XML over a network and where you
 would want to have such a stream codec?
 XML-based web services used for business integration, e.g. based
 on ebXML.

 A common use case from our everyday consulting business is e.g.
 passing market and trading data to portfolio pricing web services.
 
 I still don't see the need for this feature from this example.
 First, in ebXML messaging, the messages are typically *not* large
 (i.e. much smaller than 100 MB). Furthermore, the typical processing
 of such a message would be to pass it directly to the XML parser,
 no need for the functionality under discussion.

I don't see the point in continuing this discussion. If you think
you know better, that's fine. Just please don't generalize this
to everyone else working with Python and XML.

 Right. However, I will remain opposed to adding this to the
 standard library until I see why one would absolutely need to
 have that. Not every piece of code that is useful in some
 application should be added to the standard library.
 Agreed, but the application space of web services is large
 enough to warrant this.
 
 If that was the case, wouldn't the existing Python web service
 libraries already include such a functionality?

No.

To finalize this:

We have a -1 from Martin and a +1 from Walter, Guido and myself.
Pretty clear vote if you ask me. I'd say we end the discussion here
and move on.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] XML codec?

2007-11-11 Thread M.-A. Lemburg
On 2007-11-11 14:51, Martin v. Löwis wrote:
 A non-seekable stream is not all that uncommon in network processing.
 Right. But what is the relationship to XML encoding autodetection?
 It pops up whenever you need to detect the encoding of the
 incoming XML data on the network connection, e.g. in XML RPC
 or data upload mechanisms.
 
 No, it doesn't. For XML-RPC, you pass the XML payload of the
 HTTP request to the XML parser, and it deals with the encoding.

First, XML-RPC is not the only mechanism using XML over a network
connection. Second, you don't want to do this if you're dealing
with several 100 MB of data just because you want to figure
out the encoding.

 It is also not always feasible to load all data into memory, so
 some form of buffering must be used.
 
 Again, I don't see the use case. For XML-RPC, it's very feasible
 and standard procedure to have the entire document in memory
 (in a processed form).

You may not see the use case, but that doesn't really mean
anything if the use cases exist in real life applications,
right ?!

 This approach is also needed if you want to stack stream codecs
 (not sure whether this is still possible in Py3, but that's how
 I designed them for Py2).
 
 The design of the Py2 codecs is fairly flawed, unfortunately.

Fortunately, this sounds like a fairly flawed argument to me ;-)

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] XML codec?

2007-11-11 Thread M.-A. Lemburg
On 2007-11-11 18:56, Martin v. Löwis wrote:
 First, XML-RPC is not the only mechanism using XML over a network
 connection. Second, you don't want to do this if you're dealing
 with several 100 MB of data just because you want to figure
 out the encoding.
 
 That's my original claim/question: what SPECIFIC application do
 you have in mind that transfers XML over a network and where you
 would want to have such a stream codec?

XML-based web services used for business integration, e.g. based
on ebXML.

A common use case from our everyday consulting business is e.g.
passing market and trading data to portfolio pricing web services.

 If I have 100MB of XML in a file, using the detection API, I do
 
   f = open(filename)
   s = f.read(100)
   while True:
       coding = xml.utils.detect_encoding(s)
       if coding is not undetermined:
           break
       s += f.read(100)
   f.close()
 
 Having the loop here is paranoia: in my application, I might be
 able to know that 100 bytes are sufficient to determine the encoding
 always.

Doing the detection with files is easy, but that was never
questioned.

 Again, I don't see the use case. For XML-RPC, it's very feasible
 and standard procedure to have the entire document in memory
 (in a processed form).
 You may not see the use case, but that doesn't really mean
 anything if the use cases exist in real life applications,
 right ?!
 
 Right. However, I will remain opposed to adding this to the
 standard library until I see why one would absolutely need to
 have that. Not every piece of code that is useful in some
 application should be added to the standard library.

Agreed, but the application space of web services is large
enough to warrant this.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] XML codec?

2007-11-09 Thread M.-A. Lemburg
On 2007-11-09 14:10, Walter Dörwald wrote:
 Martin v. Löwis wrote:
 Yes, an XML parser should be able to use UTF-8, UTF-16, UTF-32, etc
 codecs to do the encoding.  There's no need to create a magical
 mystery codec to pick out which though.
 So the code is good, if it is inside an XML parser, and it's bad if it
 is inside a codec?
 Exactly so. This functionality just *isn't* a codec - there is no
 encoding. Instead, it is an algorithm for *detecting* an encoding.
 
 And what do you do once you've detected the encoding? You decode the
 input, so why not combine both into an XML decoder?

FWIW: I'm +1 on adding such a codec.

It makes working with XML data a lot easier: you simply don't have to
bother with the encoding of the XML data anymore and can just let the
codec figure out the details. The XML parser can then work directly
on the Unicode data.
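
To illustrate what "letting the codec figure out the details" amounts to,
here is a simplified pure-Python sketch. It is not Walter's code and not
a real codec: detect_xml_encoding and decode_xml are hypothetical helpers
that only look at a byte order mark and at an ASCII-compatible XML
declaration, falling back to UTF-8.

```python
import codecs
import re

# UTF-32 BOMs must be tested before UTF-16, since the UTF-32-LE BOM
# starts with the same two bytes as the UTF-16-LE one.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

# Encoding name inside an XML declaration such as
# <?xml version="1.0" encoding="iso-8859-1"?>
_DECL = re.compile(
    rb"""<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def detect_xml_encoding(data, default="utf-8"):
    """Guess the encoding of XML bytes from a BOM or the declaration."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    return default


def decode_xml(data):
    """Decode XML bytes to text using the detected encoding."""
    return data.decode(detect_xml_encoding(data))
```

BOM-less UTF-16/UTF-32 input and EBCDIC are deliberately left out to keep
the sketch short.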

Whether it needs to be in C or not is another question (I would have
done this in Python since performance is not really an issue), but since
the code is already written, why not use it ?

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] XML codec?

2007-11-09 Thread M.-A. Lemburg
Martin v. Löwis wrote:
 It makes working with XML data a lot easier: you simply don't have to
 bother with the encoding of the XML data anymore and can just let the
 codec figure out the details. The XML parser can then work directly
 on the Unicode data.
 
 Having the functionality indeed makes things easier. However, I don't
 find
 
   s.decode(xml.detect_encoding(s))
 
 particularly more difficult than
 
   s.decode("xml-auto-detection")

Not really, but the codec has more control over what happens to
the stream, ie. it's easier to implement look-ahead in the codec
than to do the detection and then try to push the bytes back onto
the stream (which may or may not be possible depending on the
nature of the stream).

 Whether it needs to be in C or not is another question (I would have
 done this in Python since performance is not really an issue), but since
 the code is already written, why not use it ?
 
 It's a maintenance issue.

I'm sure Walter will do a great job in maintaining the code :-)

Regards,
-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] XML codec?

2007-11-09 Thread M.-A. Lemburg
Martin v. Löwis wrote:
 Not really, but the codec has more control over what happens to
 the stream, ie. it's easier to implement look-ahead in the codec
 than to do the detection and then try to push the bytes back onto
 the stream (which may or may not be possible depending on the
 nature of the stream).
 
 YAGNI.

A non-seekable stream is not all that uncommon in network processing.
I usually end up either reading the complete data into memory
or doing the needed buffering by hand.
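
The "buffering by hand" looks roughly like the sketch below: the bytes
read for detection are never pushed back, they are simply fed to the
decoder first. iter_decoded and sniff_bom are names invented for this
example, and the stream is assumed to return bytes from read(n) and b""
at end of data.

```python
import codecs


def sniff_bom(head):
    """Very small stand-in for a real encoding detector (BOM only)."""
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return "utf-8"


def iter_decoded(stream, detect=sniff_bom, chunk_size=4096):
    """Yield decoded text chunks from a byte stream without seeking."""
    head = stream.read(chunk_size)
    # The look-ahead bytes stay in `head`; the stream is never rewound.
    decoder = codecs.getincrementaldecoder(detect(head))()
    chunk = head
    while chunk:
        text = decoder.decode(chunk)
        if text:
            yield text
        chunk = stream.read(chunk_size)
    # Flush the decoder; raises if the data ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail
```

Doing this inside a stream codec would hide exactly this bookkeeping from
the application.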

Regards,
-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Does Python need a file locking module (slightly higher level)?

2007-10-26 Thread M.-A. Lemburg
On 2007-10-26 05:41, Barry Warsaw wrote:
 On Oct 22, 2007, at 11:30 PM, [EMAIL PROTECTED] wrote:
 
 It's not clear that any of these implementations is going to be
 perfect.  Maybe none ever will be.
 
 I would agree with this.  You write a program and know you need to  
 implement some kind of resource locking, so you start looking for  
 some OTS solution.  But then you realize that your application needs  
 somewhat different semantics or needs to work in platforms or  
 environments that the OTS code doesn't handle.  Just a few days ago,  
 I was looking at some locking code that needed to work across  
 multiple invocations of a script on multiple machines, and the only  
 thing they shared was a PostgreSQL connection, so we ended up wanting  
 to use its advisory locks.
 
 In his reply Jean-Paul made this comment:
 
 It might be nice to have something like that in the standard
 library, but it's very simple once you know what to do.
 
 I'm not so sure about the very simple part, especially if you aren't
 familiar with all the ins and outs of the different platforms.
 
 I'd totally agree with this.  Locking seems simple, but it's got some  
 really tricky aspects that need to be coded just right or you'll be  
 in a world of hurt.  Mailman's LockFile.py (which you're right is  
 *nix only) is stable now, but has had some really subtle bugs in the  
 past.

You might want to take a look at the FileLock.py module that's
part of the eGenix mx Base distribution (mx.Misc.FileLock).

It works reliably on Unix and Windows, doesn't rely on fcntl and
has been in use for years.

The only downside is that it's application specific,
ie. only applications using the module for locking will
detect the locks - but then again: this is exactly the problem
you typically want to solve.
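
To make the idea of an application-level lock concrete, here is a minimal sketch that uses nothing but an exclusively created lock file. It is not the mx.Misc.FileLock implementation; the class name SimpleFileLock and its timeout and poll parameters are made up for this example.

```python
import os
import time


class SimpleFileLock:
    """Cooperative lock based on atomically creating a lock file.

    Hypothetical helper for illustration; only processes that use
    the same lock file path will notice the lock.
    """

    def __init__(self, path, timeout=5.0, poll=0.05):
        self.path = path
        self.timeout = timeout
        self.poll = poll
        self.locked = False

    def acquire(self):
        """Try to get the lock; return True on success, False on timeout."""
        deadline = time.monotonic() + self.timeout
        while True:
            try:
                # O_CREAT | O_EXCL fails if the file already exists
                fd = os.open(self.path,
                             os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            except FileExistsError:
                if time.monotonic() >= deadline:
                    return False
                time.sleep(self.poll)
            else:
                # Record the owner's process id for debugging
                os.write(fd, str(os.getpid()).encode("ascii"))
                os.close(fd)
                self.locked = True
                return True

    def release(self):
        """Remove the lock file if this object holds the lock."""
        if self.locked:
            os.remove(self.path)
            self.locked = False
```

A sketch like this does not deal with stale locks left behind by crashed processes, which is one of the tricky aspects mentioned above.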

 The fact that the first three bits of code I was referred to were
 implemented by three significant Python tools/platforms and that all are
 different in some significant ways suggests that there is both an
 underlying need for a file locking mechanism and a lack of consensus
 about the best way to implement the mother-of-all-file-locking schemes
 for Python.  Maybe the best place for this is in the distribution.  PEP?
 
 I don't think any one solution will work for everybody.  I'm not even  
 sure we can define a common API a la the DBAPI, but if something were  
 to make it into the standard distribution, that's the direction I'd  
 go in.  Then we can provide various implementations that support the  
 LockingAPI under various environments, constraints, and platforms.   
 If we wanted to distribute them in the stdlib, we could put them all  
 in a package and let the user decide which features they need.
 
 I'm still planning on de-Mailman-ifying LockFile.py sometime soon.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Unicode database

2007-08-09 Thread M.-A. Lemburg
Nick Maclaren wrote:
 Ah, the makefile. I don't think you use it to create the Unicode database.

 It's only good for generating the codecs (Lib/encodings)
 
 Yes, but it DOES attempt to download the mappings, and is the ONLY
 script which attempts to do so.

Of course it does. The Tools/unicode/Makefile is meant to simplify
recreating the codecs from the (possibly updated) mapping on the Unicode
site.

If it doesn't work for you, that may well be possible, since I wrote
the Makefile and the other related stuff in that directory to help me
with updating the codecs from the mappings. It's only checked in for
convenience.
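
For readers who have not looked at the mapping files: they are plain text tables that map a byte value to a Unicode code point. The following is a rough sketch of turning such a table into a decoding dictionary. It is not how gencodec.py is implemented; parse_mapping and the sample excerpt are illustrative only.

```python
def parse_mapping(text):
    """Return {byte value: Unicode character} from mapping-file text.

    Assumes lines of the form "0xA4<TAB>0x20AC<TAB># EURO SIGN".
    Comments start with "#"; entries without a second column
    (undefined bytes) are skipped.
    """
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        table[int(fields[0], 16)] = chr(int(fields[1], 16))
    return table


SAMPLE = """\
# Excerpt in the style of ISO8859/8859-15.TXT
0x41\t0x0041\t# LATIN CAPITAL LETTER A
0xA4\t0x20AC\t# EURO SIGN
"""

table = parse_mapping(SAMPLE)
# Decode two bytes with the table: "A" followed by the euro sign
decoded = "".join(table[b] for b in b"A\xa4")
```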

 beelzebub$find Python-2.5.1 -type f | wc
    3458    3460  135981
 beelzebub$find Python-2.5.1 -type f | xargs grep ftp.unicode.org
 Python-2.5.1/Doc/lib/libunicodedata.tex:4.1.0 which is publicly available 
 from \url{ftp://ftp.unicode.org/}.
 grep: Python-2.5.1/Mac/Icons/Disk: No such file or directory
 grep: Image.icns: No such file or directory
 grep: Python-2.5.1/Mac/Icons/Python: No such file or directory
 grep: Folder.icns: No such file or directory
 Python-2.5.1/Misc/NEWS:  at ftp.unicode.org and contain a few updates (e.g. 
 the Mac OS
 Python-2.5.1/Tools/unicode/Makefile:# files available at 
 ftp://ftp.unicode.org/
 Python-2.5.1/Tools/unicode/Makefile:ncftpget -R ftp.unicode.org . 
 Public/MAPPINGS
 Python-2.5.1/Tools/unicode/gencodec.py:site 
 (ftp://ftp.unicode.org/Public/MAPPINGS/) and creates Python codec
 Python-2.5.1/Tools/unicode/python-mappings/TIS-620.TXT:#   
 ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-11.TXT the
 Python-2.5.1/Tools/unicode/python-mappings/TIS-620.TXT:#   
 ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-11.TXT
 Python-2.5.1/Tools/unicode/python-mappings/KOI8-U.TXT:#   
 ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MISC/KOI8-R.TXT
 Python-2.5.1/Tools/unicode/python-mappings/CP1140.TXT:#   
 ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/CP037.TXT
 Python-2.5.1/Modules/unicodedata.c:4.1.0 which is publically available from 
 ftp://ftp.unicode.org/.\n
 
 AFAICT, the mappings are still where they always were: at the
 location given in the Makefile. (e.g.
 ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-15.TXT
 )
 
 Then you DEFINITELY are using a non-standard set of files.  That
 above was from the source of Python 2.5.1 that I have just downloaded.

No idea where you get that impression from, but then I'm not really
sure what you're after anyway ;-)

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] [Python-3000] Python 3000 Status Update (Long!)

2007-06-19 Thread M.-A. Lemburg
On 2007-06-19 14:40, Walter Dörwald wrote:
 Georg Brandl wrote:
 A minuscule nit: the rot13 codec has no library equivalent, so it won't be
 supported anymore :)
 Given that there are valid use cases for bytes-to-bytes translations, 
 and a common API for them would be nice, does it make sense to have an 
 additional category of codec that is invoked via specific recoding 
 methods on bytes objects? For example:

encoded = data.encode_bytes('bz2')
decoded = encoded.decode_bytes('bz2')
assert data == decoded
 This is exactly what I proposed a while before under the name
 bytes.transform().

 IMO it would make a common use pattern much more convenient and
 should be given thought.

 If a PEP is called for, I'd be happy to at least co-author it.
 
 Codecs are a major exception to Guido's law: Never have a parameter
 whose value switches between completely unrelated algorithms.

I don't see much of a problem with that. Parameters are
per-se intended to change the behavior of a function or
method.

Note that you are referring to the .encode() and .decode()
methods - these are just easy to use interfaces to the codecs
registered in the system.

The codec design allows for different input and output
types as it doesn't impose restrictions on these. Codecs
are more general in that respect: they don't just deal
with Unicode encodings, it's a more general approach
that also works with other kinds of data types.

The access methods, OTOH, can impose restrictions and probably
should, to restrict the return types to a predictable set.
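
As an illustration of this split between general codecs and restricted access methods, here is a small sketch based on how current Python 3 releases behave: codecs.encode() and codecs.decode() accept any registered codec, while str.encode() only accepts text encodings. The sample data is made up.

```python
import codecs

data = b"spam and eggs " * 10

# bytes -> bytes: compression via the bz2 codec
packed = codecs.encode(data, "bz2_codec")
unpacked = codecs.decode(packed, "bz2_codec")

# text -> text: the rot13 transformation
scrambled = codecs.encode("abc", "rot_13")

# The access method is restricted: str.encode() has to return bytes,
# so a text-to-text codec is rejected there.
try:
    "abc".encode("rot_13")
    rejected = False
except LookupError:
    rejected = True
```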

 Why don't we put all string transformation functions into a common
 module (the string module might be a good place):
 
 import string
 string.rot13('abc')

I think the string module will have to go away. It doesn't
really separate between text and bytes data.

Adding more confusion will not really help with making
this distinction clear, either, I'm afraid.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Adventures with x64, VS7 and VS8 on Windows

2007-05-22 Thread M.-A. Lemburg
Hi Mark,

 +1 from me.

 I think this is simply a bug introduced with the UCS4 patches in
 Python 2.2.

 unicodeobject.h already has this code:

 #ifndef PY_UNICODE_TYPE

 /* Windows has a usable wchar_t type (unless we're using UCS-4) */
 # if defined(MS_WIN32) && Py_UNICODE_SIZE == 2
 #  define HAVE_USABLE_WCHAR_T
 #  define PY_UNICODE_TYPE wchar_t
 # endif

 # if defined(Py_UNICODE_WIDE)
 #  define PY_UNICODE_TYPE Py_UCS4
 # endif

 #endif

 But for some reason, pyconfig.h defines:

 /* Define as the integral type used for Unicode representation. */
 #define PY_UNICODE_TYPE unsigned short

 /* Define as the size of the unicode type. */
 #define Py_UNICODE_SIZE SIZEOF_SHORT

 /* Define if you have a useable wchar_t type defined in
 wchar.h; useable
means wchar_t must be 16-bit unsigned type. (see
Include/unicodeobject.h). */
 #if Py_UNICODE_SIZE == 2
 #define HAVE_USABLE_WCHAR_T
 #endif

 disabling the default settings in the unicodeobject.h.
 
 Yes, that does appear strange.  The following patch works for me, keeps
 Python building and appears to solve my problem.  Any objections?

Looks fine to me.

 Mark
 
 
 Index: pyconfig.h
 ===
 --- pyconfig.h  (revision 55487)
 +++ pyconfig.h  (working copy)
 @@ -491,22 +491,13 @@
  /* Define if you want to have a Unicode type. */
  #define Py_USING_UNICODE
 
 -/* Define as the integral type used for Unicode representation. */
 -#define PY_UNICODE_TYPE unsigned short
 -
  /* Define as the size of the unicode type. */
 -#define Py_UNICODE_SIZE SIZEOF_SHORT
 +/* This is enough for unicodeobject.h to do the right thing on Windows.
 */
 +#define Py_UNICODE_SIZE 2
 
 -/* Define if you have a useable wchar_t type defined in wchar.h; useable
 -   means wchar_t must be 16-bit unsigned type. (see
 -   Include/unicodeobject.h). */
 -#if Py_UNICODE_SIZE == 2
 -#define HAVE_USABLE_WCHAR_T
 -
  /* Define to indicate that the Python Unicode representation can be passed
 as-is to Win32 Wide API.  */
  #define Py_WIN_WIDE_FILENAMES
 -#endif
 
  /* Use Python's own small-block memory-allocator. */
  #define WITH_PYMALLOC 1
 

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Adventures with x64, VS7 and VS8 on Windows

2007-05-21 Thread M.-A. Lemburg
On 2007-05-21 12:30, Kristján Valur Jónsson wrote:

 [Py_UNICODE being #defined as unsigned short on Windows]

 I'd rather make it a platform-specific definition (for platform=Windows
 API). Correct me if I'm wrong, but isn't wchar_t also available in VS
 2003 (and even in VC6?). And doesn't it have the right definition in
 all these compilers?

 So +1 for setting Py_UNICODE to wchar_t on Windows.
 
 Yes.  Btw, in previous visual studio versions, wchar_t was not treated
 as a builtin type by default, but rather as synonymous with unsighed short.
 Now the default is that it is, and this causes some semantic differences
 and incompatibilities of the type seen.

+1 from me.

I think this is simply a bug introduced with the UCS4 patches in
Python 2.2.

unicodeobject.h already has this code:

#ifndef PY_UNICODE_TYPE

/* Windows has a usable wchar_t type (unless we're using UCS-4) */
# if defined(MS_WIN32) && Py_UNICODE_SIZE == 2
#  define HAVE_USABLE_WCHAR_T
#  define PY_UNICODE_TYPE wchar_t
# endif

# if defined(Py_UNICODE_WIDE)
#  define PY_UNICODE_TYPE Py_UCS4
# endif

#endif

But for some reason, pyconfig.h defines:

/* Define as the integral type used for Unicode representation. */
#define PY_UNICODE_TYPE unsigned short

/* Define as the size of the unicode type. */
#define Py_UNICODE_SIZE SIZEOF_SHORT

/* Define if you have a useable wchar_t type defined in wchar.h; useable
   means wchar_t must be 16-bit unsigned type. (see
   Include/unicodeobject.h). */
#if Py_UNICODE_SIZE == 2
#define HAVE_USABLE_WCHAR_T
#endif

disabling the default settings in the unicodeobject.h.
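
A quick way to see what the platform's wchar_t looks like, without reading the headers, is to ask ctypes. This is only a sketch for checking the 16-bit condition named in the comment above; it says nothing about how a given Python build was configured.

```python
import ctypes
import sys

# Size in bytes of the platform's C wchar_t type
wchar_size = ctypes.sizeof(ctypes.c_wchar)

# "Usable" in the sense of the header comment: wchar_t is 16 bits wide
wchar_is_16bit = (wchar_size == 2)

print(sys.platform, "wchar_t bytes:", wchar_size, "16-bit:", wchar_is_16bit)
```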

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP 0365: Adding the pkg_resources module

2007-05-21 Thread M.-A. Lemburg
On 2007-05-21 00:07, Talin wrote:
 Phillip J. Eby wrote:
 I wanted to get this in before the Py3K PEP deadline, since this is a 
 Python 2.6 PEP that would presumably impact 3.x as well.  Feedback welcome.


 PEP: 365
 Title: Adding the pkg_resources module
 
 I'm really surprised that there hasn't been more comment on this.

True both ways, I guess: I'm still waiting for a reply to my
comments.

I'd also like to see more discussion about adding e.g.:

 * support for user packages

   (ie. having site.py add a well-defined user home directory
   based Python path entry to sys.path, e.g.
   ~/.python/user-packages, much like what MacPython already does
   now)

 * support for having the import mechanism play nice
   with namespace packages

   (ie. packages that may live in different places on the disk,
   but appear to be in the same Python package as seen by the
   import mechanism)

I think those two features would go a long way in reducing the
number of hacks setuptools currently applies to get this
functionality working with code in .pth files, monkey-patching
site.py, etc.
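
The user packages idea could look roughly like the following. This is a sketch of what site.py might do, not existing site.py code; the function name add_user_packages is hypothetical and the default path is just the example directory named above.

```python
import os
import sys


def add_user_packages(path="~/.python/user-packages"):
    """Append a per-user package directory to sys.path if it exists.

    Returns the expanded directory name, or None if nothing was added.
    """
    directory = os.path.expanduser(path)
    if os.path.isdir(directory) and directory not in sys.path:
        sys.path.append(directory)
        return directory
    return None
```

Appending rather than prepending keeps the system-wide packages first on the path; whether that is the right order is a design decision such a PEP would have to settle.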

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP 0365: Adding the pkg_resources module

2007-05-21 Thread M.-A. Lemburg
On 2007-05-21 16:05, Phillip J. Eby wrote:
 At 01:43 PM 5/21/2007 +0200, M.-A. Lemburg wrote:
 On 2007-05-21 00:07, Talin wrote:
 Phillip J. Eby wrote:
 I wanted to get this in before the Py3K PEP deadline, since this is a
 Python 2.6 PEP that would presumably impact 3.x as 
 well.  Feedback welcome.

 PEP: 365
 Title: Adding the pkg_resources module
 I'm really surprised that there hasn't been more comment on this.
 True both ways, I guess: I'm still waiting for a reply to my
 comments.
 
 What comments are you talking about?  I must've missed them.

I've attached the email. Please see below.

 I'd also like to see more discussion about adding e.g.:

  * support for user packages

(ie. having site.py add a well-defined user home directory
based Python path entry to sys.path, e.g.
~/.python/user-packages, much like what MacPython already does
now)

  * support for having the import mechanism play nice
with namespace packages

(ie. packages that may live in different places on the disk,
but appear to be in the same Python package as seen by the
import mechanism)

 I think those two features would go a long way in reducing the
 number of hacks setuptools currently applies to get this
 functionality working with code in .pth files, monkey-patching
 site.py, etc.
 
 These items aren't directly related to the PEP, 
 however. 

Right. I wasn't referring to this PEP. I think we should have
two more PEPs covering the above points, since they offer
benefits for all users, not just setuptools users.

 pkg_resources doesn't monkeypatch anything or touch any 
 .pth files.  It only changes sys.path at runtime if you explicitly 
 ask it to locate and activate packages for you.

 As for namespace packages, pkg_resources provides a more PEP 
 302-compatible alternative to pkgutil.extend_path().  pkgutil doesn't 
 support anything but existing filesystem directories, but the 
 pkg_resources version supports zipfiles and has hooks to allow 
 namespace package support to be registered for any PEP 302 importer.  See:
 
 http://peak.telecommunity.com/DevCenter/PkgResources#supporting-custom-importers
 
 (specifically, the register_namespace_handler() function.)

Looking at the code it appears as if you've already formalized
an implementation for this.

However, since this is not egg-specific it should probably be
moved to pkgutil and get a separate PEP with detailed documentation
(the link you provided doesn't really explain the concepts, reading
the code helped a bit).

What I don't understand about your approach is why importers
would have to register with the namespace implementation.

This doesn't seem necessary, since the package __path__ attribute
already provides all functionality needed for redirecting
lookups to different paths.
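
To illustrate the point about __path__, here is a small self-contained sketch that spreads one package over two directories using pkgutil.extend_path(). All names (nspkg, part_a, part_b, site_a, site_b) are made up for the demo, and it only covers plain filesystem directories.

```python
import importlib
import os
import sys
import tempfile

# Build two separate directories that each hold a part of "nspkg".
root = tempfile.mkdtemp()
for location, module in (("site_a", "part_a"), ("site_b", "part_b")):
    pkgdir = os.path.join(root, location, "nspkg")
    os.makedirs(pkgdir)
    with open(os.path.join(pkgdir, "__init__.py"), "w") as f:
        # extend_path() adds every "nspkg" directory found on sys.path
        f.write("from pkgutil import extend_path\n"
                "__path__ = extend_path(__path__, __name__)\n")
    with open(os.path.join(pkgdir, module + ".py"), "w") as f:
        f.write("NAME = %r\n" % module)
    sys.path.append(os.path.join(root, location))

importlib.invalidate_caches()

# Both submodules are found through the one package's __path__
import nspkg.part_a
import nspkg.part_b

print(nspkg.__path__)
```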

-- 
Marc-Andre Lemburg
eGenix.com

---BeginMessage---
On 2007-05-01 02:29, Phillip J. Eby wrote:
 I wanted to get this in before the Py3K PEP deadline, since this is a 
 Python 2.6 PEP that would presumably impact 3.x as well.  Feedback welcome.

Could you add a section that explains the side effects of
importing pkg_resources ?

The documentation of the module doesn't mention any, but the
code suggests that you are installing (some form of) import
hooks.

Some other comments:

* Wouldn't it be better to factor out all the meta-data access
  code that's not related to eggs into pkgutil ?!

* How about then renaming the remaining module to egglib ?!

* The module needs some reorganization: imports, globals and constants
  at the top, maybe a few comments delimiting the various sections,

* The get_*_platform() should probably use the platform module
  which is a lot more flexible than distutils' get_platform()
  (which should probably use the platform module as well in the
  long run)
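
As a rough sketch of the last point, a platform string can be built from the platform module alone. The helper name get_platform_tag and the tag format are assumptions made for this example; this is not what pkg_resources or distutils produce.

```python
import platform


def get_platform_tag():
    """Return a rough "<system>-<machine>" string.

    Hypothetical helper; falls back to "unknown" where the platform
    module cannot determine a value.
    """
    system = platform.system().lower() or "unknown"
    machine = platform.machine().lower() or "unknown"
    return "%s-%s" % (system, machine)
```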

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 04 2007)

Re: [Python-Dev] PEP 0365: Adding the pkg_resources module

2007-05-21 Thread M.-A. Lemburg
On 2007-05-21 20:01, Phillip J. Eby wrote:
 At 06:28 PM 5/21/2007 +0200, M.-A. Lemburg wrote:
 However, since this is not egg-specific it should probably be
 moved to pkgutil and get a separate PEP with detailed documentation
 (the link you provided doesn't really explain the concepts, reading
 the code helped a bit).
 
 That doesn't really make sense in the context of the current PEP,
 though, which isn't to provide a general-purpose namespace package API;
 it's specifically about adding an existing piece of code to the stdlib,
 with its API intact.

You seem to indicate that you're not up to discussing the concepts
implemented by the module and *integrating* them with the Python
stdlib.

Please correct me if I'm wrong, but if the whole point of the PEP
is a take it or leave it decision, then I don't see the point of
discussing it. I'm -1 on adding the module in its current state;
I'd be +1 on integrating the concepts with the Python stdlib.

Hope I'm wrong,
-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP 0365: Adding the pkg_resources module

2007-05-21 Thread M.-A. Lemburg
On 2007-05-21 22:48, Phillip J. Eby wrote:
 At 08:56 PM 5/21/2007 +0200, M.-A. Lemburg wrote:
 On 2007-05-21 20:01, Phillip J. Eby wrote:
  At 06:28 PM 5/21/2007 +0200, M.-A. Lemburg wrote:
  However, since this is not egg-specific it should probably be
  moved to pkgutil and get a separate PEP with detailed documentation
  (the link you provided doesn't really explain the concepts, reading
  the code helped a bit).
 
  That doesn't really make sense in the context of the current PEP,
  though, which isn't to provide a general-purpose namespace package API;
  it's specifically about adding an existing piece of code to the stdlib,
  with its API intact.

 You seem to indicate that you're not up to discussing the concepts
 implemented by the module and *integrating* them with the Python
 stdlib.
 
 No, I'm saying something else.  I'm saying it:
 
 1. has nothing to do with the PEP,
 2. isn't something I'm volunteering to do, and
 3. would only make sense to do as part of Python 3 stdlib
 reorganization, if it were done at all.

I don't understand that last part: how can adding a new module
or set of modules require waiting for reorganization of the
stdlib ?

All I'm suggesting is to reorganize the code in pkg_resources.py
a bit and move the relevant bits into pkgutil.py and into a new
eggutil.py.

 Now, the code is certainly under an open license, and the concepts are
 entirely free for anyone to use.  If somebody wishes to do what you're
 describing, they're certainly welcome to take on that thankless task.
 
 But I personally don't see the point, since by definition that new API
 would have *no current users*.  And the purpose of the PEP is to serve
 the (rather large) audience that would like to take advantage of
 existing software that uses the API.

 Thus, any proposal to alter that API faces a high entry barrier to show
 how the proposed changes would provide a significant practical benefit to
 users.

Why is that ?

You can easily provide a pkg_resources.py module with your old API
that interfaces to the new reorganized code in the stdlib.

 That's not even remotely similar to take it or leave it.  It might
 *seem* that way, of course, simply because in any proposal to change the
 API, there's an implicit question of why nobody proposed the change via
 the Distutils-SIG, sometime during the last 2+ years of discussions
 around that API.

This doesn't have anything to do with distutils. It's entirely
about the egg distribution format.

 I remain open-minded and curious as to the possibility that someone
 *could* propose a meaningful change, but am also rationally skeptical
 that someone actually *will* come up with something that would outweigh
 the user benefit of keeping the already published, already discussed,
 already field-tested, already in-use API.
 
 For that matter, I remain open-minded and curious as to the possibility
 of whether someone could propose a reasonable justification for *not*
 including the module in the stdlib.  After all, last year Fredrik Lundh
 surprised me with a convincing rationale for *not* including setuptools
 in the stdlib, which is why I backed off on doing so in 2.5, and am now
 proffering a much-reduced-in-scope proposal for 2.6.

 So, I'm perfectly willing and able to change my mind, given convincing
 reasons to do so.  So far, though, your change suggestions haven't even
 explained why *you* want them, let alone why anybody else should agree. 
 We can hardly discuss what you haven't yet said.

I'm not sure what you want to hear from me.

You asked for comments, I wrote back and gave you comments. I also
made it clear why I think that breaking up the addition into different
PEPs makes a lot of sense and why separating the code into different
modules for the same reason makes a lot of sense as well.

I also tried to stir up some discussion to make life easier
for setuptools by suggesting a user-package directory on
sys.path and adding support for namespace packages as a
general Python feature instead of hiding it away in
pkg_resources.py.

You should see this as a chance to introduce new concepts to Python.
Instead you seem to feel offended every time someone suggests a
change in your design. That's also the reason why I stopped
discussing things with you on the distutils list. There was simply
no way of getting through to you.

Perhaps we should just meet up for a beer in London sometime
and sort things out ;-)

Cheers,
-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] \u and \U escapes in raw unicode string literals

2007-05-13 Thread M.-A. Lemburg
On 2007-05-12 02:42, Andrew McNabb wrote:
 On Sat, May 12, 2007 at 01:30:52AM +0200, M.-A. Lemburg wrote:
 I wonder how we managed to survive all these years with
 the existing consistent and concise definition of the
 raw-unicode-escape codec ;-)

 There are two options:

  * no one really uses Unicode raw strings nowadays

  * none of the existing users has ever stumbled across the
problem case that triggered all this

 Both ways, we're discussing a non-issue.
 
 
 Sure, it's a non-issue for Python 2.x.  However, when Python 3 comes
 along, and all strings are Unicode, there will likely be a lot more
 users stumbling into the problem case.

In the first case, changing the codec won't affect much code when
ported to Py3k.

In the second case, a change to the codec is not necessary.

Please also consider the following:

* without the Unicode escapes, the only way to put non-ASCII
  code points into a raw Unicode string is via a source code encoding
  of say UTF-8 or UTF-16, pretty much defeating the original
  requirement of writing ASCII code only

* non-ASCII code points in text are not uncommon, they occur
  in most European scripts, all Asian scripts,
  many scientific texts and also in texts meant for the web
  (just have a look at the HTML entities, or think of Word
  exports using quotes)

* adding Unicode escapes to the re module will break code
  already using ...\u... in the regular expressions for
  other purposes; writing conversion tools that detect this
  usage is going to be hard

* OTOH, writing conversion tools that simply work on string
  literals in general is easy

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] \u and \U escapes in raw unicode string literals

2007-05-13 Thread M.-A. Lemburg
On 2007-05-13 18:04, Martin v. Löwis wrote:
 * without the Unicode escapes, the only way to put non-ASCII
   code points into a raw Unicode string is via a source code encoding
   of say UTF-8 or UTF-16, pretty much defeating the original
   requirement of writing ASCII code only
 
 That's no problem, though - just don't put the Unicode character
 into a raw string. Use plain strings if you have a need to include
 Unicode characters, and are not willing to leave ASCII.
 
 For Python 3, the default source encoding is UTF-8, so it is
 much easier to use non-ASCII characters in the source code.
 The original requirement may not be as strong anymore as it
 used to be.

You can do that today: Just put the # coding: utf-8 marker
at the top of the file.

However, in some cases, your editor may not be capable of
displaying or letting you enter the Unicode text you have
in mind.

In other cases, there may be a corporate coding standard in
place that prohibits using non-ASCII text in source code,
or fixes the encoding to e.g. Latin-1.

In all those cases, it's necessary to be able to enter the
Unicode code points which cannot be used directly in the source
code by other means, and the easiest way to do this is
by using Unicode escapes.
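
A short sketch of what that looks like in an ASCII-only source file. It shows the behavior of Python 3 as it was eventually released, where regular string literals interpret the escapes and raw string literals leave them alone.

```python
# This source file contains ASCII characters only.

# Regular string literal: the escape is turned into one code point
euro = "\u20ac"

# \N{...} names the character instead of giving its ordinal
euro_by_name = "\N{EURO SIGN}"

# Raw string literal in Python 3: the backslash sequence is kept as-is
raw = r"\u20ac"

print(ascii(euro), len(euro), len(raw))
```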

 * non-ASCII code points in text are not uncommon, they occur
   in most European scripts, all Asian scripts,
   many scientific texts and also in texts meant for the web
   (just have a look at the HTML entities, or think of Word
   exports using quotes)
 
 And you are seriously telling me that people who commonly
 use non-ASCII code points in their source code are willing
 to refer to them by Unicode ordinal number (which, of course,
 they all know by heart, from 1 to 65536)?

No, I'm not. I'm saying that non-ASCII code points are in
common use and (together with the above bullet) that there
are situations where you can't put the relevant code point
directly into your source code.

Using Unicode escapes for these will always be a kludge,
but it's still better than not being able to enter the
code points at all.

 * adding Unicode escapes to the re module will break code
   already using ...\u... in the regular expressions for
   other purposes; writing conversion tools that detect this
   usage is going to be hard
 
 It's unlikely to occur in code today - \u just means the same
 as u (so \u1234 matches u1234); if you want a backslash
 followed by u in your regular expression, you should write
 \\u.
 
 It would be possible to future-warn about \u in 2.6, catching
 these cases. Authors then would either have to remove the
 backslash, or duplicate it, depending on what they want to
 express.

Good idea.

The re module would then have to implement the same escaping
scheme as the raw-unicode-escape code (only an odd number of
backslashes causes the escaping code to trigger).
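
A rough sketch of that odd-number-of-backslashes rule, in Python 3. The
function name is made up for illustration, and this is not how the re
module or the codec is actually implemented; it only handles the 4-digit
\u form:

```python
import re

# A run of backslashes, then "u" and exactly four hex digits.
_ESCAPE = re.compile(r"(\\+)u([0-9a-fA-F]{4})")


def expand_raw_unicode_escapes(text):
    """Expand \\uXXXX only when it follows an odd number of backslashes."""

    def replace(match):
        slashes, digits = match.group(1), match.group(2)
        if len(slashes) % 2 == 0:
            # Even run: the backslashes pair up, so this is not an escape.
            return match.group(0)
        # Odd run: the last backslash belongs to the escape itself.
        return slashes[:-1] + chr(int(digits, 16))

    return _ESCAPE.sub(replace, text)
```

With this rule a single backslash before u1234 triggers the escape, while
a doubled backslash leaves the text alone, which is what authors would
rely on after duplicating the backslash.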

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 13 2007)
 Python/Zope Consulting and Support ...http://www.egenix.com/
 mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


 Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! 


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] \u and \U escapes in raw unicode string literals

2007-05-11 Thread M.-A. Lemburg
On 2007-05-11 07:52, Martin v. Löwis wrote:
 This is what prompted my question, actually: in Py3k, in the
 str/unicode unification branch, r"\u1234" changes meaning: before the
 unification, this was an 8-bit string, where the \u was not special,
 but now it is a unicode string, where \u *is* special.
 
 That is true for non-raw strings also: the meaning of \u1234 also
 changes.
 
 However, traditionally, there was *no* escaping mechanism in raw strings
 in Python, and I feel that this is a good principle, because it is
 easy to learn (if you leave out the detail that \ can't be the last
 character in a raw string - which should get fixed also, IMO). So I
 think in Py3k, r"\u1234" should continue to be a string with 6
 characters. Otherwise, people will complain that
 os.stat(r"c:\windows\system32\user32.dll") fails. Telling them to write
 os.stat(r"c:\windows\system32\u005Cuser32.dll") will just cause puzzled
 faces.

Using double backslashes won't cause that reaction:

os.stat("c:\\windows\\system32\\user32.dll")

Also note that Windows is smart enough nowadays to parse
the good old Unix forward slash:

os.stat("c:/windows/system32/user32.dll")
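
To compare the spellings without touching a real file system, here is a
small sketch using the stdlib ntpath module, which applies Windows path
rules on any platform. It assumes Python 3, where \u is not interpreted
inside raw strings, so the raw spelling is safe there:

```python
import ntpath  # Windows path semantics, importable everywhere

raw_spelling = r"c:\windows\system32\user32.dll"
doubled_spelling = "c:\\windows\\system32\\user32.dll"
slash_spelling = "c:/windows/system32/user32.dll"

# The raw and the doubled-backslash literals are the same string.
same_string = raw_spelling == doubled_spelling

# normpath() turns forward slashes into backslashes under Windows rules.
normalized = ntpath.normpath(slash_spelling)
```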

 Windows path names are one of the two primary applications of raw
 strings (the other being regexes).

IMHO the primary use case is regexps and for those you'd
definitely want to be able to put Unicode characters into your
expressions.

BTW, if you use ur"..." for your expressions today (which you should
if you parse text), then nothing will change when removing the
'u' prefix in Py3k.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] \u and \U escapes in raw unicode string literals

2007-05-11 Thread M.-A. Lemburg
On 2007-05-11 13:05, Thomas Heller wrote:
 M.-A. Lemburg schrieb:
 On 2007-05-11 07:52, Martin v. Löwis wrote:
 This is what prompted my question, actually: in Py3k, in the
 str/unicode unification branch, r"\u1234" changes meaning: before the
 unification, this was an 8-bit string, where the \u was not special,
 but now it is a unicode string, where \u *is* special.
 That is true for non-raw strings also: the meaning of \u1234 also
 changes.

 However, traditionally, there was *no* escaping mechanism in raw strings
 in Python, and I feel that this is a good principle, because it is
 easy to learn (if you leave out the detail that \ can't be the last
 character in a raw string - which should get fixed also, IMO). So I
 think in Py3k, r"\u1234" should continue to be a string with 6
 characters. Otherwise, people will complain that
 os.stat(r"c:\windows\system32\user32.dll") fails. Telling them to write
 os.stat(r"c:\windows\system32\u005Cuser32.dll") will just cause puzzled
 faces.
 Using double backslashes won't cause that reaction:

 os.stat("c:\\windows\\system32\\user32.dll")
 
 Sure.  But I want to use raw strings for Windows path names; it's much easier
 to type.

But think of the price to pay if we disable use of Unicode
escapes in raw strings. And all of this just because of the
one special case: having a file name that starts with a U
and needs to be referenced literally in a Python application
together with a path leading up to it.

BTW, there's an easy work-around for this special case:

os.stat(os.path.join(r"c:\windows\system32", "user32.dll"))
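
A quick check of that work-around, sketched with ntpath so it behaves the
same on any platform. The point is that no literal ever contains a
backslash directly followed by u:

```python
import ntpath  # Windows path semantics, importable everywhere

directory = r"c:\windows\system32"   # raw string without any \u sequence
path = ntpath.join(directory, "user32.dll")
```

The separator in front of the file name is supplied by join(), not by
the source text.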

 Also note that Windows is smart enough nowadays to parse
 the good old Unix forward slash:

 os.stat("c:/windows/system32/user32.dll")
 
 In my opinion this is a Windows bug and not a feature.  Especially because
 there are Windows API functions (the shell functions, IIRC) that do NOT
 accept forward slashes.
 
 Would you say that *nix is dumb because it doesn't parse \\usr\\include?

Sorry, I wasn't trying to imply that Windows is/was a dumb system.

I think it's nice that you can use forward slashes on Windows -
makes writing code that works in both worlds (Unix and Windows)
a lot easier.

 Windows path names are one of the two primary applications of raw
 strings (the other being regexes).

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] \u and \U escapes in raw unicode string literals

2007-05-10 Thread M.-A. Lemburg
On 2007-05-10 20:53, Paul Moore wrote:
 On 10/05/07, Guido van Rossum [EMAIL PROTECTED] wrote:
 I just discovered that, in all versions of Python as far back as I
 have access to (2.0), \u escapes are interpreted inside raw
 unicode strings. Thus:
 [...]
 Does anyone remember why it is done this way? The reference manual
 describes this behavior, but doesn't give an explanation:
 
 My memory is so dim as to be more speculation than anything else, but
 I suspect it's simply because there's no other way of including
 characters outside the ASCII range in a raw string.

This is per design (see PEP 100) and was done for the reason given
by Paul. The motivation for the chosen approach was to make Python's
raw Unicode strings compatible to Java's raw Unicode strings:

http://java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] \u and \U escapes in raw unicode string literals

2007-05-10 Thread M.-A. Lemburg
On 2007-05-11 00:11, Guido van Rossum wrote:
 On 5/10/07, M.-A. Lemburg [EMAIL PROTECTED] wrote:
 On 2007-05-10 20:53, Paul Moore wrote:
 On 10/05/07, Guido van Rossum [EMAIL PROTECTED] wrote:
 I just discovered that, in all versions of Python as far back as I
 have access to (2.0), \u escapes are interpreted inside raw
 unicode strings. Thus:
 [...]
 Does anyone remember why it is done this way? The reference manual
 describes this behavior, but doesn't give an explanation:
 My memory is so dim as to be more speculation than anything else, but
 I suspect it's simply because there's no other way of including
 characters outside the ASCII range in a raw string.
 This is per design (see PEP 100) and was done for the reason given
 by Paul. The motivation for the chosen approach was to make Python's
 raw Unicode strings compatible to Java's raw Unicode strings:

 http://java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html
 
 I'm not sure what Java compatibility buys us. It is also far from
 perfect -- IIUC, in Java if you write \u0022 (that's the " character)
 it counts as an opening or closing quote, and if you write \u005c (a
 backslash) it can be used to escape the following character. OTOH, in
 Python, you can write ur"C:\Program Files\u005c" and voila, a raw
 string terminating in a backslash. (In Java this would escape the "
 instead.)

http://mail.python.org/pipermail/python-dev/1999-November/001346.html
http://mail.python.org/pipermail/python-dev/1999-November/001392.html
and all the other postings in that month related to this.

 However, I understand the other reason (inclusion of non-ASCII
 characters in raw strings) and I reluctantly agree with it.
 Reluctantly, because it means I can't create a raw string containing a
 \ followed by u or U -- I needed one of those today.

>>> print ur"\u005cu"
\u
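
The ur"..." prefix is Python 2 only. For readers on Python 3, the same
decoding rule can be tried through the raw_unicode_escape codec; this is
only a sketch of the behaviour, not a claim about how the Python 2
compiler handled literals internally:

```python
# The seven ASCII bytes  \ u 0 0 5 c u
raw_source = b"\\u005cu"

# \u005c is decoded to a backslash; the trailing "u" is left as it is.
decoded = raw_source.decode("raw_unicode_escape")

# The same codec is how a non-ASCII character gets into "raw" text.
accented = b"caf\\u00e9".decode("raw_unicode_escape")
```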

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Changing string constants to byte arrays in Py3k

2007-05-05 Thread M.-A. Lemburg
On 2007-05-04 19:51, Guido van Rossum wrote:
 [-python-dev]
 
 On 5/4/07, Fred L. Drake, Jr. [EMAIL PROTECTED] wrote:
 On Friday 04 May 2007, M.-A. Lemburg wrote:
   I also suggest making all bytes literals immutable to avoid running
   into any issues like the above.

 +1 from me.
 
 Rather than adding immutability to bytes objects (which has big
 implementation and type checking implications), consider using
 buffer(b"123") as an immutable bytes literal. You can freely
 concatenate and compare buffer objects with bytes objects.

I like Georg's idea of having an immutable bytes subclass.
b"abc" could then be a shortcut constructor for this subclass.

In general, I don't think it's a good idea to have literals
turn into mutable objects, since literals are normally perceived
as being constant.
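
For comparison, released Python 3 ended up with an immutable type behind
the b"..." literal and a separate mutable bytearray type. A small sketch
of the difference (this shows today's behaviour, not the design that was
on the table in this thread):

```python
literal = b"abc"

# Item assignment on a bytes literal is rejected.
try:
    literal[0] = 0x41
    literal_is_mutable = True
except TypeError:
    literal_is_mutable = False

# A mutable copy has to be requested explicitly.
scratch = bytearray(literal)
scratch[0] = 0x41   # now spells b"Abc"; the literal itself is untouched
```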

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Changing string constants to byte arrays in Py3k

2007-05-05 Thread M.-A. Lemburg
On 2007-05-05 18:11, Steven Bethard wrote:
 On 5/5/07, M.-A. Lemburg [EMAIL PROTECTED] wrote:
 On 2007-05-04 19:51, Guido van Rossum wrote:
  [-python-dev]
 
  On 5/4/07, Fred L. Drake, Jr. [EMAIL PROTECTED] wrote:
  On Friday 04 May 2007, M.-A. Lemburg wrote:
I also suggest making all bytes literals immutable to avoid running
into any issues like the above.
 
  +1 from me.
 
  Rather than adding immutability to bytes objects (which has big
  implementation and type checking implications), consider using
  buffer(b"123") as an immutable bytes literal. You can freely
  concatenate and compare buffer objects with bytes objects.

 I like Georg's idea of having an immutable bytes subclass.
 b"abc" could then be a shortcut constructor for this subclass.

 In general, I don't think it's a good idea to have literals
 turn into mutable objects, since literals are normally perceived
 as being constant.
 
 Does that mean you want list literals to be immutable too?
 
lst = ['a', 'b', 'c']
lst.append('d') # raises an error?

Sorry, I was referring to Python literals:

http://docs.python.org/ref/literals.html

ie. strings and numeric constant values defined in a Python program.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Changing string constants to byte arrays ([Python-checkins] r55119 - in python/branches/py3k-struni/Lib: codecs.py test/test_codecs.py)

2007-05-04 Thread M.-A. Lemburg
Hi Walter,

if the bytes type does turn out to be a mutable type as suggested
in PEP 358, then please make sure that no code (C code in
particular), relies on the constantness of these byte objects.

This is especially important when it comes to codecs, since
the error callback logic would allow the callback to manipulate
the byte object contents and length without the codec taking
note of this change.

I expect there to be other places in the interpreter which would
break as well.

Otherwise, you end up opening the door for segfaults and
easy DOS attacks on Python3.
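
To make the concern concrete, here is a minimal sketch of a codec error
callback in today's Python 3. The handler name is made up. The callback
is handed an exception object whose .object attribute is the very input
the codec is working on, which is why a mutable input type would be
risky:

```python
import codecs

seen_input_types = []


def question_mark_handler(exc):
    """Hypothetical error callback: substitute '?' for undecodable bytes."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    # exc.object is the buffer currently being decoded by the codec.
    seen_input_types.append(type(exc.object))
    # Return the replacement text and the position at which to resume.
    return ("?", exc.end)


codecs.register_error("sketch.question_mark", question_mark_handler)

decoded = b"ab\xffcd".decode("ascii", errors="sketch.question_mark")
```

With an immutable bytes input the callback can look at the data but
cannot resize it behind the codec's back.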

Regards,
-- 
Marc-Andre Lemburg
eGenix.com



On 2007-05-04 15:05, walter.doerwald wrote:
 Author: walter.doerwald
 Date: Fri May  4 15:05:09 2007
 New Revision: 55119
 
 Modified:
python/branches/py3k-struni/Lib/codecs.py
python/branches/py3k-struni/Lib/test/test_codecs.py
 Log:
 Make the BOM constants in codecs.py bytes.
 
 Make the buffered input for decoders a bytes object.
 




Re: [Python-Dev] PEP 0365: Adding the pkg_resources module

2007-05-04 Thread M.-A. Lemburg
On 2007-05-01 02:29, Phillip J. Eby wrote:
 I wanted to get this in before the Py3K PEP deadline, since this is a 
 Python 2.6 PEP that would presumably impact 3.x as well.  Feedback welcome.

Could you add a section that explains the side effects of
importing pkg_resources ?

The documentation of the module doesn't mention any, but the
code suggests that you are installing (some form of) import
hooks.
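
One way to check for such side effects is to snapshot the import
machinery before and after the import. The helper below is a
hypothetical sketch in Python 3 and is tried on a stdlib module here; it
says nothing about what pkg_resources itself does:

```python
import importlib
import sys


def import_hook_changes(module_name):
    """Import a module and report entries it added to the import system."""
    meta_before = list(sys.meta_path)
    hooks_before = list(sys.path_hooks)
    importlib.import_module(module_name)
    return {
        "meta_path": [f for f in sys.meta_path if f not in meta_before],
        "path_hooks": [h for h in sys.path_hooks if h not in hooks_before],
    }


changes = import_hook_changes("json")
```

A module that installs no hooks yields two empty lists.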

Some other comments:

* Wouldn't it be better to factor out all the meta-data access
  code that's not related to eggs into pkgutil ?!

* How about then renaming the remaining module to egglib ?!

* The module needs some reorganization: imports, globals and constants
  at the top, maybe a few comments delimiting the various sections.

* The get_*_platform() should probably use the platform module
  which is a lot more flexible than distutils' get_platform()
  (which should probably use the platform module as well in the
  long run)
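
To illustrate the last point, a short sketch of what the platform module
offers. sysconfig.get_platform() is used here as a stand-in for the
distutils function, since distutils is no longer shipped with recent
Python versions; the dictionary keys are only illustrative:

```python
import platform
import sysconfig

details = {
    "system": platform.system(),            # e.g. the OS name
    "machine": platform.machine(),          # e.g. the CPU architecture
    "python": platform.python_version(),    # interpreter version string
    "sysconfig_platform": sysconfig.get_platform(),  # single combined tag
}
```

The platform module exposes the individual pieces, whereas the
get_platform() style API returns one pre-combined string.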

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com


 PEP: 365
 Title: Adding the pkg_resources module
 Version: $Revision: 55032 $
 Last-Modified: $Date: 2007-04-30 20:24:48 -0400 (Mon, 30 Apr 2007) $
 Author: Phillip J. Eby [EMAIL PROTECTED]
 Status: Draft
 Type: Standards Track
 Content-Type: text/x-rst
 Created: 30-Apr-2007
 Post-History: 30-Apr-2007
 
 
 Abstract
 
 
 This PEP proposes adding an enhanced version of the ``pkg_resources``
 module to the standard library.
 
 ``pkg_resources`` is a module used to find and manage Python
 package/version dependencies and access bundled files and resources,
 including those inside of zipped ``.egg`` files.  Currently,
 ``pkg_resources`` is only available through installing the entire
 ``setuptools`` distribution, but it does not depend on any other part
 of setuptools; in effect, it comprises the entire runtime support
 library for Python Eggs, and is independently useful.
 
 In addition, with one feature addition, this module could support
 easy bootstrap installation of several Python package management
 tools, including ``setuptools``, ``workingenv``, and ``zc.buildout``.
 
 
 Proposal
 
 
 Rather than proposing to include ``setuptools`` in the standard
 library, this PEP proposes only that ``pkg_resources`` be added to the
 standard library for Python 2.6 and 3.0.  ``pkg_resources`` is
 considerably more stable than the rest of setuptools, with virtually
 no new features being added in the last 12 months.
 
 However, this PEP also proposes that a new feature be added to
 ``pkg_resources``, before being added to the stdlib.  Specifically, it
 should be possible to do something like::
 
  python -m pkg_resources SomePackage==1.2
 
 to request downloading and installation of ``SomePackage`` from PyPI.
 This feature would *not* be a replacement for ``easy_install``;
 instead, it would rely on ``SomePackage`` having pure-Python ``.egg``
 files listed for download via the PyPI XML-RPC API, and the eggs would
 be placed in the ``$PYTHONEGGS`` cache, where they would **not** be
 importable by default.  (And no scripts would be installed)  However,
 if the download egg contains installation bootstrap code, it will be
 given a chance to run.
 
 These restrictions would allow the code to be extremely simple, yet
 still powerful enough to support users downloading package management
 tools such as ``setuptools``, ``workingenv`` and ``zc.buildout``,
 simply by supplying the tool's name on the command line.
 
 
 Rationale
 =
 
 Many users have requested that ``setuptools`` be included in the
 standard library, to save users needing to go through the awkward
 process of bootstrapping it.  However, most of the bootstrapping
 complexity comes from the fact that setuptools-installed code cannot
 use the ``pkg_resources`` runtime module unless setuptools is already
 installed. Thus, installing setuptools requires (in a sense) that
 setuptools already be installed.
 
 Other Python package management tools, such as ``workingenv`` and
 ``zc.buildout``, have similar bootstrapping issues, since they both
 make use of setuptools, but also want to provide users with something
 approaching a one-step install.  The complexity of creating bootstrap
 utilities for these and any other such tools that arise in future, is
 greatly reduced if ``pkg_resources`` is already present, and is also
 able to download pre-packaged 

Re: [Python-Dev] Changing string constants to byte arrays ([Python-checkins] r55119 - in python/branches/py3k-struni/Lib: codecs.py test/test_codecs.py)

2007-05-04 Thread M.-A. Lemburg
On 2007-05-04 18:53, Georg Brandl wrote:
 M.-A. Lemburg schrieb:
 Hi Walter,

 if the bytes type does turn out to be a mutable type as suggested
 in PEP 358, then please make sure that no code (C code in
 particular), relies on the constantness of these byte objects.

 This is especially important when it comes to codecs, since
 the error callback logic would allow the callback to manipulate
 the byte object contents and length without the codec taking
 note of this change.

 I expect there to be other places in the interpreter which would
 break as well.

 Otherwise, you end up opening the door for segfaults and
 easy DOS attacks on Python3.
 
 If the user does not need to change these bytes objects and this is needed
 in more places, adding an immutable flag for internal bytes objects
 only settable from C, or even an immutable byte base class might be an idea.

+1

I also suggest making all bytes literals immutable to avoid running
into any issues like the above.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Hindsight on Py_UNICODE_WIDE?

2007-03-23 Thread M.-A. Lemburg
On 2007-03-23 19:18, Jason Orendorff wrote:
 Scheme is adding Unicode support in an upcoming standard:
 (DRAFT) http://www.r6rs.org/document/lib-html/r6rs-lib-Z-H-3.html
 
 I have two questions for the python-dev team about Python's Unicode
 experiences.  If it's convenient, please take a moment to reply.
 Thanks in advance.
 
 1.  In hindsight, what do you think about PEP 261, the Py_UNICODE_WIDE
 build option?  On balance, has this been good, bad, or indifferent?
 What's good/bad about it?

Having narrow and wide builds introduces a level of complexity
that seems unnecessary. Few people ever use non-BMP code points
and the ones who do can easily get away with UTF-16 surrogates.
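
For reference, the surrogate arithmetic is short. A sketch in Python 3
(the function name is made up):

```python
def to_surrogate_pair(code_point):
    """Split a non-BMP code point into UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above the BMP need surrogates")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


# U+1D11E (MUSICAL SYMBOL G CLEF) as an example of a non-BMP code point.
pair = to_surrogate_pair(0x1D11E)
encoded = "\U0001D11E".encode("utf-16-be")
```

The pair matches the two 16-bit units the utf-16-be codec produces for
the same character.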

Most Unixes have chosen to go with UCS4 as storage format, so
you have little choice if you want to take advantage of mapping
directly to wchar on Unix.

Windows has chosen UTF-16 as internal storage format and wchar
is 16-bit on that platform.

You may also want to consider looking at PEP 263:

   http://www.python.org/dev/peps/pep-0263

Source code encoding is a great thing ! You can now write native
Unicode in Python source code.

The only downside is the extra complexity added by the fact
that the tokenizer in Py2 works on 8-bit characters. For this reason
we had to decode the source code to Unicode, then encode it to UTF-8,
pass it to the tokenizer and then decode the UTF-8 literal strings
for Unicode back into Unicode again.
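
The round trip described above can be sketched in a few lines of
Python 3. This only mimics the data flow; it is not the tokenizer's
code, and the quote-finding is a crude stand-in for real tokenizing:

```python
declared_encoding = "latin-1"   # what a PEP 263 coding line would declare
source_bytes = "s = u'L\u00f6wis'".encode(declared_encoding)

# Step 1: decode the file to Unicode using the declared encoding.
unicode_source = source_bytes.decode(declared_encoding)

# Step 2: encode to UTF-8, because the tokenizer works on 8-bit data.
tokenizer_input = unicode_source.encode("utf-8")

# Step 3: decode the literal's UTF-8 bytes back into Unicode.
start = tokenizer_input.index(b"'") + 1
end = tokenizer_input.rindex(b"'")
literal_value = tokenizer_input[start:end].decode("utf-8")
```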

Ideally, the tokenizer in Py3k should be rewritten to work directly on
Unicode.

 2.  The idea of multiple string representations has come up (that is,
 where all strings are Unicode, but in memory some are 8-bit, some
 16-bit, and some 32-bit--each string uses the narrowest possible
 representation).  This has been discussed here for Python 3000.  My
 question is:  Is this for real?  How far along is it?  How likely is
 it?

My suggestion for Scheme is not to go down that route. It adds
complexity for little added value and also makes the implementation
slower (due to the frequent conversion from one internal format
to another).
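
To show what "narrowest possible representation" means in practice, a
small sketch in Python 3. The function is hypothetical and only picks a
width; the conversion cost mentioned above arises whenever two strings
of different widths have to be combined:

```python
def narrowest_width(text):
    """Return the bytes per character needed to store text unconverted."""
    highest = max(map(ord, text), default=0)
    if highest < 0x100:
        return 1    # fits in 8 bits per character
    if highest < 0x10000:
        return 2    # fits in 16 bits per character
    return 4        # needs 32 bits per character


widths = [narrowest_width(s) for s in ("abc", "\u20ac", "\U0001D11E")]
```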

Can't comment on Py3k - I'm out of that loop.

If you want to know more about how Unicode was added to Python 2.x
and how it can be used, I suggest you read the following:

Unicode integration (one of the first PEPs ever written :-):

   http://www.python.org/dev/peps/pep-0100

Unicode in Python:

   http://www.egenix.com/files/python/EuroPython2002-Python-and-Unicode.pdf

Designing Unicode-aware Applications in Python:


http://www.egenix.com/files/python/EPC2006-Developing-Unicode-aware-applications-in-Python.pdf

Hope that helps,
-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Proposal to revert r54204 (splitext change)

2007-03-15 Thread M.-A. Lemburg
On 2007-03-15 07:45, Martin v. Löwis wrote:
 Phillip J. Eby schrieb:
 And yet, that incorrect behavior was clearly intended by the author(s) 
 of the code, test, and docstrings.

 As it happens, Guido wrote that code (16 years ago) and the docstring (9 
 years ago), in the case of the posixpath module at least.
 
 I don't find it that clear that it was the intention, AFAICT, it could 
 have been an accident also. Guido added the doc strings as a 
 contribution from Charles G. Waldman; he may just have documented the
 implemented behavior.
 
 In r4493, Sjoerd Mullender changed splitext (in an incompatible way)
 so that it would split off only the last extension; before, foo.tar.gz
 would be split into 'foo', '.tar.gz'. So it's clear that the intention
 was always to split off the extension; whether or not the behavior
 on dotfiles was considered I cannot tell.
 
 As for Doc/lib, in r6524 Guido changed it to document the actual 
 behavior, from
 
 the last component of \var{root} contains no periods,
 and \var{ext} is empty or begins with a period.
 
 to
 
 and \var{ext} is empty or begins with a period and contains
 at most one period.
 
 So it seems the original (Guido's) intention was that it splits
 off all extensions; Sjoerd then changed it to split off only the
 last extension.

Whatever the intention was or has been: the term extension
itself is not well-defined, so there's no obvious right way
to implement an API that splits off an extension.

E.g. in some cases, .tar.gz is considered an extension, in others,
the .gz part is just a transfer encoding and .tar the extension.
Then you have .tgz which is a bit of both. It also depends on the
platform, e.g. on Windows, only the very last part of a filename
is used as extension by the OS to determine the (MIME) type of
a file.

As always, it's best to just write your own application-specific
code to get defined behavior.
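
A sketch of what such application-specific code might look like, in
Python 3. The function name and the list of compound extensions are
assumptions for illustration; each application would pick its own:

```python
import posixpath


def split_compound_ext(filename, compound=(".tar.gz", ".tar.bz2")):
    """Split off a known compound extension, else fall back to splitext."""
    for ext in compound:
        # Require a non-empty root so ".tar.gz" alone is left untouched.
        if filename.endswith(ext) and len(filename) > len(ext):
            return filename[:-len(ext)], ext
    return posixpath.splitext(filename)


stdlib_result = posixpath.splitext("foo.tar.gz")
custom_result = split_compound_ext("foo.tar.gz")
```

The stdlib function splits off only the last component, while the
helper treats .tar.gz as a unit because this particular application
said so.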

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] These csv test cases seem incorrect to me...

2007-03-14 Thread M.-A. Lemburg
Hi Skip,

On 2007-03-12 03:01, [EMAIL PROTECTED] wrote:
 I decided it would be worthwhile to have a csv module written in Python (no
 C underpinnings) for a number of reasons:
 
 * It will probably be easier to add Unicode support to a Python version
 
 * More people will be able to read/grok/modify/fix bugs in a Python
   implementation than in the current mixed Python/C implementation.
 
 * With alternative implementations of Python available (PyPy,
   IronPython, Jython) it makes sense to have a Python version they can
   use.

Lots of good reasons :-)

I've written a Python-only Unicode aware CSV module for a client (mostly
because CSV data tends to be quirky and I needed a quick way of dealing
with corner cases). Perhaps I can get them to donate it to the PSF...

 I'm far from having anything which will pass the current test suite, but in
 diagnosing some of my current failures I noticed a couple test cases which
 seem wrong.  In the TestDialectExcel class I see these two questionable
 tests:
 
  def test_quotes_and_more(self):
      self.readerAssertEqual('"a"b', [['ab']])
 
  def test_quote_and_quote(self):
      self.readerAssertEqual('"a" "b"', [['a "b"']])
 
 It seems to me that if a field starts with a quote it *has* to be a quoted
 field.  Any quotes appearing within a quoted field have to be escaped and
 the field has to end with a quote.  Both of these test cases fail on one or the
 other assumption.  If they are indeed both correct and I'm just looking at
 things crosseyed I think they at least deserve comments explaining why they
 are correct.
 
 Both test cases date from the first checkin.  I performed the checkin
 because of the group developing the module I believe I was the only one with
 checkin privileges at the time, not because I wrote the test cases.
 
 Any ideas about why these test cases are in there?  I can't imagine Excel
 generating either one.

My recommendation: Let the module do whatever Excel does with such data.
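
For what it's worth, the two inputs in question are a quoted field
followed by stray text, and a quoted field followed by a space and more
quotes. Here is a sketch of how the stdlib csv reader in current
Python 3 treats them with the default dialect (the helper name is made
up). It shows the module's lenient reading; whether Excel agrees would
still have to be checked against Excel itself:

```python
import csv


def parse_line(line):
    """Parse one line of CSV text with the default (excel) dialect."""
    return list(csv.reader([line]))


# A quoted field followed by extra text: the closing quote ends the
# quoting and the remaining text is appended to the field.
first = parse_line('"a"b')

# After the closing quote, later quote characters are taken literally.
second = parse_line('"a" "b"')
```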

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] New syntax for 'dynamic' attribute access

2007-02-12 Thread M.-A. Lemburg
On 2007-02-12 16:19, Georg Brandl wrote:
 Tim Delaney asked in particular:
 Have you checked if [the existing uses of getattr, where getattr in
 that scope is a function argument with default value the built-in
 getattr] are intended to bring the getattr name into local scope
 for fast lookup, or to force a binding to the builtin getattr at
 compile time (two common (ab)uses of default arguments)?  If they are,
 they would be better served by the new syntax.
 They're all in Lib/codecs.py, and are of the form:

 class StreamRecoder:
     def __getattr__(self, name,
                     getattr=getattr):

         """ Inherit all other methods from the underlying stream.
         """
         return getattr(self.stream, name)

 Without digging deeper into that code I'm afraid I can't say precisely
 what is going on.
 
 Since that is a special method and ought to have the signature
 __getattr__(self, name), I think it's safe to assume that that's meant
 as an optimization.

I can confirm that: it's a case of fast-local-lookup optimization.
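
A minimal sketch of the idiom, for readers who have not met it. The class names Stream and Recoder are stand-ins invented here, not the codecs.py classes.

```python
class Stream:
    """Stand-in for an underlying stream object."""

    def read(self):
        return "data"


class Recoder:
    """Delegates unknown attributes to the wrapped stream."""

    def __init__(self, stream):
        self.stream = stream

    # getattr=getattr binds the builtin once, at definition time, as a
    # local name; lookups inside the body then skip globals and builtins.
    def __getattr__(self, name, getattr=getattr):
        return getattr(self.stream, name)


recoder = Recoder(Stream())
print(recoder.read())
```

The extra parameter is never meant to be passed by callers; it only exists to carry the default value into the function's local namespace.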

You can add a -1 from me to the list as well: I don't think that
dynamic lookups are common enough to warrant new syntax.

Even if you do add a new syntax for this, using parentheses is
a poor choice IMHO as the resulting code looks too much like a
function call (e.g. callable.(variable)).

Other choices would be square brackets [], but these have the
same problem as they are in use for indexing.

The only brackets that are not yet overloaded in the context
of applying them to an object are curly brackets, so
callable.{variable} would raise enough eyebrows not to be
mistaken for a typo.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Problem between deallocation of modules and func_globals

2007-01-20 Thread M.-A. Lemburg
On 2007-01-20 00:01, Brett Cannon wrote:
 On 1/19/07, M.-A. Lemburg [EMAIL PROTECTED] wrote:
 On 2007-01-19 22:33, Brett Cannon wrote:
 That's a typical error situation you get in __del__ methods at
 the time the interpreter is shut down.

 Yeah, but in this case this is at the end of Py_Initialize() for the
 stuff I am doing to the interpreter.  =)
 Is that in some error branch of Py_Initialize() ? Otherwise
 I don't see how the modules could get garbage-collected.

 
 Nope, it's code I am adding to clean out sys.modules of stuff the user
 didn't import themselves; it's for security reasons.

I'm not sure whether that's really going to increase
security: unloading of modules usually isn't safe and you
cannot be sure that it's possible to reinitialize a C
module once it has been loaded in the process. For Python
modules this is often possible, but there still may be
side-effects of the import that you cannot easily undo.

Perhaps you should just move those modules out to a different
dictionary and keep track of it in the import mechanism, so
that while you can't access the module directly via sys.modules,
the import mechanism still knows that it has been loaded and
reinserts it into sys.modules if it gets imported again.
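
A rough sketch of that idea, written against today's importlib rather than the 2.x import machinery under discussion. The names _hidden_modules, hide_module, _ReuseLoader and HiddenModuleFinder are all hypothetical, and the demo uses a throwaway module object so that no real module is disturbed.

```python
import importlib
import importlib.abc
import importlib.machinery
import sys
import types

_hidden_modules = {}  # hypothetical side table: name -> loaded module


def hide_module(name):
    """Take a loaded module out of sys.modules without unloading it."""
    _hidden_modules[name] = sys.modules.pop(name)


class _ReuseLoader(importlib.abc.Loader):
    """Loader that hands back an already initialized module object."""

    def __init__(self, module):
        self._module = module

    def create_module(self, spec):
        return self._module

    def exec_module(self, module):
        pass  # already executed when it was first imported


class HiddenModuleFinder(importlib.abc.MetaPathFinder):
    """Reinsert a hidden module instead of importing it a second time."""

    def find_spec(self, fullname, path=None, target=None):
        module = _hidden_modules.pop(fullname, None)
        if module is None:
            return None  # not ours; let the normal finders handle it
        return importlib.machinery.ModuleSpec(fullname, _ReuseLoader(module))


sys.meta_path.insert(0, HiddenModuleFinder())

# Demo with a throwaway module object.
demo = types.ModuleType("hidden_demo")
sys.modules["hidden_demo"] = demo
hide_module("hidden_demo")
hidden_while_stashed = "hidden_demo" not in sys.modules
reimported = importlib.import_module("hidden_demo")
```

One side effect worth knowing: the import system rebinds the module's __spec__ to the new spec when it is reinserted, so code that relies on the original spec (reloading, for instance) would need extra care.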

I think that you get more security by explicitly
limiting which modules and packages you allow to be imported
in the first place and restricting what can be done with
sys.path and sys.modules.
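
As an illustration of limiting imports up front, here is a sketch of an allowlist enforced through a meta path finder in current Python. AllowlistFinder and try_import are names made up for this example, and this is not a sandbox: modules already present in sys.modules never reach the finder, and an allowlist has to include the dependencies of every allowed package.

```python
import importlib
import importlib.abc
import sys


class AllowlistFinder(importlib.abc.MetaPathFinder):
    """Refuse to import any top-level package that is not on the allowlist."""

    def __init__(self, allowed):
        self.allowed = frozenset(allowed)

    def find_spec(self, fullname, path=None, target=None):
        top_level = fullname.partition(".")[0]
        if top_level not in self.allowed:
            raise ImportError(f"import of {fullname!r} is not allowed",
                              name=fullname)
        return None  # allowed: defer to the regular finders


def try_import(name, allowed):
    """Import name with the allowlist active; return the module or the error."""
    finder = AllowlistFinder(allowed)
    sys.meta_path.insert(0, finder)
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        return exc
    finally:
        sys.meta_path.remove(finder)
```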

 I'm not exactly sure which global state you are referring to. The
 alias map, the cache used by the search function ?

 encodings._cache .

 Note that the search function registry is a global managed
 in the thread state (it's not stored in any module).

 Right, but that is not the issue.  If you have deleted the reference
 to the encodings module from sys.modules it then sets encodings._cache
 to None.  After the deletion, if you try to encode/decode a unicode
 string you get an AttributeError about how encodings._cache does not
 have a 'get' method since it is now None instead of a dict.  The
 function is fine and still runs, it's just that the global state it
 depends on is no longer the way it assumes it should be.
 While I could add some tricks to have the cache dictionary stay
 alive even after the globals were set to None, I doubt that this
 will really fix the problem.

 The encoding package relies on the import mechanism, the codecs
 module and the _codecs builtin module. Any of these could fail
 to work depending on the order in which the modules get
 GCed.

 There's a reason why things in Py_Finalize() are as carefully
 ordered :-) Perhaps we need to apply some reordering to the
 steps in Py_Initialize() ?!

 
 Nah, I just  need to not delete the modules.  =)

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Problem between deallocation of modules and func_globals

2007-01-19 Thread M.-A. Lemburg
On 2007-01-18 20:53, Brett Cannon wrote:
 I have discovered an issue relating to func_globals for functions and
 the deallocation of the module it is contained within.  Let's say you
 store a reference to the function encodings.search_function from the
 'encodings' module (this came up in C code, but I don't see why it
 couldn't happen in Python code).  Then you delete the one reference to
 the module that is stored in sys.modules, leading to its deallocation.
  That triggers the setting of None to every value in
 encodings.__dict__.
 
 Oops, now the global namespace for that module has everything valued
 at None.  The dict doesn't get deallocated since a reference is held
 by encodings.search_function.func_globals and there is still a
 reference to that (technically held in the interpreter's
 codec_search_path field).  So the function can still execute, but
 throws exceptions like AttributeError because a module variable that
 once held a dict now has None and thus doesn't have the 'get' method.

That's a typical error situation you get in __del__ methods at
the time the interpreter is shut down.

The main reason for setting everything to None first is to
break circular references and make sure that at least some
of the object destructors can run.
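
The failure mode can be reproduced in miniature. This is only a simulation under stated assumptions: the module cache_demo and its lookup function are invented, the attribute is spelled __globals__ in current Python (func_globals in 2.x), and the clearing step is done by hand because the interpreter's own teardown behaviour differs between versions.

```python
import types

# Build a small module whose function depends on module-level state.
module = types.ModuleType("cache_demo")
exec(
    "_cache = {}\n"
    "def lookup(key):\n"
    "    return _cache.get(key)\n",
    module.__dict__,
)
lookup = module.lookup

# The function holds the module's namespace dict, not the module itself.
shares_namespace = lookup.__globals__ is module.__dict__

# Simulate the clearing step by hand: every global is rebound to None.
for name in list(module.__dict__):
    if not name.startswith("__"):
        module.__dict__[name] = None

try:
    lookup("utf-8")
    error = None
except AttributeError as exc:
    error = exc  # _cache is None now, so it has no 'get' method
```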

 My question is whether this is at all worth trying to rectify.  Since
 Google didn't turn anything up I am going to guess this is not exactly
 a common thing.  =)  That would lead me to believe some (probably
 most) of you will say, just leave it alone and work around it.

If you can come up with a better way, sure :-)

 The other option I can think of is to store a reference to the module
 instead of just to its __dict__ in the function.  The problem with
 that is we end up with a circular dependency of the functions in
 modules having a reference to the module but then the module having a
 reference to the functions.  I tried not having the values in the
 module's __dict__ set to None if the reference count was above 1 and
 that solved this issue, but that leads to dangling references on
 anything in that dict that does not have a reference stored away
 somewhere else like encodings.search_function.
 
 Anybody have any ideas on how to deal with this short of rewriting
 some codecs stuff so that they don't depend on global state in the
 module or just telling me to just live with it?

I'm not exactly sure which global state you are referring to. The
alias map, the cache used by the search function ?

Note that the search function registry is a global managed
in the thread state (it's not stored in any module).

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] pep-3108.txt

2007-01-04 Thread M.-A. Lemburg
On 2007-01-03 01:42, Brett Cannon wrote:
 On 1/2/07, M.-A. Lemburg [EMAIL PROTECTED] wrote:
   +Open Issues
   +===
   +
   +Consolidate dependent modules together into a single module or
  package?
   ...
   +Consolidate certain modules with similar themes together in a
 package?
  
 +--
   ...
 
  If you do follow this route, please take the chance to place
  the whole Python stdlib under a single package. That way we'll
  avoid name clashes with existing packages and modules now and
  in the future.
 
 
  That has been suggested before (including by me) and Guido has always
 shot
  it down.  That's why I left it out of this proposal.

 Even if it is shot down again, it still deserves to be documented
 together with the reasons for being shot down.

 This is a once-in-a-lifetime chance, so it would be sad if it were
 not taken into account.

 The extra effort would be minimal - the renaming would have to be
 done using a script anyway and adding an extra 'from py import '
 prefix to the modules wouldn't really make the renaming more
 complicated ;-)
 
 
 I was about to start writing an open issue on this since the biggest
 objection from Guido I could find on this topic is
 http://mail.python.org/pipermail/python-dev/2002-July/026409.html , but
 then
 it started to feel like a separate PEP to me.  So I think I am going to
 pass
 on taking on this topic and let someone else tackle it in a PEP.  Sorry,
 MAL, but I need to worry about my sanity on this one.  =)

Oh well, it seemed like a perfect fit for the scope of PEP 3108.

Guido's reply seems to suggest that he's in favor of introducing
a multi-package stdlib structure:


  I'm rejecting the proposal of a single top-level package named python.

 You've written that before, but you still haven't given any
 explanation of why a single package would be worse than a
 multi-level hierarchy of modules (e.g. grouped by application
 space).

Because a single package doesn't have any other benefits besides
getting out of the way from 3rd party developers.

At least a proper hierarchy would have the other benefits of grouping.
(But better make it a shallow hierarchy!  Remember "Flat is better
than nested.")


AFAICT, he was only objecting to having a single package without any
extra restructuring.

Then again, the post is from 2002 - so things may have changed.

There have been a couple of attempts to reorg the stdlib into
packages, but AFAIR, all of them were withdrawn
due to the problem of finding a suitable grouping (often enough,
a module would be suitable for more than just one functional
package, e.g. urllib would fit io as well as net) or
lack of support from the developers.

Now that we're discussing moving the include files into
a subdirectory (for much the same reasons), I think it's
time to reboot the discussion of a Python package with or
without possible subpackages.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] 2.5.1 plans

2007-01-04 Thread M.-A. Lemburg
On 2007-01-04 07:59, Neal Norwitz wrote:
 The current schedule looks like it's shaping up to be:
 
 Wed, Jan 24 for 2.5.1c1
 Wed Jan 31 for 2.5.1
 
 It would be great if you could comment on some of the bug reports
 below.  I think several already have patches/suggested fixes.
 
 It's not clear to me if this should be fixed, but it's got a high priority:
 
 http://python.org/sf/1467929 %-formatting and dicts

+1

The patch is ready to be applied. The only reason it got delayed
was the 2.5 release timing.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] pep-3108.txt

2007-01-03 Thread M.-A. Lemburg
On 2007-01-03 00:35, Barry Warsaw wrote:
 On Jan 2, 2007, at 5:41 PM, M.-A. Lemburg wrote:
 
 Note that as side-effect of this it becomes a lot harder to manipulate
 PYTHONPATH to trick Python into loading a standard module from a
 non-standard location, improving security and robustness of the
 Python installations.
 
 Sometimes though you want to do this, as when you want your application
 to ensure it gets a particular version of a standard library module,
 regardless of the version of Python being used.  And now we're back to
 application-specific site-packages ;).

Well, I guess that's a rather particular use case and can probably
only be safely implemented by the maintainer of the module or package
in question ;-)

In such (rare) cases, it should be possible to use one of the harder
ways to achieve this:

 * monkey patching the package
 * using package.__path__ to redirect the in-package search
 * creating a private copy of the whole package which then has
   the modified modules and packages in place
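
A sketch of the second option, redirecting the in-package search through package.__path__. Everything here is a throwaway demonstration: the package path_demo_pkg, the overrides directory and the write helper are invented, and the files live in a temporary directory.

```python
import importlib
import os
import sys
import tempfile

# Lay out a throwaway package plus a directory holding a replacement module.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "path_demo_pkg")
override_dir = os.path.join(root, "overrides")
os.makedirs(pkg_dir)
os.makedirs(override_dir)


def write(path, text):
    with open(path, "w") as handle:
        handle.write(text)


write(os.path.join(pkg_dir, "__init__.py"), "")
write(os.path.join(pkg_dir, "helper.py"), "VERSION = 'stock'\n")
write(os.path.join(override_dir, "helper.py"), "VERSION = 'patched'\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

pkg = importlib.import_module("path_demo_pkg")

# Directories listed earlier in __path__ win the in-package search.
pkg.__path__.insert(0, override_dir)

helper = importlib.import_module("path_demo_pkg.helper")
print(helper.VERSION)
```

The redirect has to happen before the submodule is first imported; once path_demo_pkg.helper sits in sys.modules, changing __path__ no longer affects it.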

Regarding application specific package setups:

In my experience it's better to have an application specific
sys.path setup function that manages this, rather than trying
to manipulate PYTHONPATH or trying to tweak Python's stdlib
site.py into using some particular way of setting up application
specific paths which then makes interop harder for all
applications using Python, rather than just the few that
require such setups.

The application can then call this path setup function early
on in the startup phase to make sure that the rest of the
startup and the application's main code then imports the
right modules and packages.
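
A minimal sketch of such a path setup function. The name setup_app_path and the default subdirectory names "lib" and "vendor" are assumptions made for illustration; a real application would use its own layout.

```python
import os
import sys


def setup_app_path(app_root, subdirs=("lib", "vendor")):
    """Put the application's own directories at the front of sys.path."""
    added = []
    # Walk the list backwards so the first subdir ends up first on sys.path.
    for subdir in reversed(subdirs):
        path = os.path.abspath(os.path.join(app_root, subdir))
        if path in sys.path:
            sys.path.remove(path)  # avoid duplicates on repeated calls
        sys.path.insert(0, path)
        added.insert(0, path)
    return added
```

Calling it as the first statement of the application's entry point, before any other application imports, keeps the effect local to that application and leaves PYTHONPATH and site.py alone.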

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] pep-3108.txt

2007-01-02 Thread M.-A. Lemburg
On 2007-01-02 01:02, brett.cannon wrote:
 Author: brett.cannon
 Date: Tue Jan  2 01:02:41 2007
 New Revision: 53204
 
 Added:
peps/trunk/pep-3108.txt   (contents, props changed)
 Modified:
peps/trunk/pep-.txt
 Log:
 Add PEP 3108: Standard Library Reorganization.
 
...

 +Open Issues
 +===
 +
 +Consolidate dependent modules together into a single module or package?
 ...
 +Consolidate certain modules with similar themes together in a package?
 +--
 ...

If you do follow this route, please take the chance to place
the whole Python stdlib under a single package. That way we'll
avoid name clashes with existing packages and modules now and
in the future.

Together with absolute imports this also improves the readability
of modules since it becomes immediately clear where the imported code
is coming from.

Note that as side-effect of this it becomes a lot harder to manipulate
PYTHONPATH to trick Python into loading a standard module from a
non-standard location, improving security and robustness of the
Python installations.

 +Packages are often used to group together modules that have a similar
 +theme but do not have any direct relationship or dependency upon each
 +other.  For Python 3.0 obvious groupings could be done since renaming
 +of various modules is already occurring.
 +
 +* collections
 ++ heapq
 ++ Queue
 ++ sets
 ++ UserDict
 ++ UserList
 ++ What to do with UserString?
 +- Have a package for Python implementations of built-in types
 +  instead of putting the User* modules into 'collections'?
 +* mac
 ++ Various Mac-specific modules.
 ++ Same can be done for other platform-specific code.
 +* Profiling
 ++ cProfile
 ++ profile
 ++ hotshot
 ++ pstats
 +* email
 ++ mailbox
 ++ mhlib
 +* Databases
 ++ anydbm
 ++ dbhash
 ++ dbm
 ++ bsddb
 ++ dumbdbm
 ++ gdbm
 ++ whichdb
 +* Audio
 ++ aifc
 ++ audioop
 ++ chunk
 ++ ossaudiodev
 ++ sunau
 ++ wave
 ++ winsound
 +* Servers
 ++ BaseHTTPServer
 ++ CGIHTTPServer
 ++ DocXMLRPCServer
 ++ SimpleHTTPServer
 ++ SimpleXMLRPCServer
 ++ SocketServer

The package names should probably be converted to lower-case to
follow PEP 8.

Thanks and Happy New Year,
-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] pep-3108.txt

2007-01-02 Thread M.-A. Lemburg
On 2007-01-02 23:54, Brett Cannon wrote:
 On 1/2/07, M.-A. Lemburg [EMAIL PROTECTED] wrote:

 On 2007-01-02 01:02, brett.cannon wrote:
  Author: brett.cannon
  Date: Tue Jan  2 01:02:41 2007
  New Revision: 53204
 
  Added:
 peps/trunk/pep-3108.txt   (contents, props changed)
  Modified:
 peps/trunk/pep-.txt
  Log:
  Add PEP 3108: Standard Library Reorganization.
 
 ...
 
  +Open Issues
  +===
  +
  +Consolidate dependent modules together into a single module or
 package?
  ...
  +Consolidate certain modules with similar themes together in a package?
  +--
  ...

 If you do follow this route, please take the chance to place
 the whole Python stdlib under a single package. That way we'll
 avoid name clashes with existing packages and modules now and
 in the future.
 
 
 That has been suggested before (including by me) and Guido has always shot
 it down.  That's why I left it out of this proposal.

Even if it is shot down again, it still deserves to be documented
together with the reasons for being shot down.

This is a once-in-a-lifetime chance, so it would be sad if it were
not taken into account.

The extra effort would be minimal - the renaming would have to be
done using a script anyway and adding an extra 'from py import '
prefix to the modules wouldn't really make the renaming more
complicated ;-)

 Together with absolute imports this also improves the readability
 of modules since it becomes immediately clear where the imported code
 is coming from.

 Note that as side-effect of this it becomes a lot harder to manipulate
 PYTHONPATH to trick Python into loading a standard module from a
 non-standard location, improving security and robustness of the
 Python installations.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] __str__ and unicode

2006-12-06 Thread M.-A. Lemburg
On 2006-12-06 10:26, Fredrik Lundh wrote:
 over at my work copy of the python language reference, Adrian Holovaty
 asked about the exact semantics of the __str__ hook:
 
 http://effbot.org/pyref/__str__
 
The return value must be a string object. Does this mean it can be a
*Unicode* string object? This distinction is ambiguous to me because
 unicode objects and string objects are both subclasses of basestring.
 May a __str__() return a Unicode object?
 
 I seem to remember earlier discussions on this topic, but don't recall when
 and what.  From what I can tell, __str__ may return a Unicode object, but
 only if it can be converted to an 8-bit string using the default encoding.
 Is this on purpose or by accident?  Do we have a plan for improving the
 situation in future 2.X releases ?

This was added to make the transition to all Unicode in 3k easier:

.__str__() may return a string or Unicode object.

.__unicode__() must return a Unicode object.

There is no restriction on the content of the Unicode string
for .__str__().

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] __str__ and unicode

2006-12-06 Thread M.-A. Lemburg
On 2006-12-06 10:46, M.-A. Lemburg wrote:
 On 2006-12-06 10:26, Fredrik Lundh wrote:
 over at my work copy of the python language reference, Adrian Holovaty
 asked about the exact semantics of the __str__ hook:

 http://effbot.org/pyref/__str__

The return value must be a string object. Does this mean it can be a
*Unicode* string object? This distinction is ambiguous to me because
 unicode objects and string objects are both subclasses of basestring.
 May a __str__() return a Unicode object?

 I seem to remember earlier discussions on this topic, but don't recall when
 and what.  From what I can tell, __str__ may return a Unicode object, but
 only if it can be converted to an 8-bit string using the default encoding.
 Is this on purpose or by accident?  Do we have a plan for improving the
 situation in future 2.X releases ?
 
 This was added to make the transition to all Unicode in 3k easier:
 
 .__str__() may return a string or Unicode object.
 
 .__unicode__() must return a Unicode object.
 
 There is no restriction on the content of the Unicode string
 for .__str__().

One more thing, since these two hooks are commonly used with str() and
unicode():

* unicode(obj) will first try .__unicode__() and then fall back to .__str__()
  (possibly converting the string return value to Unicode)

* str(obj) will try .__str__() only (possibly converting the Unicode
  return value to a string using the default encoding)

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] __str__ and unicode

2006-12-06 Thread M.-A. Lemburg
On 2006-12-06 10:56, Fredrik Lundh wrote:
 M.-A. Lemburg wrote:
 
 This was added to make the transition to all Unicode in 3k easier:
 
 thanks for the clarification.
 
 do you recall when this was added?  2.5?

Not really, only that it was definitely before 2.5.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-28 Thread M.-A. Lemburg
Travis E. Oliphant wrote:
 
 
 
 
 PEP: unassigned
 Title: Adding data-type objects to the standard library
   Attributes
 
  kind  --  returns the basic kind of the data-type. The basic kinds
  are:
't' - bit, 
'b' - bool, 
'i' - signed integer, 
'u' - unsigned integer,
'f' - floating point,  
'c' - complex floating point, 
'S' - string (fixed-length sequence of char),
'U' - fixed length sequence of UCS4,

Shouldn't this read "fixed length sequence of Unicode" ?!
The underlying code unit format (UCS2 and UCS4) depends on the
Python version.

'O' - pointer to PyObject,
'V' - Void (anything else).
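
For readers who want to check which code unit format a given interpreter uses, a small sketch based on sys.maxunicode. The function name unicode_build_kind is made up here. On Python 3.3 and later the result is always the wide value, since those versions no longer distinguish narrow and wide builds.

```python
import sys


def unicode_build_kind():
    """Report whether this interpreter is a narrow or wide Unicode build."""
    # Narrow builds cap sys.maxunicode at 0xFFFF; wide builds at 0x10FFFF.
    return "UCS2" if sys.maxunicode == 0xFFFF else "UCS4"


print(unicode_build_kind(), hex(sys.maxunicode))
```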

-- 
Marc-Andre Lemburg
eGenix.com


