[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread Martin v. Löwis
I'm proposing the following PEP for inclusion into Python 3.1. Please comment. Regards, Martin PEP: 383 Title: Non-decodable Bytes in System Character Interfaces Version: $Revision: 71793 $ Last-Modified: $Date: 2009-04-22 08:42:06 +0200 (Mi, 22. Apr 2009) $ Author: Martin v. Löwis

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread Nick Coghlan
Martin v. Löwis wrote: I'm proposing the following PEP for inclusion into Python 3.1. Please comment. That seems like a much nicer solution than having parallel bytes/Unicode APIs everywhere. When the locale encoding is UTF-8, would UTF-8b also be used for the command line decoding and

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread glyph
On 06:50 am, mar...@v.loewis.de wrote: I'm proposing the following PEP for inclusion into Python 3.1. Please comment. To convert non-decodable bytes, a new error handler python-escape is introduced, which decodes non-decodable bytes using into a private-use character U+F01xx, which is

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread Walter Dörwald
Martin v. Löwis wrote: I'm proposing the following PEP for inclusion into Python 3.1. Please comment. Regards, Martin PEP: 383 Title: Non-decodable Bytes in System Character Interfaces Version: $Revision: 71793 $ Last-Modified: $Date: 2009-04-22 08:42:06 +0200 (Mi, 22. Apr 2009) $

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread MRAB
Martin v. Löwis wrote: [snip] To convert non-decodable bytes, a new error handler python-escape is introduced, which decodes non-decodable bytes using into a private-use character U+F01xx, which is believed to not conflict with private-use characters that currently exist in Python codecs. The
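
A minimal sketch of the round trip described in the quoted PEP text, assuming the draft mapping of a non-decodable byte 0xNN to the private-use character U+F01NN. The handler name `python-escape-sketch`, the constant `PUA_BASE`, and the function below are hypothetical and are registered through the standard `codecs.register_error` hook; this is not the PEP's own implementation. For reference, the handler that eventually shipped in Python 3.1 is named `surrogateescape` and uses lone surrogates U+DC80..U+DCFF rather than private-use characters.

```python
import codecs

# Assumption taken from the draft wording: byte 0xNN <-> U+F01NN.
PUA_BASE = 0xF0100


def python_escape_sketch(exc):
    """Hypothetical error handler that escapes and unescapes raw bytes."""
    if isinstance(exc, UnicodeDecodeError):
        # Decoding: replace each offending byte with its private-use character.
        bad = exc.object[exc.start:exc.end]
        return "".join(chr(PUA_BASE + b) for b in bad), exc.end
    if isinstance(exc, UnicodeEncodeError):
        # Encoding: turn escaped characters back into the original bytes.
        out = bytearray()
        for ch in exc.object[exc.start:exc.end]:
            code = ord(ch) - PUA_BASE
            if not 0x80 <= code <= 0xFF:
                raise exc  # a genuinely unencodable character, not an escape
            out.append(code)
        return bytes(out), exc.end
    raise exc


codecs.register_error("python-escape-sketch", python_escape_sketch)

raw = b"caf\xe9.txt"  # 0xE9 is not valid ASCII
text = raw.decode("ascii", "python-escape-sketch")
restored = text.encode("ascii", "python-escape-sketch")
print(ascii(text), restored == raw)
```

The sketch uses the ASCII codec on purpose: the escape only survives the return trip when the target encoding cannot encode the private-use character itself, which is the concern raised later in the thread for UTF-8 and UTF-16/32.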

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread Dirkjan Ochtman
On 22/04/2009 14:20, gl...@divmod.com wrote: -1. On UNIX, character data is not sufficient to represent paths. We must, must, must continue to have a simple bytes interface to these APIs. Covering it up in layers of obscure encoding hacks will not make the problem go away, it will just make it

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread Benjamin Peterson
2009/4/22 Dirkjan Ochtman dirk...@ochtman.nl: On 22/04/2009 14:20, gl...@divmod.com wrote: -1. On UNIX, character data is not sufficient to represent paths. We must, must, must continue to have a simple bytes interface to these APIs. Covering it up in layers of obscure encoding hacks will not

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread Antoine Pitrou
Dirkjan Ochtman dirkjan at ochtman.nl writes: As a hg developer, I have to concur. Keeping bytes-based APIs intact would make porting hg to py3k much, much easier. You may be able to imagine that dealing with paths correctly cross-platform on a VCS is a major PITA, and py3k is currently

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread Martin v. Löwis
correct - corrected Thanks, fixed. To convert non-decodable bytes, a new error handler python-escape is introduced, which decodes non-decodable bytes using into a private-use character U+F01xx, which is believed to not conflict with private-use characters that currently exist in Python

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread R. David Murray
On Wed, 22 Apr 2009 at 13:29, Benjamin Peterson wrote: 2009/4/22 Dirkjan Ochtman dirk...@ochtman.nl: On 22/04/2009 14:20, gl...@divmod.com wrote: -1. On UNIX, character data is not sufficient to represent paths. We must, must, must continue to have a simple bytes interface to these APIs.

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread Martin v. Löwis
-1. On UNIX, character data is not sufficient to represent paths. We must, must, must continue to have a simple bytes interface to these APIs. I'd like to respond to this concern in three ways: 1. The PEP doesn't remove any of the existing interfaces. So if the interfaces for

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread Martin v. Löwis
Dirkjan Ochtman wrote: On 22/04/2009 14:20, gl...@divmod.com wrote: -1. On UNIX, character data is not sufficient to represent paths. We must, must, must continue to have a simple bytes interface to these APIs. Covering it up in layers of obscure encoding hacks will not make the problem go

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread Martin v. Löwis
Yeah, but IIRC a complete set of bytes APIs doesn't exist yet in py3k. Define complete. I'm not aware of any interfaces wrt. file IO that are lacking, so which ones were you thinking of? Python doesn't currently provide a way to access environment variables and command line arguments as bytes.

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread Martin v. Löwis
MRAB wrote: Martin v. Löwis wrote: [snip] To convert non-decodable bytes, a new error handler python-escape is introduced, which decodes non-decodable bytes using into a private-use character U+F01xx, which is believed to not conflict with private-use characters that currently exist in

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread R. David Murray
On Wed, 22 Apr 2009 at 21:21, Martin v. Löwis wrote: Yeah, but IIRC a complete set of bytes APIs doesn't exist yet in py3k. Define complete. I'm not aware of any interfaces wrt. file IO that are lacking, so which ones were you thinking of? Python doesn't currently provide a way to access

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread Walter Dörwald
Martin v. Löwis wrote: correct - corrected Thanks, fixed. To convert non-decodable bytes, a new error handler python-escape is introduced, which decodes non-decodable bytes using into a private-use character U+F01xx, which is believed to not conflict with private-use characters that

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread M.-A. Lemburg
On 2009-04-22 22:06, Walter Dörwald wrote: Martin v. Löwis wrote: correct - corrected Thanks, fixed. To convert non-decodable bytes, a new error handler python-escape is introduced, which decodes non-decodable bytes using into a private-use character U+F01xx, which is believed to not

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread Martin v. Löwis
The python-escape codec is only used/meaningful if the env encoding is not UTF-8. For any other encoding, it is assumed that no character actually maps to the private-use characters. Which should be true for any encoding from the pre-unicode era, but not for UTF-16/32 and variants. Right.
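
The assumption discussed above can be checked directly. The sketch below is illustrative only: it takes one character from the draft's U+F01xx range (the helper `encodable` and the chosen character are hypothetical) and asks which codecs can already encode it. Where a codec can, an escaped byte would be indistinguishable from a real private-use character in the data.

```python
# One character from the draft's escape range (assumed mapping: 0xE9 -> U+F01E9).
PUA_CHAR = "\U000F01E9"


def encodable(text, encoding):
    """Return True if `text` can be encoded with `encoding` under strict errors."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Pre-Unicode, single-byte codecs have no mapping for the private-use range.
legacy = {name: encodable(PUA_CHAR, name) for name in ("iso-8859-1", "koi8-r", "cp1252")}

# Unicode transformation formats can represent every private-use character.
unicode_formats = {name: encodable(PUA_CHAR, name) for name in ("utf-8", "utf-16", "utf-32")}

print(legacy)
print(unicode_formats)
```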

Re: [Python-Dev] Issue5434: datetime.monthdelta

2009-04-22 Thread Jess Austin
On Thu, Apr 16, 2009 at 8:01 PM, Jess Austin jess.aus...@gmail.com wrote: These operations are useful in particular contexts.  What I've submitted is also useful, and currently isn't easy in core, batteries-included python.  While I would consider the foregoing interpretation of the Zen to be

Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

2009-04-22 Thread glyph
On 07:17 pm, mar...@v.loewis.de wrote: -1. On UNIX, character data is not sufficient to represent paths. We must, must, must continue to have a simple bytes interface to these APIs. I'd like to respond to this concern in three ways: 1. The PEP doesn't remove any of the existing