Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-10-01 Thread Nick Coghlan
Bill Janssen wrote: > Perhaps PEP 355 just went too far. That was certainly one of the major objections to it. A filesystem path object which didn't try to combine a half-dozen different modules into methods on a single object, but instead focused on solving a few specific problems with using raw

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-10-01 Thread glyph
On 03:54 pm, [EMAIL PROTECTED] wrote: I'm actually sort of liking this idea. A Pathname class, for convenience a subtype of String, but containing the underlying binary representation used by the OS. Even non-unicode pathnames could be represented. On the one hand, I agree with you - excep
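
The Pathname idea quoted above (a string subtype that also carries the OS-level bytes) could look roughly like the sketch below. The class name comes from the message; the `raw` attribute, the constructor signature and the UTF-8 default are assumptions made only for illustration, and nothing like this was confirmed as the design under discussion.

```python
class Pathname(str):
    """Hypothetical str subclass that remembers the bytes the OS returned."""

    def __new__(cls, raw: bytes, encoding: str = "utf-8"):
        # Undecodable bytes become U+FFFD in the text form; the original
        # bytes stay available on the instance (assumed attribute: raw).
        text = raw.decode(encoding, errors="replace")
        self = super().__new__(cls, text)
        self.raw = raw
        return self


good = Pathname(b"report.txt")
bad = Pathname(b"r\xe9port.doc")  # a Latin-1 byte, not valid UTF-8
```

Here `bad` still behaves as a str for display purposes, while `bad.raw` keeps the exact bytes needed to reopen the file.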

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-10-01 Thread Ulrich Eckhardt
On Tuesday 30 September 2008, M.-A. Lemburg wrote: > On 2008-09-30 08:00, Martin v. Löwis wrote: > >> Change the default file system encoding to store bytes in Unicode is > >> like introducing a new Python type: . > > > > Exactly. Seems like the best solution to me, despite your polemics. > > Not a

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-10-01 Thread glyph
On 03:32 am, [EMAIL PROTECTED] wrote: On Sep 30, 2008, at 10:06 PM, [EMAIL PROTECTED] wrote: Can you clarify what proposal you are supporting for Python: Sure. Neither of your descriptions is terribly accurate, but I'll try to explain. 1) Two sets of APIs, one returning unicode strings, an

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-10-01 Thread glyph
On 30 Sep, 09:22 pm, [EMAIL PROTECTED] wrote: On Tue, Sep 30, 2008 at 1:04 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: Guido van Rossum wrote: On Mon, Sep 29, 2008 at 11:00 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: Martin, I don't understand why you are in favor of storing raw by

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-10-01 Thread Jack Jansen
On 1-Oct-2008, at 00:32 , Martin v. Löwis wrote: How does windows (and Python on windows) handle NFC versus NFD issues? That's left to the application. Can I have two files called "ümlaut.txt", one in NFD and one NFC form? Yes, you can. It sounds confusing, but only in a theoretical

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-10-01 Thread Jack Jansen
On 30-Sep-2008, at 23:42 , Martin v. Löwis wrote: It's the other way 'round: On Windows, Unicode file names are the natural choice, and byte strings have limitations. In a sense, Windows got it right - but then, they started later. Unix missed the opportunity of declaring that all file APIs

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-10-01 Thread Martin v. Löwis
>> SQLite has a similar problem with NULLs, and I'm definitely sticking >> paths in there, too. > > I think that you can say "all C libraries". Just for the sake of nit-picking: the socket library, and the regular POSIX stream IO library (as well as C standard "unformatted" IO) deal just fine wit

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-10-01 Thread Bill Janssen
[EMAIL PROTECTED] wrote: > > I'm actually sort of liking this idea. A Pathname class, for > > convenience > > a subtype of String, but containing the underlying binary > > representation > >used by the OS. Even non-unicode pathnames could be represented. > > On the one hand, I agree with you -

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-10-01 Thread Bill Janssen
M.-A. Lemburg <[EMAIL PROTECTED]> wrote: > On 2008-10-01 09:54, Ulrich Eckhardt wrote: > > On Tuesday 30 September 2008, M.-A. Lemburg wrote: > >> On 2008-09-30 08:00, Martin v. Löwis wrote: > Change the default file system encoding to store bytes in Unicode is > like introducing a new Py

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-10-01 Thread Nick Coghlan
[EMAIL PROTECTED] wrote: > The reasoning is that a lot of software doesn't care if it's wrong for > edge cases, it's really hard to come up with something that's correct > with respect to all of those edge cases (absurdly difficult, if you need > to stay in the straightjacket of string / bytes type

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-10-01 Thread M.-A. Lemburg
On 2008-10-01 09:54, Ulrich Eckhardt wrote: > On Tuesday 30 September 2008, M.-A. Lemburg wrote: >> On 2008-09-30 08:00, Martin v. Löwis wrote: Change the default file system encoding to store bytes in Unicode is like introducing a new Python type: . >>> Exactly. Seems like the best solut

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
> However, Martin, I can promise you that I will _never_ ask for any > convenience functions related to bytes as a result of this decision. :-) Regards, Martin ___ Python-3000 mailing list Python-3000@python.org http://mail.python.org/mailman/listinfo/p

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Terry Reedy
Martin v. Löwis wrote: Guido van Rossum wrote: However the *proposed* behavior (returns bytes if the arg was bytes, and returns str when the arg was str) is IMO sane, and no different than the polymorphism found in len() or many builtin operations. My concern still is that it brings the bytes

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread James Y Knight
On Sep 30, 2008, at 10:06 PM, [EMAIL PROTECTED] wrote: However, Martin, I can promise you that I will _never_ ask for any convenience functions related to bytes as a result of this decision. I want bytes to come back from filesystem APIs because I intend to have a wrapper layer which knows

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Adam Olsen
On Tue, Sep 30, 2008 at 8:06 PM, <[EMAIL PROTECTED]> wrote: > The proposal of using U+ seems like it would have been almost the same > from such a wrapper's perspective, except (A) people using the filesystem > APIs without the benefit of such a wrapper would have been even more > screwed, and

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Victor Stinner
Le Wednesday 01 October 2008 00:28:22 Martin v. Löwis, vous avez écrit : > I don't think we will manage to release Python 3.0 this year if that > change is to be implemented. And then, I don't think the release manager > will agree to such a delay. The minimum change is to disallow bytes/str mix:
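
As an illustration of what disallowing a bytes/str mix means in practice, here is a minimal sketch run against a current Python 3, where `os.path.join` rejects mixed arguments. It shows present-day behaviour, not necessarily the patch being proposed in this message; `mixed_error` is a name chosen for the example.

```python
import os.path

# Joining a bytes component with a str component is refused outright
# instead of being silently coerced one way or the other.
try:
    os.path.join(b"dir", "file.txt")
except TypeError as exc:
    mixed_error = exc
else:
    mixed_error = None
```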

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread James Y Knight
On Sep 30, 2008, at 6:21 PM, Martin v. Löwis wrote: IOW, Java hasn't solved the problem in the last 10 years. Java is already really bad at being a small little language to write cooperating tools in. I'd never even attempt to write a little pipeline filter in Java -- I've already pretty mu

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 3:21 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: >>> My concern still is that it brings the bytes type into the status of >>> another character string type, which is really bad, and will require >>> further modifications to Python for the lifetime of 3.x. >> >> I'd like

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
> How does windows (and Python on windows) handle NFC versus NFD issues? That's left to the application. > Can I have two files called "ümlaut.txt", one in NFD and one NFC form? Yes, you can. It sounds confusing, but only in a theoretical way. You never have combining characters on Windows (at
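
To make the NFC versus NFD point concrete, here is a small sketch using the standard `unicodedata` module. It only shows that the two normal forms of the same visible name are different strings and different byte sequences; it says nothing about what any particular filesystem does with them.

```python
import unicodedata

name = "\u00fcmlaut.txt"                   # "ümlaut.txt", precomposed
nfc = unicodedata.normalize("NFC", name)   # u-umlaut as one code point
nfd = unicodedata.normalize("NFD", name)   # "u" followed by U+0308

same_text = nfc == nfd                     # the forms compare unequal
nfc_bytes = nfc.encode("utf-8")
nfd_bytes = nfd.encode("utf-8")
```

A filesystem that stores names without normalising them can therefore hold both as distinct entries.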

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
> Yes! If there is a byte-string access method for Windows, pretty please > make it decode from UTF-8 internally and call the Unicode version of the > Windows APIs. The non-unicode windows APIs are pretty much just broken > -- Ideally, Python should never be calling those. I don't think we will ma

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Antoine Pitrou
Le mardi 30 septembre 2008 à 23:33 +0200, "Martin v. Löwis" a écrit : > > By the way, doesn't all this controversy yearn for a PEP? > > There must be a solution for 3.0 (which *could* be "it's a bug, > don't use Python 3.0 on such broken systems"); we can't wait for > a PEP to resolve this issue f

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
>> My concern still is that it brings the bytes type into the status of >> another character string type, which is really bad, and will require >> further modifications to Python for the lifetime of 3.x. > > I'd like to understand why this is "really bad". I thought it was by > design that the str

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 1:29 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > Guido van Rossum wrote: >> However >> the *proposed* behavior (returns bytes if the arg was bytes, and >> returns str when the arg was str) is IMO sane, and no different than >> the polymorphism found in len() or many b

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
> By the way, doesn't all this controversy yearn for a PEP? There must be a solution for 3.0 (which *could* be "it's a bug, don't use Python 3.0 on such broken systems"); we can't wait for a PEP to resolve this issue for 3.0. Most likely, the solution for 3.0 arrives through BDFL pronouncement, i

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 1:04 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > Guido van Rossum wrote: >> On Mon, Sep 29, 2008 at 11:00 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> >> wrote: Change the default file system encoding to store bytes in Unicode is like introducing a new Python

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
> I'm not sure either way. I've heard it claim that Windows filesystem > APIs use Unicode natively. Does Python 3.0 on Windows currently > support filenames expressed as bytes? Yes, it does (at least, os.open, os.stat support them, builtin open doesn't). > Are they encoded first before > passing

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
Guido van Rossum wrote: > However > the *proposed* behavior (returns bytes if the arg was bytes, and > returns str when the arg was str) is IMO sane, and no different than > the polymorphism found in len() or many builtin operations. My concern still is that it brings the bytes type into the statu
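
The proposed behaviour quoted above (bytes in, bytes out; str in, str out) can be sketched with `os.listdir` as it behaves in a current Python 3. The temporary directory and the file name are assumptions made only so the example is self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the listing is not empty.
    open(os.path.join(tmp, "report.txt"), "w").close()

    as_str = os.listdir(tmp)                  # str argument gives str names
    as_bytes = os.listdir(os.fsencode(tmp))   # bytes argument gives bytes names
```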

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
> I didn't get an answer to my question: what is the result characters) stored in unicode> + ? I guess that the result is > instead of raising an error > (invalid types). So again: why introducing a new type instead of reusing > existing Python types? I didn't mean to introduce a new data typ

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Antoine Pitrou
Martin v. Löwis v.loewis.de> writes: > > True. I try to outweigh the need for simplicity in the API against the > need to support all cases. So I see two solutions: > > a) (...) > > b) (...) By the way, doesn't all this controversy yearn for a PEP?

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
Guido van Rossum wrote: > On Mon, Sep 29, 2008 at 11:00 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: >>> Change the default file system encoding to store bytes in Unicode is like >>> introducing a new Python type: . >> Exactly. Seems like the best solution to me, despite your polemics. > > Mar

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread M.-A. Lemburg
On 2008-09-30 18:46, Guido van Rossum wrote: > On Tue, Sep 30, 2008 at 8:20 AM, M.-A. Lemburg <[EMAIL PROTECTED]> wrote: >> In the end, I think it's better not to be clever and just return >> the filenames that cannot be decoded as bytes objects in os.listdir(). > > Unfortunately that's going to b

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 10:28 AM, Georg Brandl <[EMAIL PROTECTED]> wrote: >> How can it *regularly* drive you crazy when "the majority of file names >> [...] encoded correctly" (as you assert above)? > > Because Office files are a) often named with long, seemingly descriptive > filenames, which inva

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Terry Reedy
Guido van Rossum wrote: On Mon, Sep 29, 2008 at 8:55 PM, Terry Reedy <[EMAIL PROTECTED]> wrote: Le Monday 29 September 2008 19:06:01 Guido van Rossum, vous avez écrit : I know I keep flipflopping on this one, but the more I think about it the more I believe it is better to drop those names than

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 8:20 AM, M.-A. Lemburg <[EMAIL PROTECTED]> wrote: > In the end, I think it's better not to be clever and just return > the filenames that cannot be decoded as bytes objects in os.listdir(). Unfortunately that's going to break most code that is using os.listdir(), so it's ha

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread M.-A. Lemburg
On 2008-09-30 16:05, Guido van Rossum wrote: > On Tue, Sep 30, 2008 at 3:31 AM, M.-A. Lemburg <[EMAIL PROTECTED]> wrote: >> On 2008-09-30 08:00, Martin v. Löwis wrote: Change the default file system encoding to store bytes in Unicode is like introducing a new Python type: . >>> Exactly. S

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 3:31 AM, M.-A. Lemburg <[EMAIL PROTECTED]> wrote: > On 2008-09-30 08:00, Martin v. Löwis wrote: >>> Change the default file system encoding to store bytes in Unicode is like >>> introducing a new Python type: . >> >> Exactly. Seems like the best solution to me, despite your

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Victor Stinner
Le Tuesday 30 September 2008 15:53:09 Guido van Rossum, vous avez écrit : > On Mon, Sep 29, 2008 at 11:00 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > >> Change the default file system encoding to store bytes in Unicode is > >> like introducing a new Python type: . > > > > Exactly. Seems lik

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Mon, Sep 29, 2008 at 11:00 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: >> Change the default file system encoding to store bytes in Unicode is like >> introducing a new Python type: . > > Exactly. Seems like the best solution to me, despite your polemics. Martin, I don't understand why you

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Mon, Sep 29, 2008 at 8:55 PM, Terry Reedy <[EMAIL PROTECTED]> wrote: > >> Le Monday 29 September 2008 19:06:01 Guido van Rossum, vous avez écrit : > >>> I know I keep flipflopping on this one, but the more I think about it >>> the more I believe it is better to drop those names than to raise an

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-30 Thread M.-A. Lemburg
On 2008-09-30 08:00, Martin v. Löwis wrote: >> Change the default file system encoding to store bytes in Unicode is like >> introducing a new Python type: . > > Exactly. Seems like the best solution to me, despite your polemics. Not a bad idea... have os.listdir() return Unicode subclasses that

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-29 Thread Martin v. Löwis
> Change the default file system encoding to store bytes in Unicode is like > introducing a new Python type: . Exactly. Seems like the best solution to me, despite your polemics. Regards, Martin

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-29 Thread Stephen J. Turnbull
Guido van Rossum writes: > On Mon, Sep 29, 2008 at 4:29 PM, Victor Stinner > <[EMAIL PROTECTED]> wrote: > > It would be hard for a newbie programmer to understand why he's > > unable to find his very important file ("important r?port.doc") > > using os.listdir(). > *Every* failure in this s

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-29 Thread Adam Olsen
On Mon, Sep 29, 2008 at 5:29 PM, Victor Stinner <[EMAIL PROTECTED]> wrote: > Le Monday 29 September 2008 19:06:01 Guido van Rossum, vous avez écrit : >> >> - listdir(unicode) -> unicode and raise an error on invalid filename >> >> I know I keep flipflopping on this one, but the more I think about

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-29 Thread Martin v. Löwis
> import os > import os.path > import sys > if os.path.supports_unicode_filenames: > cwd = getcwd() > else: > cwd = getcwdb() > encoding = sys.getfilesystemencoding() > for filename in os.listdir(cwd): > if os.path.supports_unicode_filenames: > text = str(fil
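
The quoted script above is cut off, so the following is not a reconstruction of it. It is a separate minimal sketch of the same general approach: list the current directory through the bytes API and decode each name only for display. The `"replace"` error handler and the `display_names` list are assumptions for illustration.

```python
import os
import sys

encoding = sys.getfilesystemencoding()
cwd = os.getcwdb()                     # current directory as bytes

display_names = []
for filename in os.listdir(cwd):       # bytes in, bytes out
    # Decode only for display; keep `filename` itself for any file operation.
    display_names.append(filename.decode(encoding, "replace"))
```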

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-29 Thread Guido van Rossum
On Mon, Sep 29, 2008 at 4:29 PM, Victor Stinner <[EMAIL PROTECTED]> wrote: > Le Monday 29 September 2008 19:06:01 Guido van Rossum, vous avez écrit : >> >> - listdir(unicode) -> unicode and raise an error on invalid filename >> >> I know I keep flipflopping on this one, but the more I think about

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-29 Thread Martin v. Löwis
> The default behaviour should be to use unicode and raise an error if > conversion to unicode fails. It should also be possible to use bytes using > bytes arguments and optional arguments (for getcwd). I'm still opposed to allowing bytes as file names at all in 3k. Python should really strive

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-29 Thread James Y Knight
On Sep 29, 2008, at 6:17 PM, Adam Olsen wrote: I suspect linux will eventually take this route as well. If ext3 had an option for UTF-8 validation I know I'd want it on. That'd move the error to the program creating bogus file names, rather than those trying to read, display, and manage them.

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-29 Thread Adam Olsen
On Mon, Sep 29, 2008 at 11:06 AM, Guido van Rossum <[EMAIL PROTECTED]> wrote: > On Mon, Sep 29, 2008 at 9:45 AM, Georg Brandl <[EMAIL PROTECTED]> wrote: > >> This approach (changing all path-handling functions to accept either bytes >> or string, but not both) is doomed in my eyes. First, there are

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-09-29 Thread Victor Stinner
Patches are already available in the issue #3187 (os.listdir): Le Monday 29 September 2008 14:07:55 Victor Stinner, vous avez écrit : > - listdir(unicode) -> unicode and raise an error on invalid filename Need raise_decoding_errors.patch (don't clear Unicode error > - listdir(bytes) -> bytes Al