Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
> However, Martin, I can promise you that I will _never_ ask for any > convenience functions related to bytes as a result of this decision. :-) Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/pyt

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
> Sorry, maybe I'm just being thick here, but I don't understand how that > is possible. On the physical disk, each Windows file name must be > represented by a byte string, yes? So how is it possible that there are > Windows files with names that can't be represented as a byte string? > What h

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread glyph
On 03:32 am, [EMAIL PROTECTED] wrote: On Sep 30, 2008, at 10:06 PM, [EMAIL PROTECTED] wrote: Can you clarify what proposal you are supporting for Python: Sure. Neither of your descriptions is terribly accurate, but I'll try to explain. 1) Two sets of APIs, one returning unicode strings, an

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Terry Reedy
Guido van Rossum wrote: No, that's because bytes is missing from the explicit list of allowable types in io.open. Victor has a one-line trivial patch for this. Could you try this though? import _fileio _fileio._FileIO(b'tem') >>> import _fileio >>> _fileio._FileIO(b'tem') _fileio._FileIO(3,

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread James Y Knight
On Sep 30, 2008, at 10:06 PM, [EMAIL PROTECTED] wrote: However, Martin, I can promise you that I will _never_ ask for any convenience functions related to bytes as a result of this decision. I want bytes to come back from filesystem APIs because I intend to have a wrapper layer which knows

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Adam Olsen
On Tue, Sep 30, 2008 at 8:06 PM, <[EMAIL PROTECTED]> wrote: > The proposal of using U+ seems like it would have been almost the same > from such a wrapper's perspective, except (A) people using the filesystem > APIs without the benefit of such a wrapper would have been even more > screwed, and

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread glyph
On 30 Sep, 09:22 pm, [EMAIL PROTECTED] wrote: On Tue, Sep 30, 2008 at 1:04 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: Guido van Rossum wrote: On Mon, Sep 29, 2008 at 11:00 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: Martin, I don't understand why you are in favor of storing raw by

Re: [Python-Dev] Filename as byte string in python 2.6 or 3.0?

2008-09-30 Thread glyph
On 30 Sep, 09:37 pm, [EMAIL PROTECTED] wrote: On Tue, Sep 30, 2008 at 11:42 AM, <[EMAIL PROTECTED]> wrote: There are other ways to glean this knowledge; for example, looking at the 'iocharset' or 'nls' mount options supplied to mount various filesystems. I know we could do a better job, but

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Greg Ewing
M.-A. Lemburg wrote: In the end, I think it's better not to be clever and just return the filenames that cannot be decoded as bytes objects in os.listdir(). But since it's a rare occurrence, most applications are just going to ignore the issue, and then fail unexpectedly one day on some unsuspe

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Victor Stinner
On Wednesday 01 October 2008 00:28:22, Martin v. Löwis wrote: > I don't think we will manage to release Python 3.0 this year if that > change is to be implemented. And then, I don't think the release manager > will agree to such a delay. The minimum change is to disallow bytes/str mix:
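
The "disallow bytes/str mix" rule can be illustrated with a minimal sketch. It shows how `os.path.join` behaves in current Python 3 releases, which is an assumption about where the discussion ended up and not a description of the patch under review in this thread.

```python
import os.path

# All-str and all-bytes arguments are both accepted; the result
# has the same type as the arguments.
joined_str = os.path.join("dir", "name")
joined_bytes = os.path.join(b"dir", b"name")

# Mixing the two types is rejected with a TypeError.
try:
    os.path.join("dir", b"name")
    mixed_rejected = False
except TypeError:
    mixed_rejected = True

print(type(joined_str).__name__, type(joined_bytes).__name__, mixed_rejected)
```

Keeping each call homogeneous means a program has to decide explicitly where it converts between the two representations.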

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Michael Urman
On Tue, Sep 30, 2008 at 7:04 PM, Steven D'Aprano <[EMAIL PROTECTED]> wrote: >> I believe on disk it uses UTF-16. > > Which is made up of bytes. There may be byte sequences that are illegal > UTF-16, but that's not what Martin said. I don't understand how there > can be UTF-16 sequences which don't
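
One way a sequence of 16-bit units can fail to be valid UTF-16 is an unpaired surrogate. The sketch below is only an illustration of that general point; the reply above is truncated, so it may or may not be the explanation given there. The variable names are made up for the example.

```python
# A lone high surrogate: a legal 16-bit code unit, but not a valid
# Unicode scalar value on its own.
lone = "\ud800"

# Strict UTF-8 encoding refuses it.
try:
    lone.encode("utf-8")
    strict_utf8_ok = True
except UnicodeEncodeError:
    strict_utf8_ok = False

# The raw 16-bit unit can still be written out if the codec is told
# to pass surrogates through unchanged.
raw16 = lone.encode("utf-16-le", "surrogatepass")

print(strict_utf8_ok, raw16)
```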

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Steven D'Aprano
On Wed, 1 Oct 2008 09:21:37 am you wrote: > On Tue, Sep 30, 2008 at 4:08 PM, Steven D'Aprano <[EMAIL PROTECTED]> wrote: > > On Wed, 1 Oct 2008 07:40:01 am Martin v. Löwis wrote: > >> >> On Windows, we might reject bytes filenames for all file > >> >> operations: open(), unlink(), os.path.join(), e

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 4:08 PM, Steven D'Aprano <[EMAIL PROTECTED]> wrote: > On Wed, 1 Oct 2008 07:40:01 am Martin v. Löwis wrote: >> >> On Windows, we might reject bytes filenames for all file >> >> operations: open(), unlink(), os.path.join(), etc. (raise a >> >> TypeError or UnicodeError) >> >

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Steven D'Aprano
On Wed, 1 Oct 2008 07:40:01 am Martin v. Löwis wrote: > >> On Windows, we might reject bytes filenames for all file > >> operations: open(), unlink(), os.path.join(), etc. (raise a > >> TypeError or UnicodeError) > > > > Since I've seen no objections to this yet: please no. If we offer a > > "lower

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Jack Jansen
On 1-Oct-2008, at 00:32 , Martin v. Löwis wrote: How does windows (and Python on windows) handle NFC versus NFD issues? That's left to the application. Can I have two files called "ümlaut.txt", one in NFD and one NFC form? Yes, you can. It sounds confusing, but only in a theoretical

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread James Y Knight
On Sep 30, 2008, at 6:21 PM, Martin v. Löwis wrote: IOW, Java hasn't solved the problem in the last 10 years. Java is already really bad at being a small little language to write cooperating tools in. I'd never even attempt to write a little pipeline filter in Java -- I've already pretty mu

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 3:21 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: >>> My concern still is that it brings the bytes type into the status of >>> another character string type, which is really bad, and will require >>> further modifications to Python for the lifetime of 3.x. >> >> I'd like

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
> How does windows (and Python on windows) handle NFC versus NFD issues? That's left to the application. > Can I have two files called "ümlaut.txt", one in NFD and one NFC form? Yes, you can. It sounds confusing, but only in a theoretical way. You never have combining characters on Windows (at
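
The NFC/NFD distinction can be made concrete with a short sketch using the standard `unicodedata` module. The file name is the one from the question; nothing here touches a real filesystem.

```python
import unicodedata

name = "\u00fcmlaut.txt"  # "ümlaut.txt" written with a precomposed ü

nfc = unicodedata.normalize("NFC", name)  # ü as one code point
nfd = unicodedata.normalize("NFD", name)  # u followed by a combining diaeresis

# The two forms render identically but compare unequal, so a filesystem
# that does not normalize names can hold both at once.
print(len(nfc), len(nfd), nfc == nfd)
```

An application that wants to treat the two as the same name has to normalize both sides before comparing.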

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
> Yes! If there is a byte-string access method for Windows, pretty please > make it decode from UTF-8 internally and call the Unicode version of the > Windows APIs. The non-unicode windows APIs are pretty much just broken > -- Ideally, Python should never be calling those. I don't think we will ma

Re: [Python-Dev] Filename as byte string in python 2.6 or 3.0?

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 3:18 PM, Nick Coghlan <[EMAIL PROTECTED]> wrote: > That said, I don't think this is something we (or, more to the point, > Guido) need to make a decision on right now - for 3.0, having > bytes-level APIs that can see everything, and Unicode APIs that ignore > badly encoded f

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
>> My concern still is that it brings the bytes type into the status of >> another character string type, which is really bad, and will require >> further modifications to Python for the lifetime of 3.x. > > I'd like to understand why this is "really bad". I thought it was by > design that the str

Re: [Python-Dev] Filename as byte string in python 2.6 or 3.0?

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 2:43 PM, Nick Coghlan <[EMAIL PROTECTED]> wrote: > Of the suggestions I've seen so far, I like Marcin's Mono-inspired > NULL-escape codec idea the best. Since these strings all come from parts > of the environment where NULLs are not permitted, a simple "'\0' in > text" chec
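
A rough sketch of what a NUL-escape scheme could look like is given below. The exact design proposed in the thread is not visible in these snippets, so the escaping format (a NUL followed by a character whose code point is the raw byte value), the error-handler name `nulescape_sketch`, and the helper names are all assumptions made for illustration.

```python
import codecs


def _nul_escape(exc):
    # Hypothetical error handler: replace each undecodable byte with
    # "\0" followed by the character whose code point is that byte.
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    return "".join("\0" + chr(b) for b in bad), exc.end


codecs.register_error("nulescape_sketch", _nul_escape)


def decode_name(raw: bytes) -> str:
    # Assumes raw contains no NUL byte, as is true of file names.
    return raw.decode("utf-8", "nulescape_sketch")


def encode_name(text: str) -> bytes:
    out = bytearray()
    chars = iter(text)
    for ch in chars:
        if ch == "\0":
            out.append(ord(next(chars)))  # restore the escaped raw byte
        else:
            out += ch.encode("utf-8")
    return bytes(out)


raw = b"caf\xe9.txt"  # Latin-1 encoded name, invalid as UTF-8
text = decode_name(raw)
print("\0" in text, encode_name(text) == raw)
```

With such a scheme, the "'\0' in text" test mentioned above would be enough to tell a cleanly decoded name from one carrying escaped bytes.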

Re: [Python-Dev] Filename as byte string in python 2.6 or 3.0?

2008-09-30 Thread Nick Coghlan
Adam Olsen wrote: > On Tue, Sep 30, 2008 at 3:43 PM, Nick Coghlan <[EMAIL PROTECTED]> wrote: >> Of the suggestions I've seen so far, I like Marcin's Mono-inspired >> NULL-escape codec idea the best. Since these strings all come from parts >> of the environment where NULLs are not permitted, a simpl

Re: [Python-Dev] Patch for an initial support of bytes filename in Python3

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 12:07 PM, Simon Cross <[EMAIL PROTECTED]> wrote: > On Tue, Sep 30, 2008 at 7:56 PM, Guido van Rossum <[EMAIL PROTECTED]> wrote: >> (since os.getcwdb() is a Unix-only thing). > > I would be happier if all the Unix byte functions existed on Windows and > fell back to something lik

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Jack Jansen
On 30-Sep-2008, at 23:42 , Martin v. Löwis wrote: It's the other way 'round: On Windows, Unicode file names are the natural choice, and byte strings have limitations. In a sense, Windows got it right - but then, they started later. Unix missed the opportunity of declaring that all file APIs

Re: [Python-Dev] Filename as byte string in python 2.6 or 3.0?

2008-09-30 Thread Adam Olsen
On Tue, Sep 30, 2008 at 3:43 PM, Nick Coghlan <[EMAIL PROTECTED]> wrote: > Guido van Rossum wrote: >> The callback would either be an extra argument to all >> system calls (bad, ugly etc., and why not go with the existing unicode >> encoding and error flags if we're adding extra args?) or would be

Re: [Python-Dev] Patch for an initial support of bytes filename in Python3

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 11:47 AM, <[EMAIL PROTECTED]> wrote: > > On 05:56 pm, [EMAIL PROTECTED] wrote: >> >> On Tue, Sep 30, 2008 at 10:59 AM, <[EMAIL PROTECTED]> wrote: >>> >>> On 02:32 pm, [EMAIL PROTECTED] wrote: > >>> In the absence of a 2.6 getcwdb, perhaps the fixer could just drop the >>>

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread James Y Knight
On Sep 30, 2008, at 5:40 PM, Martin v. Löwis wrote: On Windows, we might reject bytes filenames for all file operations: open(), unlink(), os.path.join(), etc. (raise a TypeError or UnicodeError) Since I've seen no objections to this yet: please no. If we offer a "lower-level" bytes filename

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Marcin 'Qrczak' Kowalczyk
2008/9/30 Glenn Linderman <[EMAIL PROTECTED]>: > So the problem is that a Unicode file system interface can't deal with > non-UTF-8 byte streams as file names. > > So it seems there are four suggested approaches, all of which have aspects > that are inconvenient. Let's not forget what happens whe

Re: [Python-Dev] Filename as byte string in python 2.6 or 3.0?

2008-09-30 Thread Nick Coghlan
Guido van Rossum wrote: > The callback would either be an extra argument to all > system calls (bad, ugly etc., and why not go with the existing unicode > encoding and error flags if we're adding extra args?) or would be > global, where I'd be worried that it might interfere with the proper > opera

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
> Oh, ok. I had assumed Windows just uses a fixed encoding without the problem > of misencoded filenames. It's the other way 'round: On Windows, Unicode file names are the natural choice, and byte strings have limitations. In a sense, Windows got it right - but then, they started later. Unix misse

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
>> On Windows, we might reject bytes filenames for all file operations: open(), >> unlink(), os.path.join(), etc. (raise a TypeError or UnicodeError) > > Since I've seen no objections to this yet: please no. If we offer a > "lower-level" bytes filename API, it should work for all platforms. Unfo

Re: [Python-Dev] Filename as byte string in python 2.6 or 3.0?

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 11:42 AM, <[EMAIL PROTECTED]> wrote: > There are other ways to glean this knowledge; for example, looking at the > 'iocharset' or 'nls' mount options supplied to mount various filesystems. I > thought maybe Python (or some C library call) might be invoking some logic > tha

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 1:29 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > Guido van Rossum wrote: >> However >> the *proposed* behavior (returns bytes if the arg was bytes, and >> returns str when the arg was str) is IMO sane, and no different than >> the polymorphism found in len() or many b

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 1:12 PM, Terry Reedy <[EMAIL PROTECTED]> wrote: > Terry Reedy wrote: >> >> Guido van Rossum wrote: > >>> I'm not sure either way. I've heard it claimed that Windows filesystem >>> APIs use Unicode natively. Does Python 3.0 on Windows currently >>> support filenames expressed a

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 1:04 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > Guido van Rossum wrote: >> On Mon, Sep 29, 2008 at 11:00 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> >> wrote: Change the default file system encoding to store bytes in Unicode is like introducing a new Python

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 12:42 PM, Terry Reedy <[EMAIL PROTECTED]> wrote: > Guido van Rossum wrote: >> >> On Tue, Sep 30, 2008 at 11:13 AM, Georg Brandl <[EMAIL PROTECTED]> wrote: >>> >>> Victor Stinner wrote: On Windows, we might reject bytes filenames for all file operations: open

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
> I'm not sure either way. I've heard it claimed that Windows filesystem > APIs use Unicode natively. Does Python 3.0 on Windows currently > support filenames expressed as bytes? Yes, it does (at least, os.open, os.stat support them, builtin open doesn't). > Are they encoded first before > passing

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
Guido van Rossum wrote: > However > the *proposed* behavior (returns bytes if the arg was bytes, and > returns str when the arg was str) is IMO sane, and no different than > the polymorphism found in len() or many builtin operations. My concern still is that it brings the bytes type into the statu

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
> I didn't get an answer to my question: what is the result characters) stored in unicode> + ? I guess that the result is > instead of raising an error > (invalid types). So again: why introducing a new type instead of reusing > existing Python types? I didn't mean to introduce a new data typ

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Martin v. Löwis
Guido van Rossum wrote: > On Mon, Sep 29, 2008 at 11:00 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: >>> Change the default file system encoding to store bytes in Unicode is like >>> introducing a new Python type: . >> Exactly. Seems like the best solution to me, despite your polemics. > > Mar

Re: [Python-Dev] Patch for an initial support of bytes filename in Python3

2008-09-30 Thread Simon Cross
On Tue, Sep 30, 2008 at 7:56 PM, Guido van Rossum <[EMAIL PROTECTED]> wrote: > (since os.getcwdb() is a Unix-only thing). I would be happier if all the Unix byte functions existed on Windows and fell back to something like encoding the filenames to/from UTF-8. Then at least it would be possible for pr

Re: [Python-Dev] Patch for an initial support of bytes filename in Python3

2008-09-30 Thread glyph
On 05:56 pm, [EMAIL PROTECTED] wrote: On Tue, Sep 30, 2008 at 10:59 AM, <[EMAIL PROTECTED]> wrote: On 02:32 pm, [EMAIL PROTECTED] wrote: In the absence of a 2.6 getcwdb, perhaps the fixer could just drop the "benefit of the doubt" case? It could always be added to 2.7, and the parity relea

Re: [Python-Dev] Filename as byte string in python 2.6 or 3.0?

2008-09-30 Thread glyph
On 06:16 pm, [EMAIL PROTECTED] wrote: On Tue, Sep 30, 2008 at 11:12 AM, <[EMAIL PROTECTED]> wrote: The one thing it doesn't do is expose the decoding rules for the higher- level applications to deal with. I am pretty sure I don't understand how the interaction between filesystem encoding and

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 11:13 AM, Georg Brandl <[EMAIL PROTECTED]> wrote: > Victor Stinner wrote: >> On Windows, we might reject bytes filenames for all file operations: open(), >> unlink(), os.path.join(), etc. (raise a TypeError or UnicodeError) > > Since I've seen no objections to this yet: pl

Re: [Python-Dev] Filename as byte string in python 2.6 or 3.0?

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 11:12 AM, <[EMAIL PROTECTED]> wrote: > The one thing it doesn't do is expose the decoding rules for the higher- > level applications to deal with. I am pretty sure I don't understand how > the interaction between filesystem encoding and user locale works in that > case, t

Re: [Python-Dev] Filename as byte string in python 2.6 or 3.0?

2008-09-30 Thread glyph
On 02:39 pm, [EMAIL PROTECTED] wrote: For example, implementing os.listdir to return the file names as Unicode subclasses with ability to access the underlying bytes (automatically recognized by open and friends) sounds like a good compromise that allows the word processor to both have the cake

Re: [Python-Dev] Patch for an initial support of bytes filename in Python3

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 10:59 AM, <[EMAIL PROTECTED]> wrote: > On 02:32 pm, [EMAIL PROTECTED] wrote: >> If 2.6 weren't pretty much released already I'd ask to add >> os.getcwdb() there, as an alias for os.getcwd(), and add a 2to3 fixer >> that converts os.getcwdu() to os.getcwd(), leaves os.getcwd
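
For reference, the split between `os.getcwd()` and `os.getcwdb()` looks like this in current Python 3 releases. This is a minimal sketch of the released behaviour, assumed here to match the outcome of the discussion; it does not show the 2to3 fixer itself.

```python
import os

cwd_text = os.getcwd()    # current directory as str
cwd_bytes = os.getcwdb()  # the same directory as bytes

print(type(cwd_text).__name__, type(cwd_bytes).__name__)
```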

Re: [Python-Dev] Patch for an initial support of bytes filename in Python3

2008-09-30 Thread glyph
On 02:32 pm, [EMAIL PROTECTED] wrote: On Tue, Sep 30, 2008 at 6:21 AM, <[EMAIL PROTECTED]> wrote: On 12:47 am, [EMAIL PROTECTED] wrote: It sounds like maybe there should be some 2to3 fixers in here somewhere, too? Not necessarily as part of this patch, but somewhere related? I don't know

Re: [Python-Dev] [Python-3000] Patch for an initial support of bytes filename in Python3

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 10:41 AM, Bill Janssen <[EMAIL PROTECTED]> wrote: > Guido van Rossum <[EMAIL PROTECTED]> wrote: >> On Tue, Sep 30, 2008 at 8:47 AM, Bill Janssen <[EMAIL PROTECTED]> wrote: >> > Victor Stinner <[EMAIL PROTECTED]> wrote: >> > >> >> - listdir(unicode) -> only unicode, *skip* i

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Georg Brandl
Guido van Rossum wrote: > On Tue, Sep 30, 2008 at 10:28 AM, Georg Brandl <[EMAIL PROTECTED]> wrote: >>> How can it *regularly* drive you crazy when "the majority of file names >>> [...] encoded correctly" (as you assert above)? >> >> Because Office files are a) often named with long, seemingly des

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread M.-A. Lemburg
On 2008-09-30 18:46, Guido van Rossum wrote: > On Tue, Sep 30, 2008 at 8:20 AM, M.-A. Lemburg <[EMAIL PROTECTED]> wrote: >> In the end, I think it's better not to be clever and just return >> the filenames that cannot be decoded as bytes objects in os.listdir(). > > Unfortunately that's going to b

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 10:28 AM, Georg Brandl <[EMAIL PROTECTED]> wrote: >> How can it *regularly* drive you crazy when "the majority of file names >> [...] encoded correctly" (as you assert above)? > > Because Office files are a) often named with long, seemingly descriptive > filenames, which inva

Re: [Python-Dev] [Python-3000] Patch for an initial support of bytes filename in Python3

2008-09-30 Thread Bill Janssen
Guido van Rossum <[EMAIL PROTECTED]> wrote: > On Tue, Sep 30, 2008 at 8:47 AM, Bill Janssen <[EMAIL PROTECTED]> wrote: > > Victor Stinner <[EMAIL PROTECTED]> wrote: > > > >> - listdir(unicode) -> only unicode, *skip* invalid filenames > >>(as asked by Guido) > > > > Is there an option listdir(

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Georg Brandl
Steven D'Aprano wrote: > On Tue, 30 Sep 2008 11:50:10 pm Guido van Rossum wrote: > >> > To avoid silent skipping, is it possible to drop 'unreadable' >> > names, issue a warning (instead of exception), and continue to >> > completion? "Warning: unreadable filename skipped; see >> > PyWiki/Unread

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Georg Brandl
Guido van Rossum wrote: >> With the filenames decoded by UTF-8, your files named têste, ô, dossié will >> be displayed and handled correctly. The others are *invalid* in the >> filesystem >> encoding UTF-8 and therefore would be represented by something like >> >> u'dir\uXXffname' where XX is s

Re: [Python-Dev] [Python-3000] Patch for an initial support of bytes filename in Python3

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 8:47 AM, Bill Janssen <[EMAIL PROTECTED]> wrote: > Victor Stinner <[EMAIL PROTECTED]> wrote: > >> - listdir(unicode) -> only unicode, *skip* invalid filenames >>(as asked by Guido) > > Is there an option listdir(bytes) which will return *all* filenames (as > byte sequen

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 8:20 AM, M.-A. Lemburg <[EMAIL PROTECTED]> wrote: > In the end, I think it's better not to be clever and just return > the filenames that cannot be decoded as bytes objects in os.listdir(). Unfortunately that's going to break most code that is using os.listdir(), so it's ha

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 7:53 AM, Steven D'Aprano <[EMAIL PROTECTED]> wrote: > On Tue, 30 Sep 2008 11:50:10 pm Guido van Rossum wrote: > >> > To avoid silent skipping, is it possible to drop 'unreadable' >> > names, issue a warning (instead of exception), and continue to >> > completion? "Warning: u

Re: [Python-Dev] Patch for an initial support of bytes filename in Python3

2008-09-30 Thread Bill Janssen
Victor Stinner <[EMAIL PROTECTED]> wrote: > - listdir(unicode) -> only unicode, *skip* invalid filenames >(as asked by Guido) Is there an option listdir(bytes) which will return *all* filenames (as byte sequences)? Otherwise, this seems troubling to me; *something* should be returned for f

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread M.-A. Lemburg
On 2008-09-30 16:05, Guido van Rossum wrote: > On Tue, Sep 30, 2008 at 3:31 AM, M.-A. Lemburg <[EMAIL PROTECTED]> wrote: >> On 2008-09-30 08:00, Martin v. Löwis wrote: Change the default file system encoding to store bytes in Unicode is like introducing a new Python type: . >>> Exactly. S

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Steven D'Aprano
On Tue, 30 Sep 2008 11:50:10 pm Guido van Rossum wrote: > > To avoid silent skipping, is it possible to drop 'unreadable' > > names, issue a warning (instead of exception), and continue to > > completion? "Warning: unreadable filename skipped; see > > PyWiki/UnreadableFilenames" > > That would be

Re: [Python-Dev] Filename as byte string in python 2.6 or 3.0?

2008-09-30 Thread Hrvoje Nikšić
On Tue, 2008-09-30 at 07:26 -0700, Guido van Rossum wrote: > > I am not convinced that a word processor can just ignore files with > > (what it thinks are) undecodable file names. In countries with a > > history of incompatible national encodings, such file names crop up very > > often, sometimes

Re: [Python-Dev] Patch for an initial support of bytes filename in Python3

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 6:21 AM, <[EMAIL PROTECTED]> wrote: > On 12:47 am, [EMAIL PROTECTED] wrote: > > This is the most sane contribution I've seen so far :). Thanks. I'll review it later today (after coffee+breakfast :) and will apply it assuming the code is reasonably sane, otherwise I'll go a

Re: [Python-Dev] when is path==NULL?

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 5:48 AM, Christian Heimes <[EMAIL PROTECTED]> wrote: > Ulrich Eckhardt wrote: >> >> Hi! >> >> I'm looking at trunk/Python/sysmodule.c, function PySys_SetArgv(). In that >> function, there is code like this: >> >> PyObject* path = PySys_GetObject("path"); >> ... >> if (pat

Re: [Python-Dev] Filename as byte string in python 2.6 or 3.0?

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 3:52 AM, Hrvoje Nikšić <[EMAIL PROTECTED]> wrote: > On Tue, 2008-09-30 at 19:45 +1000, Nick Coghlan wrote: >> To my mind, there are two kinds of app in the world when it comes to >> file paths: >> 1) "Normal" apps (e.g. a word processor), that are only interested in >> files

Re: [Python-Dev] Filename as byte string in python 2.6 or 3.0?

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 2:45 AM, Nick Coghlan <[EMAIL PROTECTED]> wrote: > Adam Olsen wrote: >> Lossy conversion just moves around what gets treated as garbage. As >> all valid unicode scalars can be round tripped, there's no way to >> create a valid unicode file name without being lossy. The alt

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Victor Stinner
On Tuesday 30 September 2008 15:53:09, Guido van Rossum wrote: > On Mon, Sep 29, 2008 at 11:00 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > >> Change the default file system encoding to store bytes in Unicode is > >> like introducing a new Python type: . > > > > Exactly. Seems lik

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Tue, Sep 30, 2008 at 3:31 AM, M.-A. Lemburg <[EMAIL PROTECTED]> wrote: > On 2008-09-30 08:00, Martin v. Löwis wrote: >>> Change the default file system encoding to store bytes in Unicode is like >>> introducing a new Python type: . >> >> Exactly. Seems like the best solution to me, despite your

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Mon, Sep 29, 2008 at 11:22 PM, Georg Brandl <[EMAIL PROTECTED]> wrote: > No, that was not what I meant (although it is another possibility). As I > wrote, > Martin's proposal that I support here is using the modified UTF-8 codec that > successfully roundtrips otherwise invalid UTF-8 data. I th

Re: [Python-Dev] Patch for an initial support of bytes filename in Python3

2008-09-30 Thread Victor Stinner
Hi, > This is the most sane contribution I've seen so far :). Oh thanks. > Do I understand properly that (listdir(bytes) -> bytes)? Yes, os.listdir(bytes)->bytes. It's already the current behaviour. But with Python3 trunk, os.listdir(str) -> str ... or bytes (if unicode conversion fails). >
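
The `os.listdir(bytes) -> bytes` behaviour described above can be tried with a short sketch. It uses a throwaway temporary directory and a made-up file name, and it reflects current Python 3 releases, not necessarily the exact state of the patch.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one file with a hypothetical name.
    open(os.path.join(tmp, "example.txt"), "w").close()

    str_names = os.listdir(tmp)                 # str in, str out
    bytes_names = os.listdir(os.fsencode(tmp))  # bytes in, bytes out

print(str_names, bytes_names)
```

The type of the argument selects the type of the result, so a caller who needs every name, decodable or not, can ask for bytes.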

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Mon, Sep 29, 2008 at 11:00 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: >> Change the default file system encoding to store bytes in Unicode is like >> introducing a new Python type: . > > Exactly. Seems like the best solution to me, despite your polemics. Martin, I don't understand why you

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Guido van Rossum
On Mon, Sep 29, 2008 at 8:55 PM, Terry Reedy <[EMAIL PROTECTED]> wrote: > >> On Monday 29 September 2008 19:06:01, Guido van Rossum wrote: > >>> I know I keep flipflopping on this one, but the more I think about it >>> the more I believe it is better to drop those names than to raise an

Re: [Python-Dev] Patch for an initial support of bytes filename in Python3

2008-09-30 Thread glyph
On 12:47 am, [EMAIL PROTECTED] wrote: This is the most sane contribution I've seen so far :). See attached patch: python3_bytes_filename.patch Using the patch, you will get: - open() support bytes - listdir(unicode) -> only unicode, *skip* invalid filenames (as asked by Guido) Forgive me fo

Re: [Python-Dev] when is path==NULL?

2008-09-30 Thread Thomas Lee
Ulrich Eckhardt wrote: Hi! I'm looking at trunk/Python/sysmodule.c, function PySys_SetArgv(). In that function, there is code like this: PyObject* path = PySys_GetObject("path"); ... if (path != NULL) { ... } My intuition says that if path==NULL, something is very wrong. At least

Re: [Python-Dev] when is path==NULL?

2008-09-30 Thread Christian Heimes
Ulrich Eckhardt wrote: Hi! I'm looking at trunk/Python/sysmodule.c, function PySys_SetArgv(). In that function, there is code like this: PyObject* path = PySys_GetObject("path"); ... if (path != NULL) { ... } My intuition says that if path==NULL, something is very wrong. At least

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Adam Olsen
On Tue, Sep 30, 2008 at 5:24 AM, Stephen J. Turnbull <[EMAIL PROTECTED]> wrote: > Adam Olsen writes: > > > [1] You could argue that Unicode should add new scalars to handle all > > currently invalid UTF-8 sequences. > > AFAIK there are about 2^31 of these, though! They've promised to never alloc

[Python-Dev] when is path==NULL?

2008-09-30 Thread Ulrich Eckhardt
Hi! I'm looking at trunk/Python/sysmodule.c, function PySys_SetArgv(). In that function, there is code like this: PyObject* path = PySys_GetObject("path"); ... if (path != NULL) { ... } My intuition says that if path==NULL, something is very wrong. At least I would expect to get 'N

Re: [Python-Dev] Python security team

2008-09-30 Thread Steve Holden
Jan Mate wrote: > Guido van Rossum wrote: [...] >> know you personally -- but perhaps other current members of the PSRT >> do and that could be enough to secure an invitation. > > No, i don't think that i'm known well enough to earn the invitation > (yet), this was more of a "so how the hell d

Re: [Python-Dev] Python security team

2008-09-30 Thread jek <[EMAIL PROTECTED]>
Guido van Rossum wrote: > I think we may have to expand our selection criteria, since the > existing approach has led to a small PSRT whose members are all too > busy to do the necessary legwork. At the same time we need to remain > selective -- I

Re: [Python-Dev] Filename as byte string in python 2.6 or 3.0?

2008-09-30 Thread Hrvoje Nikšić
On Tue, 2008-09-30 at 19:45 +1000, Nick Coghlan wrote: > To my mind, there are two kinds of app in the world when it comes to > file paths: > 1) "Normal" apps (e.g. a word processor), that are only interested in > files with sane, well-formed file names that can be properly decoded to > Unicode wit

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread Stephen J. Turnbull
Adam Olsen writes: > [1] You could argue that Unicode should add new scalars to handle all > currently invalid UTF-8 sequences. AFAIK there are about 2^31 of these, though!

Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

2008-09-30 Thread M.-A. Lemburg
On 2008-09-30 08:00, Martin v. Löwis wrote: >> Change the default file system encoding to store bytes in Unicode is like >> introducing a new Python type: . > > Exactly. Seems like the best solution to me, despite your polemics. Not a bad idea... have os.listdir() return Unicode subclasses that

Re: [Python-Dev] Filename as byte string in python 2.6 or 3.0?

2008-09-30 Thread Nick Coghlan
Adam Olsen wrote: > Lossy conversion just moves around what gets treated as garbage. As > all valid unicode scalars can be round tripped, there's no way to > create a valid unicode file name without being lossy. The alternative > is not be valid unicode, but since we can't use such objects with >

Re: [Python-Dev] Status of MS Windows CE port

2008-09-30 Thread Ulrich Eckhardt
On Tuesday 30 September 2008, Martin v. Löwis wrote: > Ulrich Eckhardt wrote: > >>> Well, currently it does make a difference. Simple example: > >>> CreateFile(). > >> > >> It's not so simple: Python doesn't actually call CreateFile > > > > Martin, CreateFile() was just used as an example. You can