On Wed, 1 Oct 2008 09:21:37 am you wrote:
> On Tue, Sep 30, 2008 at 4:08 PM, Steven D'Aprano <[EMAIL PROTECTED]> wrote:
> > On Wed, 1 Oct 2008 07:40:01 am Martin v. Löwis wrote:
> >> >> On Windows, we might reject bytes filenames for all file
> >> >> operations: open(), unlink(), os.path.join(), etc. (raise a
> >> >> TypeError or UnicodeError)
> >> >
> >> > Since I've seen no objections to this yet: please no. If we
> >> > offer a "lower-level" bytes filename API, it should work for all
> >> > platforms.
> >>
> >> Unfortunately, it can't. You cannot represent all possible file
> >> names in a byte string in Windows (just as you can't do so in a
> >> Unicode string on Unix).
> >
> > Sorry, maybe I'm just being thick here, but I don't understand how
> > that is possible. On the physical disk, each Windows file name must
> > be represented by a byte string, yes? So how is it possible that
> > there are Windows files with names that can't be represented as a
> > byte string? What have I missed?
>
> I believe on disk it uses UTF-16.

Which is made up of bytes. There may be byte sequences that are illegal
UTF-16, but that's not what Martin said. I don't understand how there
can be UTF-16 sequences which don't correspond to some sequence of
bytes. How would they be represented in memory? Is this to do with the
endianness of the UTF-16 sequence?

-- 
Steven
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
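
To make the question concrete, here is a minimal Python 3 sketch of the
points raised in the reply: a string encoded as UTF-16 is just a sequence
of bytes, the byte order changes which bytes you get, and some byte
sequences are not legal UTF-16. The file name and the helper
is_legal_utf16_le() are made up for illustration. The sketch uses only
the str/bytes codecs, not the Windows filename APIs, so it does not
settle what Martin meant.

```python
# Hypothetical file name, chosen only for illustration.
name = "L\u00f6wis.txt"

# A str of ordinary code points always encodes to some bytes in UTF-16.
le = name.encode("utf-16-le")  # little-endian byte order
be = name.encode("utf-16-be")  # big-endian byte order

# Each byte sequence decodes back to the same name with the matching codec.
roundtrip_le = le.decode("utf-16-le")
roundtrip_be = be.decode("utf-16-be")


def is_legal_utf16_le(data: bytes) -> bool:
    """Return True if data decodes as little-endian UTF-16 (hypothetical helper)."""
    try:
        data.decode("utf-16-le")
    except UnicodeDecodeError:
        return False
    return True


# A high surrogate (0xD800) followed by 'A' instead of a low surrogate:
# a byte sequence that is not legal UTF-16.
bad = b"\x00\xd8\x41\x00"

print(le != be)                # the two byte orders give different bytes
print(is_legal_utf16_le(le))   # the encoded name is legal UTF-16
print(is_legal_utf16_le(bad))  # the unpaired surrogate is not
```

So endianness changes which bytes represent a name, but in both byte
orders the name is still a byte sequence.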