On Thu, Sep 16, 2010 at 11:16 AM, Toshio Kuratomi <a.bad...@gmail.com> wrote:
> On Thu, Sep 16, 2010 at 10:56:56AM -0700, Guido van Rossum wrote:
>> On Thu, Sep 16, 2010 at 10:46 AM, Martin (gzlist) <gzl...@googlemail.com> wrote:
>> > On 16/09/2010, Guido van Rossum <gu...@python.org> wrote:
>> >>
>> >> In all cases I can imagine where such polymorphic functions make
>> >> sense, the necessary and sufficient assumption should be that the
>> >> encoding is a superset of 7-bit(*) ASCII. This includes UTF-8, all
>> >> Latin-N variants, and AFAIK also the popular CJK encodings other than
>> >> UTF-16. This is the same assumption made by Python's bytes type when
>> >> you use "character-based" methods like lower().
>> >
>> > Well, it depends on what exactly you're doing; it's pretty easy to go wrong:
>> >
>> > Python 3.2a2+ (py3k, Sep 16 2010, 18:43:45) [MSC v.1500 32 bit (Intel)] on win32
>> > Type "help", "copyright", "credits" or "license" for more information.
>> > >>> import os, sys
>> > >>> os.path.split("C:\\十")
>> > ('C:\\', '十')
>> > >>> os.path.split("C:\\十".encode(sys.getfilesystemencoding()))
>> > (b'C:\\\x8f', b'')
>> >
>> > Similar things can catch out web developers once they step outside the
>> > percent encoding.
>>
>> Well, that character is not 7-bit ASCII. Of course things will go
>> wrong there. That's the whole point of what I said, isn't it?
>>
> You were talking about encodings that were supersets of 7-bit ASCII.
> I think Martin was demonstrating a byte string, in an encoding that is a
> superset of 7-bit ASCII, being fed to a stdlib function which went wrong.

Whoops, sorry. I don't have access to Windows, so I can't reproduce
this. I also don't understand it. What is the Unicode code point
for that 十 character? What is sys.getfilesystemencoding()? What is the
value of "C:\\十".encode(sys.getfilesystemencoding())?
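
For readers without a Windows machine, these questions can be explored on
any platform with a small sketch. It assumes the filesystem encoding in the
quoted session was cp932 (Microsoft's Shift JIS variant); the thread does not
state this, so the codec name is an assumption. ntpath is the standard-library
module that os.path resolves to on Windows.

```python
import ntpath  # Windows path rules, importable on any platform

ch = "十"
path = "C:\\" + ch

# Assumption: "cp932" stands in for sys.getfilesystemencoding() as it might
# be reported on a Japanese-locale Windows install of that era.
encoded = path.encode("cp932")

print(hex(ord(ch)))           # Unicode code point of the character
print(encoded)                # raw bytes handed to the path function
print(ntpath.split(path))     # str input: split on real separators only
print(ntpath.split(encoded))  # bytes input: split on every 0x5C byte
```

If the assumption holds, the second byte of the encoded character is 0x5C,
the ASCII backslash, so a byte-oriented split sees a trailing path separator.
That matches the (b'C:\\\x8f', b'') result quoted above.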

-- 
--Guido van Rossum (python.org/~guido)
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev