Re: Problem with -3 switch

Christian Heimes Mon, 12 Jan 2009 05:07:55 -0800

>> Perhaps you also like to hear from a developer who has worked on Python
>> 3.0 itself and who has done lots of work with internationalized
>> applications. If you want to get it right you must
>>
>> * decode incoming text data to unicode as early as possible
>> * use unicode for all internal text data
>> * encode outgoing unicode as late as possible.
>>
>> where incoming data is read from the file system, database, network etc.
>>
>> This rule applies not only to Python 3.0 but to *any* application
>> written in *any* language.
> 
> The above is a story with which I'm quite familiar. However it is
> *not* the issue!! The issue is why would anyone propose changing a
> string constant "foo" in working 2.x code to u"foo"?


Do I really have to repeat "use unicode for all internal text data"?
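
For illustration, the three-step rule quoted above might look roughly
like this. It is only a sketch in Python 3 syntax; the function name
process, its parameters, and the choice of UTF-8 are assumptions made
up for this example, not anything from real code.

```python
def process(incoming: bytes, encoding: str = "utf-8") -> bytes:
    """Hypothetical helper: decode early, work on text, encode late."""
    # 1. Decode incoming bytes to text as early as possible.
    text = incoming.decode(encoding)
    # 2. Do all internal work on text, never on raw bytes.
    result = text.upper()
    # 3. Encode the outgoing text as late as possible.
    return result.encode(encoding)


# Sample input standing in for data read from a file, database or socket.
incoming = "héllo".encode("utf-8")
outgoing = process(incoming)

# The byte count and the character count differ for non-ASCII text,
# which is one reason the internal work should happen on text.
byte_count = len(incoming)
char_count = len(incoming.decode("utf-8"))
```

Only the two boundary steps need to know the encoding; everything in
between sees text.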

"foo" and u"foo" are two totally different things. The former is a byte
sequence "\x66\x6f\x6f" while the latter is the text 'foo'. It just
happens that "foo" and u"foo" are equal in Python 2.x because
"foo".decode("ascii") == u"foo". Python 3.x does it right: b"foo" is
unequal to "foo".
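
A short Python 3 snippet shows the difference. The variable names are
just for this example.

```python
raw = b"foo"    # a byte sequence: 0x66 0x6f 0x6f
text = "foo"    # text (what u"foo" is in Python 2.x)

# Bytes and text never compare equal in Python 3.x.
same_without_decoding = (raw == text)

# They only match after an explicit decode.
same_after_decoding = (raw.decode("ascii") == text)

print(same_without_decoding)  # False
print(same_after_decoding)    # True
```

In Python 2.x the first comparison comes out equal because the byte
string is decoded as ASCII implicitly, which is exactly what hides the
distinction.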

Christian

--
http://mail.python.org/mailman/listinfo/python-list
