Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info>:

> Unicode strings in Python 2 are second class entities.

I don't see that. They form a type just like, say, complex.

> It's not just that people will, in general, take the lazy way and
> write "foo" instead of u"foo" for their strings.

People live with their choices, and I don't see the consequences of that
lazy way as very bad. In fact, I find the lazy use of Unicode strings at
least as scary as the lazy use of byte strings, especially since Python 3
sneaks Unicode to the outer interfaces of the program (files, IPC).

> But it is that the whole Python virtual machine is based on
> byte-strings, not Unicode strings, and u"" strings are bolted on top.

The internal implementation of the VM is free to change as long as the
external semantics stay the same.

> [steve@ando ~]$ python3.3 -c "π = 3.14; print(π+1)"
> 4.140000000000001
> [steve@ando ~]$ python2.7 -c "π = 3.14; print(π+1)"
>   File "<string>", line 1
>     π = 3.14; print(π+1)
>     ^
> SyntaxError: invalid syntax

My native language uses ä and ö, but I don't see any pressing need to
embed those characters in identifiers.

> Python 2 "helpfully" tries to guess what you want when you work with
> bytes-pretending-to-be-strings, and when it guesses right, it's nice, but
> when it guesses wrongly, you'll left with mysterious encoding and
> decoding errors from code that don't appear to involve either. The whole
> thing is a mess.

I can't think of a matching example.


Marko