Michael Torrie <torr...@gmail.com>:
> Unicode can only be encoded to bytes.
> Bytes can only be decoded to unicode.
I don't really like how Unicode is equated with text, or even with
character strings. There is barely any difference in the truth value of
these statements:

    Python strings are ASCII.
    Python strings are Latin-1.
    Python strings are Unicode.

Each of those statements is true as long as you stay within the
respective character set, and ceases to be true once your text contains
characters outside it.

Now, it is true that Python currently limits itself to the 1,114,112
Unicode code points, and it likely won't adopt more characters unless
Unicode does so first. However, text is something more lofty and
abstract than a sequence of Unicode code points. We shouldn't call
strings Unicode any more than we call numbers IEEE or times ISO.

Marko
--
https://mail.python.org/mailman/listinfo/python-list
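For what it's worth, the point about each statement holding only within
its character set is easy to demonstrate at the interpreter; a small
sketch (the sample strings are my own):

```python
# Pure ASCII text encodes under all three interpretations.
s = "hello"
assert s.encode("ascii") == s.encode("latin-1") == s.encode("utf-8")

# U+00E9 is in Latin-1 but not ASCII: the ASCII claim breaks.
s = "caf\u00e9"
try:
    s.encode("ascii")
except UnicodeEncodeError:
    pass  # expected
assert s.encode("latin-1") == b"caf\xe9"

# U+20AC (euro sign) is outside Latin-1: that claim breaks too.
s = "\u20ac"
try:
    s.encode("latin-1")
except UnicodeEncodeError:
    pass  # expected
assert s.encode("utf-8") == b"\xe2\x82\xac"
```

So "Python strings are X" really means "this particular string happens
to fit in X".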