On Thu, May 12, 2011 at 1:58 AM, John Machin <sjmac...@lexicon.net> wrote:
> On Thu, May 12, 2011 4:31 pm, harrismh777 wrote:
>
>> >> So, the UTF-16 UTF-32 is INTERNAL only, for Python
>
> NO. See one of my previous messages. UTF-16 and UTF-32, like UTF-8 are
> encodings for the EXTERNAL representation of Unicode characters in byte
> streams.

Right. *Under the hood*, a narrow build of CPython uses UCS-2 (which is not exactly the same thing as UTF-16, by the way) to represent Unicode strings; a wide build uses UCS-4. However, this is entirely transparent. To the Python programmer, a unicode string is just an abstraction of a sequence of code points. You don't need to think about UCS-2 at all. The only times you need to worry about encodings are when you're encoding unicode characters to byte strings, decoding bytes to unicode characters, or opening a stream in text mode; and in those cases the only encoding that matters is the external one.

--
http://mail.python.org/mailman/listinfo/python-list
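
To make the encode/decode boundary described in the reply concrete, here is a minimal sketch. It assumes Python 3, where `str` is the Unicode string type and `bytes` is the byte string type. The sample text and all variable names are illustrative only, and the in-memory `io.BytesIO` buffer simply stands in for a file or socket.

```python
import io

# A str is an abstract sequence of code points; no encoding is involved yet.
text = "na\u00efve caf\u00e9"
print(len(text))          # 10 code points

# Encoding: code points -> bytes. The external encoding is chosen here.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")
print(len(utf8_bytes))    # 12 bytes (the two accented letters take 2 bytes each)
print(len(utf16_bytes))   # 20 bytes (every code point here takes 2 bytes)

# Decoding: bytes -> code points. You must name the encoding the bytes
# were written in; the result is the same abstract string either way.
assert utf8_bytes.decode("utf-8") == text
assert utf16_bytes.decode("utf-16-le") == text

# Text-mode stream: the encoding argument names the external encoding
# used for the bytes that actually reach the underlying binary stream.
buffer = io.BytesIO()
writer = io.TextIOWrapper(buffer, encoding="utf-8", newline="")
writer.write(text)
writer.flush()
stream_bytes = buffer.getvalue()
assert stream_bytes == utf8_bytes
```

Nothing in the sketch depends on how the interpreter stores the string internally; the only encodings that appear are the ones named explicitly at the byte boundary.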