Ross Ridge wrote:
> It should be obvious that any 8-bit single-byte character set can
> produce byte sequences that are valid in UTF-8.

It is certainly possible to interpret UTF-8 data as if they were in a
specific single-byte encoding. However, the text you then obtain is not
meaningful in any language of the world. So "valid" yes; "meaningful" no.

Therefore, for all practical purposes, 8-bit single-byte character sets
*will not* produce byte sequences that are valid in UTF-8 (although they
could - it just won't happen).

> In fact I can't think of any multi-byte encoding that can't produce
> valid UTF-8 byte sequence.

The same reasoning applies for them.

Regards,
Martin
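
A minimal sketch of the point above, in Python 3. The sample strings and the helper name is_valid_utf8 are illustrative assumptions, not part of the original discussion. It shows ordinary Latin-1 text failing to decode as UTF-8, a contrived Latin-1 string that does decode, and the reverse misreading of UTF-8 data as Latin-1.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Meaningful Latin-1 text (hypothetical sample). The byte 0xE9 for the
# accented letter is a UTF-8 lead byte, but the space that follows it is
# not a continuation byte, so the sequence is not valid UTF-8.
latin1_bytes = "café au lait".encode("latin-1")
meaningful_is_valid = is_valid_utf8(latin1_bytes)

# A contrived Latin-1 string whose bytes (0xC3 0xA9) happen to form a
# valid UTF-8 sequence: "valid" yes, but not text anyone would write.
accidental_bytes = "Ã©".encode("latin-1")
accidental_is_valid = is_valid_utf8(accidental_bytes)
accidental_decoded = accidental_bytes.decode("utf-8")

# The reverse direction: UTF-8 data read as Latin-1 always decodes,
# but the result is mojibake rather than meaningful text.
mojibake = "café".encode("utf-8").decode("latin-1")

print(meaningful_is_valid, accidental_is_valid, accidental_decoded, mojibake)
```

Pure ASCII text is the trivial exception, since ASCII bytes are identical in Latin-1 and UTF-8; the argument concerns text that actually uses the upper half of the 8-bit range.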