Re: recycling internationalized garbage

Martin v. Löwis Wed, 15 Mar 2006 23:46:00 -0800

Ross Ridge wrote:
> It should be obvious that any 8-bit single-byte character set can 
> produce byte sequences that are valid in UTF-8.


It is certainly possible to interpret UTF-8 data as if they were
in a specific single-byte encoding. However, the text you then
obtain is not meaningful in any language of the world.

So "valid", yes; "meaningful", no. Therefore, for all practical
purposes, meaningful text in an 8-bit single-byte character set
*will not* produce byte sequences that are valid in UTF-8 (it
could, in principle, but it just won't happen).
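
To illustrate the distinction, here is a minimal sketch, assuming Python 3
(where str and bytes are separate types). The helper name is_valid_utf8 and
the sample phrase are made up for this example; they come from neither the
thread nor any library.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Meaningful French text stored as Latin-1. The byte 0xE9 ("é") would start
# a three-byte UTF-8 sequence, but the bytes after it are plain ASCII, not
# continuation bytes, so the data is not valid UTF-8.
latin1_bytes = "café au lait".encode("latin-1")
print(is_valid_utf8(latin1_bytes))

# The other direction: UTF-8 data read as Latin-1 decodes without error,
# but "é" (bytes 0xC3 0xA9) turns into the two characters "Ã©".
utf8_bytes = "café au lait".encode("utf-8")
misread = utf8_bytes.decode("latin-1")
print(misread)

# The "could happen" case: the Latin-1 text "Ã©" is valid UTF-8 by accident,
# but nobody writes that on purpose.
accidental = "Ã©".encode("latin-1")
print(is_valid_utf8(accidental))
```

The first check fails, the misread string is mojibake, and only the
contrived third input passes.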

> In fact I can't think of any multi-byte encoding that can't produce
> valid UTF-8 byte sequence.

The same reasoning applies to them.
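
A similar check for two multi-byte encodings, again only as a sketch under
the assumption of Python 3 and its bundled shift_jis and euc_jp codecs. The
function name looks_like_utf8 and the sample word are chosen for this
example only.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Ordinary Japanese text (the word for "Japanese language") encoded in two
# legacy multi-byte encodings, plus UTF-8 itself for comparison.
sample = "日本語"
results = {
    codec: looks_like_utf8(sample.encode(codec))
    for codec in ("shift_jis", "euc_jp", "utf-8")
}
print(results)
```

For this sample, only the UTF-8 encoding passes the check; the Shift_JIS
and EUC-JP byte sequences are rejected by the UTF-8 decoder.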

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list
