On 02/01/2014 09:32 AM, Eli Zaretskii wrote:
>> What if a sequence of bytes intended to encode ISO-8859-1 characters
>> happens to correctly represent UTF-8 characters?
> This cannot happen, by virtue of the UTF-8 definition and the fact
> that ISO-8859-1 is a single-byte encoding.
> Except for ASCII characters, that is.
I don't believe that is correct.
Imagine the 2-byte sequence 110xxxxx 10yyyyyy. In UTF-8 that encodes
the single character with code point xxxxxyyyyyy, while in ISO-8859-1
it is a valid 2-character sequence, since every byte value maps to a
character.
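
A quick demonstration (a sketch in Python; the concrete byte values
are my own example, not from the message above):

    # 0xC3 0xA9 == 11000011 10101001: a well-formed UTF-8 sequence,
    # and also two valid ISO-8859-1 characters.
    data = bytes([0xC3, 0xA9])
    print(data.decode("utf-8"))       # 'é'  -- one character, U+00E9
    print(data.decode("iso-8859-1"))  # 'Ã©' -- two characters

This is the familiar mojibake pattern: text encoded as UTF-8 but
displayed as ISO-8859-1 (or vice versa) decodes without error either
way, just to different characters.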
--
--Per Bothner
[email protected] http://per.bothner.com/