Stephen,
Only the 7-bit ASCII characters (0x00 - 0x7F) have the same byte values in
UTF-8. UTF-8 encodes characters so that you can tell how many bytes a
character will take from the value of its first byte:
0x00 - 0x7F = 1 byte
0xC0 - 0xDF = 2 bytes
0xE0 - 0xEF = 3 bytes
0xF0 - 0xF7 = 4 bytes
0x80 - 0xBF is u
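
As a rough illustration of the lead-byte ranges listed above, here is a minimal Python sketch. The function name `utf8_sequence_length` is made up for this example. It follows the ranges exactly as given in the message and does no validation beyond that: the current UTF-8 definition (RFC 3629) additionally rules out 0xC0, 0xC1 and 0xF5 - 0xF7 as first bytes, which this sketch does not check.

```python
def utf8_sequence_length(first_byte: int) -> int:
    """Return the byte length implied by the first byte of a UTF-8 character.

    Returns 0 when the byte cannot start a character under the ranges
    quoted in the message (hypothetical helper, illustration only).
    """
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("first_byte must be in the range 0x00-0xFF")
    if first_byte <= 0x7F:
        return 1  # 7-bit ASCII, encoded as a single byte
    if 0xC0 <= first_byte <= 0xDF:
        return 2
    if 0xE0 <= first_byte <= 0xEF:
        return 3
    if 0xF0 <= first_byte <= 0xF7:
        return 4
    # 0x80-0xBF are continuation bytes and never start a character;
    # 0xF8-0xFF are not covered by the ranges in the message.
    return 0


# Compare the prediction from the first byte with the real encoded length.
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}", encoded.hex(" "), utf8_sequence_length(encoded[0]))
```

For each sample character, the length predicted from the first byte should match the length of the encoded byte string.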
At 11:52 01/07/10 +0100, Stephen Cowe - Sun Scotland wrote:
>Hi Unicoders,
>
>I am new to the list and would be really grateful if you could help me out
>here.
>
>I am trying to discover if the "extended latin" 8-bit ascii (decimal
>values 128-255, Hex A0-FF), i.e. ISO-8859-1 are supported by UTF
Hi Stephen,
The short answer to your question is "no". The characters between U+0080 and
U+00FF *are* supported by UTF-8 (all Unicode characters are supported by
UTF-8), but they are not encoded with the same byte values as Latin-1. If
they used the same byte values as Latin-1, they would *be* Latin-1 and
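
To make the difference for the characters between U+0080 and U+00FF concrete, here is a small Python sketch using only the standard `str.encode` and `bytes.decode` methods. The helper names `compare_encodings` and `is_valid_utf8` are invented for this example; it is an illustration, not part of the original reply.

```python
def compare_encodings(ch: str) -> tuple[bytes, bytes]:
    """Return (Latin-1 bytes, UTF-8 bytes) for one character (hypothetical helper)."""
    return ch.encode("latin-1"), ch.encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# U+00E9 (e with acute) is one byte in Latin-1 but two bytes in UTF-8.
latin1_bytes, utf8_bytes = compare_encodings("\u00e9")
print(latin1_bytes.hex(" "), "vs", utf8_bytes.hex(" "))

# The single Latin-1 byte on its own is not well-formed UTF-8.
print(is_valid_utf8(latin1_bytes), is_valid_utf8(utf8_bytes))

# A 7-bit ASCII character is the same single byte in both encodings.
print(compare_encodings("A"))
```

So Latin-1 text that contains any byte above 0x7F has to be converted before it can be read as UTF-8, while pure 7-bit ASCII text passes through unchanged.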