RE: Unicode, UTF-8 and Extended 8-Bit Ascii - Help Needed

2001-07-10 Thread Carl W. Brown
Stephen, Only the 7 bit ASCII characters are the same. UTF-8 encodes characters so that you can tell how many bytes the character will take from the value of the first byte of the character.
0x00 - 0x7F = 1 byte
0xC0 - 0xDF = 2 bytes
0xE0 - 0xEF = 3 bytes
0xF0 - 0xF7 = 4 bytes
0x80 - 0xBF is u
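
The lead-byte ranges listed in this message can be turned into a small lookup. The following is only a sketch in Python: the function name `utf8_sequence_length` is hypothetical, and returning 0 for a byte outside the listed lead-byte ranges is an assumption of this example, not something stated in the post. Note also that the current UTF-8 definition (RFC 3629) never uses 0xC0, 0xC1, or 0xF5 and above in valid data; the ranges below follow the table as posted.

```python
def utf8_sequence_length(first_byte: int) -> int:
    """Return how many bytes a UTF-8 sequence occupies, judged by its first byte.

    Returns 0 when the byte is outside the lead-byte ranges listed in the
    post (an assumption of this sketch, e.g. for bytes 0x80 - 0xBF).
    """
    if 0x00 <= first_byte <= 0x7F:
        return 1  # 7-bit ASCII, identical in ASCII, Latin-1 and UTF-8
    if 0xC0 <= first_byte <= 0xDF:
        return 2
    if 0xE0 <= first_byte <= 0xEF:
        return 3
    if 0xF0 <= first_byte <= 0xF7:
        return 4
    return 0  # not a lead byte according to the table above


# Usage: inspect the first byte of a few encoded characters.
for ch in ("A", "\u00e9", "\u20ac"):
    encoded = ch.encode("utf-8")
    print(repr(ch), encoded.hex(), utf8_sequence_length(encoded[0]))
```

Only the first byte is inspected here; a real decoder would also check that the following bytes are well formed.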

Re: Unicode, UTF-8 and Extended 8-Bit Ascii - Help Needed

2001-07-10 Thread Martin Duerst
At 11:52 01/07/10 +0100, Stephen Cowe - Sun Scotland wrote:
>Hi Unicoders,
>
>I am new to the list and would be really grateful if you could help me out
>here.
>
>I am trying to discover if the "extended latin" 8-bit ascii (decimal
>values 128-255, Hex A0-FF), i.e. ISO-8859-1 are supported by UTF

RE: Unicode, UTF-8 and Extended 8-Bit Ascii - Help Needed

2001-07-10 Thread Addison Phillips [wM]
Hi Stephen, The short answer to your question is "no". The characters between U+0080 and U+00FF *are* supported by UTF-8 (all Unicode characters are supported by UTF-8), but they do not use the same byte values as Latin-1. If they used the same byte values as Latin-1, they would *be* Latin-1 and
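
To make the difference concrete, here is a minimal Python sketch that encodes the same character both ways. The helper name `encode_both` is hypothetical and exists only for this illustration. The character number is the same in Unicode and in ISO-8859-1 (é is U+00E9 and Latin-1 byte 0xE9); what differs is the bytes that end up in the file.

```python
def encode_both(ch: str) -> tuple[bytes, bytes]:
    """Return (Latin-1 bytes, UTF-8 bytes) for a single character."""
    return ch.encode("latin-1"), ch.encode("utf-8")


latin1, utf8 = encode_both("\u00e9")  # é, U+00E9
print(latin1.hex())  # e9    one byte, equal to the code point value
print(utf8.hex())    # c3a9  two bytes, neither equal to 0xE9

# A lone Latin-1 byte above 0x7F is not valid UTF-8 on its own.
try:
    latin1.decode("utf-8")
except UnicodeDecodeError as exc:
    print("not valid UTF-8:", exc.reason)

# 7-bit ASCII is the only range where both encodings agree byte for byte.
print(encode_both("A"))  # (b'A', b'A')
```

This is why Latin-1 data has to be converted rather than relabelled as UTF-8: every byte in the range 0x80 - 0xFF becomes a two-byte sequence.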