In a recent note, Thomas E. Dickey said:

> Date: Mon, 22 Oct 2001 11:11:08 -0400 (EDT)
>
> > > would a 0x00 be legal following the 0x80?  (If not, we could add a check
> > > for that special case).
> > >
> > I defer that one to the Unicode/UTF8/CJK/Big5 experts.  Such a check
> > would be insurance for the multibyte cases, but might leave some
> > breakage for non-ASCII ISO8859 characters.  Would a character with the
> > 0x80 bit set be legal at the end of an ISO8859 string?
>
> I don't think so (0x80 is a control character in ISO 8859).  0x80,0x00
> shouldn't appear embedded in UTF-8 either.
>
The problem is not only with 0x80, but with any character having the
high bit set.  This includes ISO8859 letters with diacriticals and all
EBCDIC letters and digits.
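
To illustrate the distinction, here is a minimal sketch in C of a check
that tests the high bit rather than the single value 0x80.  The names
has_high_bit() and ends_with_high_bit() are hypothetical, made up for
this example; they are not taken from the Lynx sources, and the actual
check under discussion may look quite different.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the Lynx sources): nonzero if the byte
 * has its high (0x80) bit set.  Taking an unsigned char avoids the
 * sign-extension surprises of a plain char on platforms where char is
 * signed.
 */
static int has_high_bit(unsigned char c)
{
    return (c & 0x80) != 0;
}

/*
 * Hypothetical helper: nonzero if the last byte of a NUL-terminated
 * string has its high bit set.  An empty string yields 0.
 */
static int ends_with_high_bit(const char *s)
{
    size_t len = strlen(s);

    if (len == 0)
        return 0;
    return has_high_bit((unsigned char) s[len - 1]);
}
```

With these, 0x80 itself, an ISO 8859-1 e-acute (0xE9), and the EBCDIC
codes for 'A' (0xC1) and '0' (0xF0) all test true, so a special case
written only for 0x80 would miss most of the bytes that matter.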
-- gil
-- StorageTek INFORMATION made POWERFUL
