sorry for the me too post, but this has been a major pet peeve of mine for a long time. 16-bit Unicode should be gotten rid of, being the worst of both worlds: not backwards compatible with ASCII, endianness issues, and no constant-length encoding. UTF-8 externally and UTF-32 when working with individual characters is the way to go.
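To make the "worst of both worlds" point concrete, here is a minimal sketch in Haskell (the helper names `utf8Len` and `utf16Len` are hypothetical, just for illustration): UTF-8 keeps ASCII at one byte, UTF-32 is always one unit, but UTF-16 is neither ASCII-compatible nor fixed-length, since anything above U+FFFF needs a surrogate pair.

```haskell
import Data.Char (ord)

-- Hypothetical helper: number of bytes needed to encode
-- a code point in UTF-8 (1 byte for ASCII, up to 4 bytes).
utf8Len :: Char -> Int
utf8Len c
  | n <= 0x7F   = 1
  | n <= 0x7FF  = 2
  | n <= 0xFFFF = 3
  | otherwise   = 4
  where n = ord c

-- Hypothetical helper: number of 16-bit units in UTF-16;
-- code points above U+FFFF need a surrogate pair.
utf16Len :: Char -> Int
utf16Len c = if ord c > 0xFFFF then 2 else 1

main :: IO ()
main = do
  print (utf8Len 'a', utf16Len 'a')              -- (1,1): ASCII is 1 byte in UTF-8
  print (utf8Len '\x1D11E', utf16Len '\x1D11E')  -- (4,2): U+1D11E needs a surrogate pair
```

So even in UTF-16 you cannot index characters by array position, which removes the one advantage a 16-bit encoding was supposed to have.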
seeing as how the Haskell standard is horribly vague when it comes to character set encodings anyway, I would recommend that we just omit any reference to the bit size of Char, and say abstractly that each Char represents one Unicode character, but that the entire range of Unicode is not guaranteed to be expressible. This must be true in any case, since Haskell 98 implementations can be written now, but Unicode can change in the future. The only range guaranteed to be expressible in any representation would be the values 0-127, US ASCII (or perhaps Latin-1).

        John

On Sun, Sep 30, 2001 at 02:29:40PM +0000, Marcin 'Qrczak' Kowalczyk wrote:
> IMHO it would have been better to not invent UTF-16 at all and use
> UTF-8 in parallel with UTF-32. But Unicode used to promote UTF-16 as
> the real Unicode, and now it causes so many threads on Unicode list
> to clear the confusion about the nature of characters above U+FFFF.

-- 
---------------------------------------------------------------------------
John Meacham - California Institute of Technology, Alum. - [EMAIL PROTECTED]
---------------------------------------------------------------------------
_______________________________________________
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell