I believe it's more "DHTML" that is the problem.
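To illustrate the string-handling side of this: ECMAScript strings index 16-bit code units, so anything outside the BMP means pairing up surrogate halves by hand. A minimal sketch in TypeScript follows. `codePoints` is a hypothetical helper of my own naming, not a standard API, and it only uses `charCodeAt` and `length`.

```typescript
// Hypothetical helper (not a standard API): walk the 16-bit code units of a
// string and combine surrogate pairs into full Unicode code points.
function codePoints(s: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < s.length; i++) {
    const hi = s.charCodeAt(i);
    // A high surrogate followed by a low surrogate encodes one character.
    if (hi >= 0xd800 && hi <= 0xdbff && i + 1 < s.length) {
      const lo = s.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        out.push(0x10000 + ((hi - 0xd800) << 10) + (lo - 0xdc00));
        i++; // skip the low surrogate we just consumed
        continue;
      }
    }
    // BMP character, or a lone surrogate passed through unchanged.
    out.push(hi);
  }
  return out;
}

// "a" followed by U+1D11E MUSICAL SYMBOL G CLEF, written as a surrogate pair.
const sample = "a\uD834\uDD1E";
console.log(sample.length);             // 3 code units
console.log(codePoints(sample).length); // 2 characters
```

The string reports a length of 3 even though it holds 2 characters, and the application has to do the pairing itself.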
DOMString is specified to be UTF-16. The same goes for ECMAScript strings, IIRC, although they may still officially be UCS-2. In practice ECMAScript specifies (and implementations provide) such minimal Unicode support that applications are basically on their own: no canonicalization or character-class primitives (combining, etc.), no way to work with characters rather than UCS-2 codes/surrogate halves, no access to codecs other than UTF-8<->UTF-16 (often buggy and incomplete, and rarely able to deal with errors in any way other than throwing exceptions), and no access to the Unicode names database or the Unihan database.

On 16 Mar 2007 21:59:06 +0000, Colin Paul Adams <[EMAIL PROTECTED]> wrote:
>>>>> "Rich" == Rich Felker <[EMAIL PROTECTED]> writes:

    Rich> UTF-8. There's no good reason for using UTF-16 at all; it's
    Rich> just a bad implementation choice. IIRC either HTML or XML
    Rich> (yes I know they're different but I forget which does it..)

I don't ever recall seeing this in HTML, but it certainly isn't in XML.
The only thing XML has to say on the subject is that XML parsers must be able to read both.
--
Colin Adams
Preston Lancashire

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/