Re: High-Speed UTF-8 to UTF-16 Conversion

Ben Wiley Sittler Fri, 16 Mar 2007 18:17:50 -0800

I believe it's more "DHTML" that is the problem.


DOMString is specified to be UTF-16. Likewise for ECMAScript strings,
IIRC, although they may still be officially UCS-2.

In practice, ECMAScript specifies (and implementations provide) such
minimal Unicode support that applications are basically on their own.
There are no canonicalization or character-class primitives (combining,
etc.), no way to work with characters rather than UCS-2 codes/surrogate
halves, no access to codecs other than UTF-8<->UTF-16 (often buggy and
incomplete, and rarely able to deal with errors in any way other than
throwing exceptions), and no access to the Unicode names database or
the Unihan database.
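
To illustrate the "surrogate halves" point, here is a rough sketch of
the kind of thing an application ends up writing by hand. It is only an
illustration: toCodePoints is a made-up helper name, not part of any
spec or library, and it uses nothing beyond charCodeAt. Passing unpaired
surrogates through unchanged is an assumption of this sketch; a real
application might prefer to substitute U+FFFD or report an error.

```typescript
// Hypothetical helper (not a built-in): turn a string's UTF-16 code units
// into Unicode code points, pairing surrogate halves by hand.
function toCodePoints(s: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < s.length; i++) {
    const hi = s.charCodeAt(i);
    // A high surrogate followed by a low surrogate encodes one code point
    // outside the Basic Multilingual Plane.
    if (hi >= 0xd800 && hi <= 0xdbff && i + 1 < s.length) {
      const lo = s.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        out.push(0x10000 + ((hi - 0xd800) << 10) + (lo - 0xdc00));
        i++; // skip the low half that was just consumed
        continue;
      }
    }
    // BMP character, or an unpaired surrogate (assumption: pass it through).
    out.push(hi);
  }
  return out;
}

// U+1D11E MUSICAL SYMBOL G CLEF, written as its two surrogate halves.
const clef = "\uD834\uDD1E";
const units = clef.length;          // counts code units, not characters
const points = toCodePoints(clef);  // one entry for the one character
```

The string's length property counts the two code units, while the
helper recovers the single character they encode.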

On 16 Mar 2007 21:59:06 +0000, Colin Paul Adams
<[EMAIL PROTECTED]> wrote:

>>>>> "Rich" == Rich Felker <[EMAIL PROTECTED]> writes:

    Rich> UTF-8. There's no good reason for using UTF-16 at all; it's
    Rich> just a bad implementation choice. IIRC either HTML or XML
    Rich> (yes I know they're different but I forget which does it..)

I don't ever recall seeing this in HTML, but it certainly isn't in
XML.
The only thing XML has to say on the subject is that XML parsers must
be able to read both.
--
Colin Adams
Preston Lancashire

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/


