On Sat, Mar 17, 2007 at 06:25:59PM +0600, Christopher Fynn wrote:
> Colin Paul Adams wrote:
>
> >>>>>> "Rich" == Rich Felker <[EMAIL PROTECTED]> writes:
>
> > Rich> Indeed, this was what I was thinking of. Thanks for
> > Rich> clarifying. BTW, any idea WHY they brought the UTF-16
> > Rich> nonsense to DOM/DHTML/etc.?
>
> > I don't know for certain, but I can speculate well, I think.
>
> > DOM was a micros**t invention (and how it shows!). NT was UCS-2
> > (effectively).
>
> AFAIK Unicode was originally only planned to be a 16-bit encoding.
> The Unicode Consortium and ISO 10646 then agreed to synchronize the
> two standards - though originally Unicode was only going to be a
> 16-bit subset of the UCS. A little after that, Unicode decided to
> support UCS characters beyond plane 0.
>
> Anyway, at the time NT was being designed (late eighties), Unicode
> was supposed to be limited to < 65536 characters and UTF-8 hadn't
> been thought of, so 16 bits probably seemed like a good idea.
While this is probably true, it's also beside the point. I wasn't asking
why Windows used UCS-2, but why JavaScript remained stuck on the 16-bit
idea even after the character set expanded -- since JS is a fairly
high-level language and the size of its types is largely irrelevant,
redefining characters to be 32-bit integers shouldn't have broken
anything.
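To make the point concrete, here is a quick sketch of how the 16-bit
choice leaks through the JS string API (codePointAt is an ES2015
addition, so it postdates this thread; everything else is just standard
UTF-16 surrogate behaviour):

    var clef = "\uD834\uDD1E";  // U+1D11E MUSICAL SYMBOL G CLEF: one
                                // character outside plane 0, stored as
                                // a surrogate pair
    console.log(clef.length);                      // 2 -- counts 16-bit code
                                                   // units, not characters
    console.log(clef.charCodeAt(0).toString(16));  // "d834" -- high surrogate
    console.log(clef.charCodeAt(1).toString(16));  // "dd1e" -- low surrogate
    console.log(clef.codePointAt(0).toString(16)); // "1d11e" -- the ES2015
                                                   // retrofit that finally
                                                   // yields the real code point

Had strings been redefined as sequences of 32-bit code points, length
would have been 1 here and the surrogates would never have been visible
at this level.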
Rich

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/