On May 16, 2011, at 4:21 PM, Shawn Steele wrote:

> > Not in my proposal! "\ud800\udc00" === "\u+010000" is false in my proposal.
>
> That's exactly my problem. I think the engines (or at least the applications written in JavaScript) are still UTF-16-centric and that they'll have d800, dc00 === 10000. For example, if they were different, then d800, dc00 should print �� instead of 𐀀; however, I'm reasonably sure that any implementation would end up rendering it as 𐀀.
I think you'll find that the actual JS engines are currently UCS-2 centric. The surrounding browser environments are doing the UTF-16 interpretation. That's why you see 𐀀 instead of �� in browser-generated display output.

> In other words I don't think you can get the engine to be completely UTF-32. At least not without declaring a page as being UTF-32.

I agree that application writers will, for the foreseeable future, have to know whether or not they are dealing with UTF-16 encoded data and/or communicating with other subsystems that expect such data. However, core language support for UTF-32 is a prerequisite for ever moving beyond UTF-16 APIs and libraries and getting back to uniform-sized character processing.

Allen
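The engine/environment split described above can be observed directly in any current implementation. The following sketch (assuming an ordinary browser console or shell engine) shows that string length, indexing, and equality all operate on 16-bit code units, while only the display layer applies the UTF-16 interpretation:

```javascript
// Engine-level (code-unit) view of a supplementary character.
var pair = "\ud800\udc00";        // surrogate pair encoding U+10000

pair.length;                      // 2 -- two code units, not one character
pair.charCodeAt(0).toString(16);  // "d800" -- the lead surrogate
pair === "\ud800\udc00";          // true -- comparison is code-unit-wise

// It is the surrounding environment (e.g. the browser's text renderer)
// that interprets the pair under UTF-16 and displays it as 𐀀.
```

Note that nothing in these operations treats the two code units as a single character; that interpretation happens entirely outside the engine's string semantics.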
_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss

