On May 16, 2011, at 4:21 PM, Shawn Steele wrote:

> > Not in my proposal! "\ud800\udc00" === "\u+010000" is false in my proposal.
>
> That's exactly my problem. I think the engines (or at least the applications written in JavaScript) are still UTF-16-centric and that they'll have d800, dc00 === 10000. For example, if they were different, then d800, dc00 should print �� instead of 𐀀; however, I'm reasonably sure that any implementation would end up rendering it as 𐀀.
I think you'll find that the actual JS engines are currently UCS-2 centric. The surrounding browser environments are doing the UTF-16 interpretation. That's why you see 𐀀 instead of �� in browser-generated display output.

> In other words I don't think you can get the engine to be completely UTF-32. At least not without declaring a page as being UTF-32.

I agree that application writers will, for the foreseeable future, have to know whether or not they are dealing with UTF-16 encoded data and/or communicating with other subsystems that expect such data. However, core language support for UTF-32 is a prerequisite for ever moving beyond UTF-16 APIs and libraries and getting back to uniform-sized character processing.

Allen
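The engine/environment split described above can be observed directly in any current implementation. The following sketch (assuming an ordinary browser console or shell engine) shows that string length, indexing, and equality all operate on 16-bit code units, while only the display layer applies the UTF-16 interpretation:

```javascript
// Engine-level (code-unit) view of a supplementary character.
var pair = "\ud800\udc00";        // surrogate pair encoding U+10000

pair.length;                      // 2 -- two code units, not one character
pair.charCodeAt(0).toString(16);  // "d800" -- the lead surrogate
pair === "\ud800\udc00";          // true -- comparison is code-unit-wise

// It is the surrounding environment (e.g. the browser's text renderer)
// that interprets the pair under UTF-16 and displays it as 𐀀.
```

Note that nothing in these operations treats the two code units as a single character; that interpretation happens entirely outside the engine's string semantics.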
_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss

