In practice, surrogate code points don't really cause problems in Unicode strings. Most implementations just treat them as if they were unassigned. The only important issue is that *when* such strings are converted to UTF-xx for storage or transmission, any unpaired surrogates need to be handled, typically by converting them to U+FFFD (never by just deleting them, which is a bad idea for security).
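
A rough sketch of that conversion step, assuming a JS string of 16-bit code units as input. The function name `replaceLoneSurrogates` is made up for illustration and is not taken from any implementation or from the strawman:

```javascript
// Hypothetical helper: returns a copy of `str` in which every unpaired
// surrogate code unit is replaced by U+FFFD. Nothing is deleted, so the
// number of code units stays the same.
function replaceLoneSurrogates(str) {
  let out = "";
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: well-formed only if a low surrogate follows.
      const next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        out += str[i] + str[i + 1]; // keep the valid pair as-is
        i++;
      } else {
        out += "\ufffd";
      }
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      // Low surrogate with no preceding high surrogate.
      out += "\ufffd";
    } else {
      out += str[i];
    }
  }
  return out;
}
```

Running something like this just before encoding means the encoder only ever sees well-formed input, and because each bad code unit is replaced rather than dropped, two pieces of text that were separated by a stray surrogate cannot silently join up into something a filter would have rejected.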

Mark

*— Il meglio è l’inimico del bene —*

On Mon, May 16, 2011 at 14:46, Boris Zbarsky <[email protected]> wrote:

> On 5/16/11 5:16 PM, Mike Samuel wrote:
>
>> The strawman says
>>
>> "The String type is the set of all finite ordered sequences of zero or
>> more 21-bit unsigned integer values (“elements”)."
>
> Yeah, that's not the same thing as an actual Unicode string, and requires
> handling of all sorts of "what if someone sticks non-Unicode in there?"
> issues...
>
> Of course people actually do use JS strings as immutable arrays of 16-bit
> unsigned integers right now (not just as byte arrays), so I suspect that we
> can't easily exclude the surrogate ranges from "strings" without breaking
> existing content...
>
> -Boris
> _______________________________________________
> es-discuss mailing list
> [email protected]
> https://mail.mozilla.org/listinfo/es-discuss

