Anne van Kesteren <mailto:[email protected]>
September 4, 2013 7:48 AM
> ES6 introduces String.prototype.codePointAt() and
> String.codePointFrom()

String.fromCodePoint, rather.

> as well as an iterator (not defined). It struck
> me this is the only place in the platform where we'd expose code point
> as a concept to developers.
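[For concreteness, the two methods behave as follows; the string iterator, unspecified at the time of this message, was later defined in ES6 to walk code points as well. U+1F4A9 is used here as an arbitrary astral test character.]

```javascript
// codePointAt() reads a whole code point even when it is stored as a
// surrogate pair; fromCodePoint() builds the pair back.
const s = "\uD83D\uDCA9"; // U+1F4A9, stored as two 16-bit code units
console.log(s.codePointAt(0).toString(16));       // "1f4a9"
console.log(String.fromCodePoint(0x1F4A9) === s); // true

// The string iterator also steps by code point, not by code unit:
for (const ch of s) {
  console.log(ch.length); // 2 — one iteration step, two code units
}
```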

> Nowadays strings are either 16-bit code units (JavaScript, DOM, etc.)
> or Unicode scalar values (anytime you hit the network and use utf-8).

> I'm not sure I'm a big fan of having all three concepts around.

You can't avoid it: UTF-8 is a transfer format that can be observed via serialization. String.prototype.charCodeAt and String.fromCharCode are required for backward compatibility. And ES6 wants to expose code points as well, so three.
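[All three views are observable from script; here is a sketch using TextEncoder as one way to observe the UTF-8 serialization — on that path, a lone surrogate cannot survive and comes back as U+FFFD.]

```javascript
// One string, three views: 16-bit code units, code points, and the
// Unicode scalar values you get once it is serialized as UTF-8.
const pile = "\uD83D\uDCA9"; // U+1F4A9 as a surrogate pair

// 16-bit code units (the legacy charCodeAt view):
console.log(pile.charCodeAt(0).toString(16)); // "d83d"
console.log(pile.charCodeAt(1).toString(16)); // "dca9"

// Code points (the ES6 codePointAt view):
console.log(pile.codePointAt(0).toString(16)); // "1f4a9"

// Scalar values, observed via UTF-8 serialization: a *lone* surrogate
// is replaced by U+FFFD, whose UTF-8 bytes are EF BF BD.
const bytes = new TextEncoder().encode("\uD800");
console.log(Array.from(bytes, b => b.toString(16))); // ["ef", "bf", "bd"]
```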

> We could have String.prototype.unicodeAt() and String.unicodeFrom()
> instead, and have them translate lone surrogates into U+FFFD. Lone
> surrogates are a bug and I don't see a reason to expose them in more
> places than just the 16-bit code units.

Sorry, I missed this: how else (other than the charCodeAt/fromCharCode legacy) are lone surrogates exposed?
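[A minimal sketch of what the proposed unicodeAt() could mean — codePointAt() semantics plus U+FFFD replacement. unicodeAt is only a name floated in this thread, not a shipped API.]

```javascript
// Hypothetical unicodeAt(): like codePointAt(), but any surrogate code
// point that comes back unpaired is translated to U+FFFD. (In this
// sketch, indexing into the trail unit of a valid pair also yields
// U+FFFD, since codePointAt() returns the bare trail unit there.)
function unicodeAt(str, pos) {
  const cp = str.codePointAt(pos);
  if (cp === undefined) return undefined; // out of range
  if (cp >= 0xD800 && cp <= 0xDFFF) return 0xFFFD;
  return cp;
}

console.log(unicodeAt("a\uD800b", 1).toString(16));  // "fffd" — lone surrogate
console.log(unicodeAt("\uD83D\uDCA9", 0).toString(16)); // "1f4a9" — valid pair
```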

/be


_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss
