Re: Unicode normalization problem
On Thu, Apr 2, 2015 at 1:39 AM, Andrea Giammarchi andrea.giammar...@gmail.com wrote:

> Jordan, the purpose of `Array.from` is to iterate over the string, and the point of iterating instead of splitting is to get code points automagically. This, unless I've misunderstood Mathias' presentation (might be).
>
> So, here there is a different problem: there are code points that do not correspond to a real visual representation ...

Those are called grapheme clusters or just “graphemes”, as Boris mentioned. And here’s how to deal with them: https://mathiasbynens.be/notes/javascript-unicode#other-grapheme-clusters

“Unicode Standard Annex #29 describes [an algorithm for determining grapheme cluster boundaries](http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries). For a _completely_ accurate solution that works for all Unicode scripts, implement this algorithm in JavaScript, and then count each grapheme cluster as a single symbol.”

> or maybe, the real problem, is about broken `Array.from` polyfill?

`Array.from` just uses `String.prototype[Symbol.iterator]` internally, and that is defined to deal with code points, not grapheme clusters. Either choice would have confused some developers.

IIRC, Perl 6 has built-in capabilities to deal with grapheme clusters, but until ES does, this use case must be addressed in user-land.

___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss
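To make the distinction between splitting, iterating, and grapheme clusters concrete, here is a minimal sketch. The sample strings are illustrative choices, not taken from the thread, and it assumes an ES2015 engine that implements `String.prototype.normalize`.

```javascript
// Illustrative strings (assumptions, not from the thread).
const astral = '\u{1F4A9}';         // one code point outside the BMP
const decomposed = 'n\u0303';       // 'n' + COMBINING TILDE: one grapheme
const noPrecomposed = 'q\u0307\u0323'; // 'q' + two combining dots: one grapheme

// Splitting works on 16-bit code units; iterating works on code points.
const unitsBySplit = astral.split('').length;
const pointsByIteration = Array.from(astral).length;

// Iteration still sees every combining mark as its own code point.
const decomposedPoints = Array.from(decomposed).length;

// NFC normalization helps only when a precomposed form exists.
const composedPoints = Array.from(decomposed.normalize('NFC')).length;
const stillThree = Array.from(noPrecomposed.normalize('NFC')).length;

console.log({ unitsBySplit, pointsByIteration, decomposedPoints, composedPoints, stillThree });
```

Normalization is only a partial workaround here: counting user-perceived symbols in general still needs the UAX #29 segmentation algorithm mentioned above.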
Re: Unicode normalization problem
It was the 90s, when 16 bits seemed enough. Wish we could go back. Even in 1995 this was obviously going to fail, but the die had been cast years earlier in Windows and Java APIs and language/implementation designs.

/be

Claude Pache wrote:

> (So, taking your example, the character is internally represented as a sequence of two 16-bit-units, not “characters”. And, very confusingly, the String methods that contain “char” in their name have nothing to do with “characters”.)
>
> —Claude
Re: Unicode normalization problem
On 2 Apr 2015, at 01:22, Jordan Harband ljh...@gmail.com wrote:

> Unfortunately we don't have a String#codepoints or something that would return the number of code points as opposed to the number of characters (that length returns) - something like that imo would greatly simplify explaining the differences to people.
>
> For the time being, I've been explaining that some characters are actually made up of two, and the character (it's a fun example to use) is an example of two characters combining to make one code point. It's not a quick or trivial thing to explain but people do seem to grasp it eventually.

And when they think they have understood, they are in fact still in great trouble, because they will confuse it with other, unrelated issues like grapheme clusters and/or precomposed characters.

The issue here is specific to the UTF-16 encoding, where some Unicode code points are encoded as a sequence of two 16-bit units; and ES strings are (by an accident of history) sequences of 16-bit units, not Unicode code points.

I think it is important to stress that it is an issue of encoding, at least in order to have a chance to distinguish it from the other aforementioned issues.

(So, taking your example, the character is internally represented as a sequence of two 16-bit-units, not “characters”. And, very confusingly, the String methods that contain “char” in their name have nothing to do with “characters”.)

—Claude
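The encoding point can be illustrated with a short sketch. `countCodePoints` and `toSurrogatePair` are hypothetical helper names (nothing like them is built in), and U+1F4A9 is just an arbitrary supplementary code point chosen for the example.

```javascript
// Hypothetical helper, standing in for the missing String#codepoints:
// the string iterator yields whole code points, so count its steps.
function countCodePoints(str) {
  let count = 0;
  for (const _codePoint of str) count += 1;
  return count;
}

// Encode one supplementary code point (U+10000..U+10FFFF) as the two
// 16-bit units that UTF-16 uses for it.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);   // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  return [high, low];
}

const astral = String.fromCodePoint(0x1F4A9);
const [high, low] = toSurrogatePair(0x1F4A9);

// The "char" methods see 16-bit units, not characters.
const firstUnit = astral.charCodeAt(0);
const secondUnit = astral.charCodeAt(1);

console.log(astral.length, countCodePoints(astral)); // code units vs. code points
console.log(high.toString(16), low.toString(16), firstUnit === high, secondUnit === low);
```

Here `length` and `charCodeAt` report the two 16-bit units, while iteration and `codePointAt(0)` recover the single code point they encode.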
Re: Re-exporting imports and CreateImportBinding assertions
I've added this to a few bugs on the bug tracker:

- https://bugs.ecmascript.org/show_bug.cgi?id=4184 (CreateImportBinding)
- https://bugs.ecmascript.org/show_bug.cgi?id=4244 (GetExportedNames and ResolveExport)

On Wed, Apr 1, 2015 at 4:31 PM, Adam Klein ad...@chromium.org wrote:

> I have a question about CreateImportBinding(N, M, N2) (where N is the name to create in the importing module, and M is a module which exports N2). Step 4 of https://people.mozilla.org/~jorendorff/es6-draft.html#sec-createimportbinding is the following assertion:
>
>     Assert: When M.[[Environment]] is instantiated it will have a direct binding for N2.
>
> What about the case where M is simply re-exporting an import? Consider:
>
> - module 'a': `import { x } from 'b';`
> - module 'b': `import { x } from 'c'; export { x };`
> - module 'c': `export let x = 42;`
>
> In this case, when running CreateImportBinding(x, 'b', x) in module 'a', the assertion fails, as x in 'b' is an immutable indirect binding (itself created by CreateImportBinding).
>
> Is there a need for this assert I'm missing? I don't think skipping over this assert, or removing "direct" from its wording, will cause any problems. Also, the term "direct binding" is not defined anywhere that I can find, except as the negation of the indirect binding created by CreateImportBinding.
>
> Note that there's a similar issue in ResolveExport: step 4.a.i of https://people.mozilla.org/~jorendorff/es6-draft.html#sec-resolveexport asserts that resolved exports found in [[LocalExportEntries]] are "leaf bindings" (another term that goes undefined), where by the usual CS definition of "leaf" the assertion would be false for x in 'b' (when resolved from 'a').
>
> - Adam
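The re-export chain in Adam's example can be modelled with a toy sketch. This is not the spec's CreateImportBinding or ResolveExport; the `modules` table, `isDirect`, and `resolveBinding` are hypothetical names invented here only to show why x in 'b' is not a "direct" (or "leaf") binding.

```javascript
// Toy model (an assumption, not the draft's data structures): each name in a
// module is either a local value or an indirect binding imported from elsewhere.
const modules = {
  a: { bindings: { x: { from: 'b', name: 'x' } } },
  b: { bindings: { x: { from: 'c', name: 'x' } } }, // imported, then re-exported
  c: { bindings: { x: { value: 42 } } },            // the only local binding
};

// A binding is "direct" in this model when it does not point at another module.
function isDirect(moduleName, name) {
  return !('from' in modules[moduleName].bindings[name]);
}

// Follow indirect bindings until a local one is reached.
function resolveBinding(moduleName, name, seen = new Set()) {
  const key = `${moduleName}:${name}`;
  if (seen.has(key)) throw new Error(`circular binding: ${key}`);
  seen.add(key);
  const binding = modules[moduleName].bindings[name];
  if (binding === undefined) throw new Error(`no binding: ${key}`);
  if ('from' in binding) return resolveBinding(binding.from, binding.name, seen);
  return { module: moduleName, name, value: binding.value };
}

const resolved = resolveBinding('b', 'x');
console.log(isDirect('b', 'x'), resolved);
```

In this model an assertion that 'b' holds a direct binding for x would fail, while following the chain still reaches the binding in 'c'. That matches the behaviour the message argues the assert should allow.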