On 28 Jan 2015, at 11:36, Marja Hölttä ma...@chromium.org wrote:
TL;DR: /foo.bar/u.test("foo\uD83Dbar") == ?
The ES6 unicode regexp spec is not very clear regarding what should happen if
the regexp or the matched string contains lonely surrogates (a lead surrogate
without a trail, or a
On 1/28/2015 2:51 PM, André Bargull wrote:
For a reference, here's how Java (tried w/ Oracle 1.8.0_31 and openjdk
1.7.0_65) Pattern.UNICODE_CHARACTER_CLASS works:
foo\uD834bar and foo\uDC00bar match ^foo[^a]bar$ and ^foo.bar$, so,
generally, lonely surrogates match /./.
Backreferences are
Typically, implementation-specific things aren't specified in the spec
(like Math precision, etc) - although usually when it's
implementation-specific, it's explicitly noted as such (
https://people.mozilla.org/~jorendorff/es6-draft.html#sec-date.parse ,
For a reference, here's how Java (tried w/ Oracle 1.8.0_31 and openjdk
1.7.0_65) Pattern.UNICODE_CHARACTER_CLASS works:
foo\uD834bar and foo\uDC00bar match ^foo[^a]bar$ and ^foo.bar$, so,
generally, lonely surrogates match /./.
Backreferences are allowed to consume the leading surrogate of a
I think the cleanest mental model is where UTF-16 or UTF-8 strings are
interpreted as if they were transformed into UTF-32.
While that is generally feasible, it often represents a cost in performance
which is not acceptable in practice. So you see various approaches that
involve some deviation
On 28 January 2015 at 13:14, Claude Pache claude.pa...@gmail.com wrote:
To me, finite is just to be taken in the common mathematical sense of
the term; in particular you could have theoretically a string of length
10^1. But yes, it would be reasonable to restrict oneself to strings of
For a reference, here's how Java (tried w/ Oracle 1.8.0_31 and openjdk
1.7.0_65) Pattern.UNICODE_CHARACTER_CLASS works:
foo\uD834bar and foo\uDC00bar match ^foo[^a]bar$ and ^foo.bar$, so,
generally, lonely surrogates match /./.
Backreferences are allowed to consume the leading surrogate of a
Based on Ex1, looks like the input string is not read as a sequence of code
points when we try to find a match for \1. So it's mostly read as a
sequence of code points except when it's not. :/
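For comparison with the Java behaviour described above, here is a minimal JavaScript sketch of the same backreference case. It assumes an engine that implements the ES6 `u` flag with the semantics discussed in this thread (the input is read as code points, and a valid surrogate pair is never split); the variable names are only illustrative.

```javascript
// The captured group can only be the lone lead surrogate \uD834 between
// "foo" and "bar". After "bar" the input holds the valid pair \uD834\uDC00.
var input = "foo\uD834bar\uD834\uDC00";

// With the u flag the pair is a single code point (U+1D000), so the
// backreference \1 may not consume only its leading half.
var withUnicodeFlag = /foo(.+)bar\1/u.test(input);

// Without the u flag matching works on 16-bit code units, so \1 is free to
// match the first code unit of the pair.
var withoutUnicodeFlag = /foo(.+)bar\1/.test(input);

console.log(withUnicodeFlag, withoutUnicodeFlag);
```

Under those assumptions the first test fails and the second succeeds, which is the difference from the Java result reported above.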
On Wed, Jan 28, 2015 at 3:11 PM, André Bargull andre.barg...@udo.edu
wrote:
On 1/28/2015 2:51 PM,
Hello es-discuss,
TL;DR: /foo.bar/u.test("foo\uD83Dbar") == ?
The ES6 unicode regexp spec is not very clear regarding what should happen
if the regexp or the matched string contains lonely surrogates (a lead
surrogate without a trail, or a trail without a lead). For example, for the
. operator,
On Wed, Jan 28, 2015 at 11:36 AM, Marja Hölttä marja at chromium.org
https://mail.mozilla.org/listinfo/es-discuss wrote:
The ES6 unicode regexp spec is not very clear regarding what should happen
if the regexp or the matched string contains lonely surrogates (a lead
surrogate
Good, that sounds right.
Mark https://google.com/+MarkDavis
*— Il meglio è l’inimico del bene —*
On Wed, Jan 28, 2015 at 12:57 PM, André Bargull andre.barg...@udo.edu
wrote:
On Wed, Jan 28, 2015 at 11:36 AM, Marja Hölttä marja at chromium.org
https://mail.mozilla.org/listinfo/es-discuss
On Wed, Jan 28, 2015 at 11:45 AM, Mathias Bynens math...@qiwi.be wrote:
On 28 Jan 2015, at 11:36, Marja Hölttä ma...@chromium.org wrote:
For example, the current version of Mathias’s ES6 Unicode regular
expression transpiler ( https://mothereff.in/regexpu ) converts /a.b/u
into
Le 28 janv. 2015 à 09:58, Jordan Harband ljh...@gmail.com a écrit :
Typically, implementation-specific things aren't specified in the spec (like
Math precision, etc) - although usually when it's implementation-specific,
it's explicitly noted as such (
On Wed, Jan 28, 2015 at 11:36 AM, Marja Hölttä ma...@chromium.org wrote:
The ES6 unicode regexp spec is not very clear regarding what should happen
if the regexp or the matched string contains lonely surrogates (a lead
surrogate without a trail, or a trail without a lead). For example, for the
Some interesting questions here.
1 - What is a character? Is it a Unicode Code Point?
2 - Should we be able to match all possible JS Strings?
3 - Should we be able to match all possible Unicode Strings?
4 - What do we do if there is a character in a String we cannot match?
5 - Do unmatchable
From: es-discuss [mailto:es-discuss-boun...@mozilla.org] On Behalf Of Jordan
Harband
Strings can't possibly have a length larger than Number.MAX_SAFE_INTEGER -
otherwise, you'd be able to have a string whose length is not a number
representable in JavaScript.
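A small sketch of the arithmetic behind the quoted claim, for readers who want to check it. It only shows where consecutive integers stop being distinguishable as Numbers; it does not say what any engine's actual string length limit is. The variable names are illustrative.

```javascript
var max = Number.MAX_SAFE_INTEGER;

// MAX_SAFE_INTEGER is 2^53 - 1.
var isTwoTo53MinusOne = max === Math.pow(2, 53) - 1;

// Above that point two different integers can round to the same Number
// value, so a length up there could no longer be reported exactly.
var neighboursCollide = max + 1 === max + 2;

console.log(isTwoTo53MinusOne, neighboursCollide);
```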
So? That's a bit inconvenient,
On Jan 28, 2015, at 4:40 PM, John-David Dalton john.david.dal...@gmail.com
wrote:
Kind of a bummer. The isTypedArray example from
https://esdiscuss.org/topic/tostringtag-spoofing-for-null-and-undefined#content-59
is incorrect. Is there an updated reference somewhere?
The toStringTag
Primary issue is in isTypedArray(a):
Uin32Array.prototype.buffer.call(a);
Besides the typos, accessing .buffer throws in at least Chrome and Firefox.
Then .buffer is an object so if it doesn't throw there's no .call to
execute.
-JDD
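One possible way around the two problems described above, as a sketch only and not the example from the linked message: fetch the `buffer` accessor's getter function from the shared typed array prototype and call that, inside a try/catch. It assumes an engine where `buffer` is an accessor property on the prototype that all typed array prototypes inherit from, as the ES6 draft specifies; `typedArrayProto` and `bufferGetter` are names made up here.

```javascript
// Assumption: "buffer" is an accessor on %TypedArray%.prototype, the shared
// prototype that Uint32Array.prototype and its siblings inherit from.
var typedArrayProto = Object.getPrototypeOf(Uint32Array.prototype);
var bufferGetter =
  Object.getOwnPropertyDescriptor(typedArrayProto, "buffer").get;

function isTypedArray(value) {
  try {
    // The getter relies on an internal slot and throws a TypeError when the
    // receiver is not a typed array. Calling the getter function directly
    // avoids reading .buffer off the prototype object itself.
    bufferGetter.call(value);
    return true;
  } catch (e) {
    return false;
  }
}

console.log(isTypedArray(new Uint32Array(2)), isTypedArray([]));
```

Reading `Uint32Array.prototype.buffer` directly runs the getter with the prototype as the receiver, which is why it throws; the descriptor lookup sidesteps that.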
On Wed, Jan 28, 2015 at 4:55 PM, Allen Wirfs-Brock
Strings can't possibly have a length larger than Number.MAX_SAFE_INTEGER -
otherwise, you'd be able to have a string whose length is not a number
representable in JavaScript. So, at the least, I think it would make sense
to define a maximum string length as Number.MAX_SAFE_INTEGER, even if that
To summarize the discussion at today's TC39 meeting:
Given that the style of checks that Allen proposed (
https://esdiscuss.org/topic/tostringtag-spoofing-for-null-and-undefined#content-59
) (using non-side-effecty non-generic methods that rely on internal slots,
in a try/catch) is indeed
Kind of a bummer. The isTypedArray example from
https://esdiscuss.org/topic/tostringtag-spoofing-for-null-and-undefined#content-59
is
incorrect. Is there an updated reference somewhere?
The toStringTag result is handy because it allows checking against several
tags at once without having to invoke
I suppose we could change the spec, but
https://people.mozilla.org/~jorendorff/es6-draft.html#sec-ecmascript-language-types-string-type
requires that "The length of a String is the number of elements (i.e.,
16-bit values) within it." - if the number can't be represented, then it
seems that
On Jan 28, 2015, at 5:03 PM, John-David Dalton john.david.dal...@gmail.com
wrote:
Primary issue is in isTypedArray(a):
Uin32Array.prototype.buffer.call(a);
Besides the typos, accessing .buffer throws in at least Chrome and Firefox.
Then .buffer is an object so if it doesn't throw there's
On Wed, Jan 28, 2015 at 5:44 AM, Andreas Rossberg rossb...@google.com
wrote:
On 28 January 2015 at 13:14, Claude Pache claude.pa...@gmail.com wrote:
To me, finite is just to be taken in the common mathematical sense of
the term; in particular you could have theoretically a string of length
On 1/28/2015 3:36 PM, Marja Hölttä wrote:
Based on Ex1, looks like the input string is not read as a sequence of code
points when we try to
find a match for \1. So it's mostly read as a sequence of code points except
when it's not. :/
Yep, back references are matched as a sequence of code
Le 29 janv. 2015 à 01:49, Jordan Harband ljh...@gmail.com a écrit :
I suppose we could change the spec, but
https://people.mozilla.org/~jorendorff/es6-draft.html#sec-ecmascript-language-types-string-type
requires that The length of a String is the number of elements (i.e.,
16-bit
At the moment that throws too. Anyways it's something to hammer on a bit.
Maybe Jordan can kick it around too.
Thanks,
-JDD
On Wed, Jan 28, 2015 at 5:16 PM, Allen Wirfs-Brock al...@wirfs-brock.com
wrote:
On Jan 28, 2015, at 5:03 PM, John-David Dalton
john.david.dal...@gmail.com wrote:
On Jan 28, 2015, at 5:26 AM, Mark Davis ☕️ m...@macchiato.com wrote:
I think the cleanest mental model is where UTF-16 or UTF-8 strings are
interpreted as if they were transformed into UTF-32.
This is exactly the approach used in the ES6 spec (except that it doesn’t deal
with UTF-8)
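To make the "as if transformed into UTF-32" model concrete, here is a small sketch using ES6 string iteration, which walks a string by code points. The helper name `toCodePoints` is made up for illustration. A valid surrogate pair comes out as one code point, while a lone surrogate comes out as a code point of its own.

```javascript
// Hypothetical helper: view a UTF-16 JS string as a list of code points.
function toCodePoints(str) {
  return Array.from(str, function (ch) {
    return ch.codePointAt(0);
  });
}

// "a", a valid pair (U+1D000), a lone lead surrogate, "b".
var sample = "a\uD834\uDC00\uD834b";
var codePoints = toCodePoints(sample);

// With the u flag, "." consumes one code point at a time, so two dots cover
// the pair and the lone surrogate.
var twoDotsMatch = /^a..b$/u.test(sample);

console.log(sample.length, codePoints.length, twoDotsMatch);
```

The string has five code units but four code points: 0x61, 0x1D000, 0xD834, 0x62.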
Cool, thanks for clarifications!
To make sure, as per the intended semantics, we never allow splitting a
valid surrogate pair (= matching only one of the surrogates but not the
other), and thus we'll differ from the Java implementation here:
/foo(.+)bar\1/u.test("foo\uD834bar\uD834\uDC00"); we say
On Jan 28, 2015, at 2:36 AM, Marja Hölttä ma...@chromium.org wrote:
Hello es-discuss,
TL;DR: /foo.bar/u.test("foo\uD83Dbar") == ?
The ES6 unicode regexp spec is not very clear regarding what should happen if
the regexp or the matched string contains lonely surrogates (a lead surrogate
From: Mark S. Miller [mailto:erig...@google.com]
On Tue, Jan 27, 2015 at 5:53 PM, Boris Zbarsky bzbar...@mit.edu wrote:
I'd like to understand better the suggestion here, because I'm not sure I'm
entirely following it. Specifically, I'd like to understand it in terms of
the internal
On Jan 28, 2015, at 4:54 AM, Wes Garland w...@page.ca wrote:
Some interesting questions here.
These aren't discussion points. These are all things that must have answers
that are directly derivable from the ES6 spec. If, after developing an
adequate understanding of that part of the
Cool, thanks for clarifications!
To make sure, as per the intended semantics, we never allow splitting a
valid surrogate pair (= matching only one of the surrogates but not the
other), and thus we'll differ from the Java implementation here:
/foo(.+)bar\1/u.test("foo\uD834bar\uD834\uDC00"); we
On Wed, Jan 28, 2015 at 8:51 AM, Domenic Denicola d...@domenic.me wrote:
From: Mark S. Miller [mailto:erig...@google.com]
On Tue, Jan 27, 2015 at 5:53 PM, Boris Zbarsky bzbar...@mit.edu wrote:
I'd like to understand better the suggestion here, because I'm not sure
I'm entirely following
On Wed, Jan 28, 2015 at 11:08 AM, Domenic Denicola d...@domenic.me wrote:
From: Mark S. Miller [mailto:erig...@google.com]
In this situation, it will try and succeed. This more closely obeys the
intent in the original code (e.g., the comment in the jQuery code), since
it creates a
From: Mark S. Miller [mailto:erig...@google.com]
In this situation, it will try and succeed. This more closely obeys the
intent in the original code (e.g., the comment in the jQuery code), since it
creates a non-configurable property on the *Window* W. It does not violate
any invariant,
Mark S. Miller wrote:
Exactly correct. What I didn't realize until reading your reply is that
this is all that's necessary -- that it successfully covers all
the cases I was thinking about without any further case division.
Here's another option, not clearly better or worse: