Comment #11 on issue 1972 by [email protected]: Incorrect treatment of
unicode escapes in keywords
https://code.google.com/p/v8/issues/detail?id=1972
Err, meant to say:
Project Member Reported by [email protected], Feb 24, 2012
The spec is not particularly clear on this, but I gather that unicode
escapes in identifier names are supposed to be decoded _before_
distinguishing between keywords and identifiers. That is,
v\u0061r x = 0
eval("v\\u0061r y = 1")
should parse as valid declarations. That's what FF does, and it's also in
line with other languages like Java. V8 rejects these examples with
SyntaxError. JSC seems to be inconsistent and disallows the first but
accepts the second.
Unfortunately, test262 has no tests for this.
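For anyone wanting to reproduce the behaviour in a particular engine, a small probe along these lines may help. It is only a sketch and not part of the original report: `parses` is a hypothetical helper, and it compiles the source as a function body via `new Function` rather than as a top-level script, which is assumed (not guaranteed) to give the same answer for these declarations.

```javascript
// Hypothetical helper: true if the engine parses `src` as a function body,
// false if it throws a SyntaxError while parsing.
function parses(src) {
  try {
    new Function(src);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // anything else is unexpected
  }
}

// Each string contains a literal backslash-u escape, as in the eval examples.
var cases = [
  "v\\u0061r x = 0",   // escaped keyword in declaration position
  "var v\\u0061r = 9", // escaped keyword used as a variable name
  "var x = 0"          // control: plain declaration
];

cases.forEach(function (src) {
  var verdict = parses(src) ? "accepted" : "SyntaxError";
  console.log(JSON.stringify(src) + " -> " + verdict);
});
```

The verdicts for the two escaped cases depend on the engine and its version, so none are listed here.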
Feb 24, 2012 Project Member #1 [email protected]
Conversely, the following should be syntax errors:
var v\u0061r = 9
eval("var v\\u0061r = 9")
FF rejects them, V8 accepts them, introducing a variable named "var".
Feb 24, 2012 Project Member #2 [email protected]
When looking more closely at the spec I totally agree with your
interpretation. Nice catch!
Feb 24, 2012 #3 [email protected]
Keywords cannot contain unicode escapes, so
v\u0061r x = 0
is a syntax error on two fronts.
The sequence "v\u0061r" can only be tokenized as an IdentifierName.
It's not the "var" keyword, since that can't contain escapes. It is
equivalent to the IdentifierName "var".
However, that is not an Identifier, since Identifier is "IdentifierName but
not ReservedWord" and "var" is a reserved word.
So, the only place you can use "v\u0061r" is as an object literal property
name or after a ".", where you can use a plain IdentifierName.
Browsers generally (AFAIR) used to treat "v\u0061r" as an Identifier,
so "var v\u0061r = 9" would work. Some might have changed that.
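The reading above can be sketched as a tiny classifier. This is an illustration of the argument, not V8's scanner: `RESERVED` is a deliberately short, illustrative subset of the reserved words, and `decodeEscapes` and `classify` are hypothetical names that handle only the four-digit `\uXXXX` form.

```javascript
// Illustrative subset of reserved words; the real list is longer.
var RESERVED = ["var", "function", "if", "else", "return", "for", "while"];

// Replace each \uXXXX escape with the character it denotes.
function decodeEscapes(raw) {
  return raw.replace(/\\u([0-9a-fA-F]{4})/g, function (_, hex) {
    return String.fromCharCode(parseInt(hex, 16));
  });
}

// Classify a raw token according to the reading described above.
function classify(raw) {
  var decoded = decodeEscapes(raw);
  if (RESERVED.indexOf(decoded) === -1) return "Identifier";
  // Reserved word spelled literally: a real keyword token.
  if (raw === decoded) return "keyword";
  // Reserved word spelled with escapes: an IdentifierName but not an
  // Identifier, so usable only as a property name or after ".".
  return "IdentifierName only";
}

console.log(classify("var"));        // keyword
console.log(classify("v\\u0061r"));  // IdentifierName only
console.log(classify("v\\u0061l"));  // Identifier (decodes to "val")
```

Under this reading, "v\u0061r x = 0" fails because the first token is not the keyword, and "var v\u0061r = 9" fails because the second token is not an Identifier.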
Feb 24, 2012 Project Member #4 [email protected]
Well, at least as far as the language in the spec is concerned, I don't
think this is clear at all. I filed a bug against the spec and test262, so
that this can get clarified.
Feb 24, 2012 #5 lassern
I absolutely agree that it's not clear :)
I read the "var" in the production for variable declarations as
a "terminal" (5.1.6), which says that it must occur in the source
exactly as written (no escapes).
But then the lists of keywords, for example, are also given as terminals,
yet they are really sets of strings against which identifier names are
compared after escape resolution.
(FWIW, the ES3 spec wasn't any better).
Oct 8, 2014 Project Member #7 [email protected]
From ES6 draft 11.6
Unicode escape sequences are permitted in an IdentifierName, where they
contribute a single Unicode code point to the IdentifierName. The code
point is expressed by the HexDigits of the UnicodeEscapeSequence (see
11.8.4). The \ preceding the UnicodeEscapeSequence and the u and { } code
units, if they appear, do not contribute code points to the IdentifierName.
A UnicodeEscapeSequence cannot be used to put a code point into an
IdentifierName that would otherwise be illegal. In other words, if a \
UnicodeEscapeSequence sequence were replaced by the SourceCharacter it
contributes, the result must still be a valid IdentifierName that has the
exact same sequence of SourceCharacter elements as the original
IdentifierName. All interpretations of IdentifierName within this
specification are based upon their actual code points regardless of whether
or not an escape sequence was used to contribute any particular code point.
So I guess this is pretty clear now.
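As a rough illustration of the quoted paragraph, the sketch below decodes both escape forms into code points and then checks that the result is still a well-formed name. The function names are hypothetical, and the validity check is a loud simplification: it accepts ASCII names only, whereas the actual grammar is defined in terms of Unicode identifier properties.

```javascript
// Decode \uXXXX and \u{X...} escapes into the code points they denote.
// The backslash, the "u" and the braces contribute nothing themselves.
function decodeIdentifierName(raw) {
  return raw.replace(/\\u(?:\{([0-9a-fA-F]+)\}|([0-9a-fA-F]{4}))/g,
    function (_, braced, fixed) {
      return String.fromCodePoint(parseInt(braced || fixed, 16));
    });
}

// Simplified check (assumption): ASCII-only identifier names.
function isValidAsciiIdentifierName(decoded) {
  return /^[A-Za-z_$][A-Za-z0-9_$]*$/.test(decoded);
}

// Both spellings yield the same sequence of code points as plain "var".
console.log(decodeIdentifierName("v\\u0061r") === "var");  // true
console.log(decodeIdentifierName("\\u{76}ar") === "var");  // true

// An escape cannot smuggle in a code point that would otherwise be illegal:
// \u0020 is a space, so the decoded text is not a valid name.
console.log(isValidAsciiIdentifierName(decodeIdentifierName("a\\u0020b"))); // false
```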
Cc: [email protected]
Oct 8, 2014 Project Member #8 [email protected]
(No comment was entered for this change.)
Labels: Harmony
Oct 8, 2014 Project Member #9 [email protected]
Adding marja@ since she may have some ideas how to fix this in the parser?
Cc: [email protected]
Today (moments ago) Project Member #10 [email protected]
ATM the spec draft also says:
The ReservedWord definitions are specified as literal sequences of specific
SourceCharacter elements. A code point in a ReservedWord cannot be
expressed by a \ UnicodeEscapeSequence.
http://people.mozilla.org/~jorendorff/es6-draft.html#sec-reserved-words
So... based on this, I *don't* think
v\u0061r x = 0
eval("v\\u0061r y = 1")
should be *legal*
--
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev