Revision: 25031
Author: [email protected]
Date: Fri Oct 31 13:03:15 2014 UTC
Log: Scanner: remove PushBack calls when we're going to return ILLEGAL.
This simplifies escape handling and makes it easier to extend escapes for ES6.
PushBack just before detecting ILLEGAL is unnecessary, since we will abort the
scanning / parsing anyway at that point, and it doesn't matter exactly where
the cursor is. The error messages with PushBack are not any better or more
correct than without.
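To illustrate the point above, here is a minimal, self-contained sketch of
hex-number scanning that simply returns -1 on the first bad digit and leaves
the cursor where it stopped. MiniScanner, current() and the string-backed
stream are hypothetical stand-ins for illustration, not V8's Scanner.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a scanner's character stream; not V8's Scanner.
class MiniScanner {
 public:
  explicit MiniScanner(const std::string& source) : source_(source), pos_(0) {
    Advance();  // Load the first character into c0_.
  }

  // Reads exactly expected_length hex digits and returns their value, or -1
  // at the first non-hex character. Nothing is pushed back on failure.
  int ScanHexNumber(int expected_length) {
    assert(expected_length <= 4);  // Keep the result well inside int range.
    int x = 0;
    for (int i = 0; i < expected_length; i++) {
      int d = HexValue(c0_);
      if (d < 0) return -1;
      x = x * 16 + d;
      Advance();
    }
    return x;
  }

  // The current (not yet consumed) character, or -1 at end of input.
  int current() const { return c0_; }

 private:
  static int HexValue(int c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
  }

  void Advance() {
    c0_ = pos_ < source_.size()
              ? static_cast<unsigned char>(source_[pos_++])
              : -1;
  }

  std::string source_;
  std::size_t pos_;
  int c0_;
};
```

In this sketch a caller that sees -1 is expected to stop scanning, so the
cursor position after a failure is never inspected for recovery.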
In addition: remove a comment about handling invalid escapes gracefully when
we no longer do. (*)
This CL includes a behavioral change: For input "var r = /foobar/g\urrrr;" we
used to report "unexpected_token: ILLEGAL" for "\u", but now we report
malformed_regexp_flags, which is a more correct error message. (Note that the
code for reporting invalid_regexp_flags was dead, and invalid_regexp_flags is
not the right error message.)
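The behavioral change can be sketched roughly as follows. This is a
simplified model, not the V8 code: IsCompleteUnicodeEscape, ScanFlags and
FlagsMessageKey are hypothetical names, and only ASCII letters, digits, '_'
and '$' are treated as flag characters.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: true if text has "\u" plus four hex digits at pos.
bool IsCompleteUnicodeEscape(const std::string& text, std::size_t pos) {
  if (pos + 6 > text.size()) return false;
  if (text[pos] != '\\' || text[pos + 1] != 'u') return false;
  for (std::size_t i = pos + 2; i < pos + 6; i++) {
    if (!std::isxdigit(static_cast<unsigned char>(text[i]))) return false;
  }
  return true;
}

// Simplified flag scan over the text that follows the closing '/'.
// Returns false as soon as an incomplete escape is seen; otherwise stops at
// the first character that cannot be part of the flags and returns true.
bool ScanFlags(const std::string& text, std::size_t* pos) {
  while (*pos < text.size()) {
    char c = text[*pos];
    if (c == '\\') {
      if (!IsCompleteUnicodeEscape(text, *pos)) return false;
      *pos += 6;
    } else if (std::isalnum(static_cast<unsigned char>(c)) || c == '_' ||
               c == '$') {
      ++*pos;
    } else {
      break;
    }
  }
  return true;
}

// Maps a failed flag scan to the message key named in this CL; an empty
// string means no error was detected in the flags.
std::string FlagsMessageKey(const std::string& text_after_pattern) {
  std::size_t pos = 0;
  return ScanFlags(text_after_pattern, &pos) ? "" : "malformed_regexp_flags";
}
```

With this model, the text "g\urrrr;" after the pattern yields
malformed_regexp_flags, while "g;" yields no error.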
Note that V8 is more relaxed about unicode escapes in regexp flags than ES6
(see
http://people.mozilla.org/~jorendorff/es6-draft.html#sec-regular-expressions)
and this CL doesn't change that. (V8 accepts any \uxxxx; the ES6 spec says only
a certain value range is acceptable.)
(*) Code archaeology:
Originally, doing PushBack in ScanHexEscape made sense (see e.g.
https://codereview.chromium.org/5063003/diff/6001/src/prescanner.h ), since we
wouldn't return ILLEGAL but treat an invalid escape sequence "\uxxxx" as
"uxxxx". (The repo at that point contains another instance of the same
function, from the initial commit. The logic is the same.)
This behavior was changed in a "renaming" commit
https://codereview.chromium.org/7739020.
BUG=
[email protected]
Review URL: https://codereview.chromium.org/684873002
https://code.google.com/p/v8/source/detail?r=25031
Modified:
/branches/bleeding_edge/src/messages.js
/branches/bleeding_edge/src/preparser.h
/branches/bleeding_edge/src/scanner.cc
/branches/bleeding_edge/test/cctest/test-parsing.cc
=======================================
--- /branches/bleeding_edge/src/messages.js Thu Oct 30 14:21:27 2014 UTC
+++ /branches/bleeding_edge/src/messages.js Fri Oct 31 13:03:15 2014 UTC
@@ -20,6 +20,7 @@
unexpected_strict_reserved: ["Unexpected strict mode reserved word"],
unexpected_eos: ["Unexpected end of input"],
malformed_regexp: ["Invalid regular expression: /", "%0", "/: ", "%1"],
+ malformed_regexp_flags: ["Invalid regular expression flags"],
unterminated_regexp: ["Invalid regular expression: missing /"],
regexp_flags: ["Cannot supply flags when constructing one RegExp from another"],
incompatible_method_receiver: ["Method ", "%0", " called on incompatible receiver ", "%1"],
=======================================
--- /branches/bleeding_edge/src/preparser.h Tue Oct 28 12:23:26 2014 UTC
+++ /branches/bleeding_edge/src/preparser.h Fri Oct 31 13:03:15 2014 UTC
@@ -1704,7 +1704,7 @@
IdentifierT js_pattern = this->GetNextSymbol(scanner());
if (!scanner()->ScanRegExpFlags()) {
Next();
- ReportMessage("invalid_regexp_flags");
+ ReportMessage("malformed_regexp_flags");
*ok = false;
return Traits::EmptyExpression();
}
=======================================
--- /branches/bleeding_edge/src/scanner.cc Tue Oct 21 08:25:14 2014 UTC
+++ /branches/bleeding_edge/src/scanner.cc Fri Oct 31 13:03:15 2014 UTC
@@ -57,20 +57,10 @@
uc32 Scanner::ScanHexNumber(int expected_length) {
DCHECK(expected_length <= 4); // prevent overflow
- uc32 digits[4] = { 0, 0, 0, 0 };
uc32 x = 0;
for (int i = 0; i < expected_length; i++) {
- digits[i] = c0_;
int d = HexValue(c0_);
if (d < 0) {
- // According to ECMA-262, 3rd, 7.8.4, page 18, these hex escapes
- // should be illegal, but other JS VMs just return the
- // non-escaped version of the original character.
-
- // Push back digits that we have advanced past.
- for (int j = i-1; j >= 0; j--) {
- PushBack(digits[j]);
- }
return -1;
}
x = x * 16 + d;
@@ -894,9 +884,7 @@
Advance();
if (c0_ != 'u') return -1;
Advance();
- uc32 result = ScanHexNumber(4);
- if (result < 0) PushBack('u');
- return result;
+ return ScanHexNumber(4);
}
@@ -1145,31 +1133,19 @@
bool Scanner::ScanLiteralUnicodeEscape() {
DCHECK(c0_ == '\\');
- uc32 chars_read[6] = {'\\', 'u', 0, 0, 0, 0};
+ AddLiteralChar(c0_);
Advance();
- int i = 1;
+ int hex_digits_read = 0;
if (c0_ == 'u') {
- i++;
- while (i < 6) {
+ AddLiteralChar(c0_);
+ while (hex_digits_read < 4) {
Advance();
if (!IsHexDigit(c0_)) break;
- chars_read[i] = c0_;
- i++;
+ AddLiteralChar(c0_);
+ ++hex_digits_read;
}
}
- if (i < 6) {
- // Incomplete escape. Undo all advances and return false.
- while (i > 0) {
- i--;
- PushBack(chars_read[i]);
- }
- return false;
- }
- // Complete escape. Add all chars to current literal buffer.
- for (int i = 0; i < 6; i++) {
- AddLiteralChar(chars_read[i]);
- }
- return true;
+ return hex_digits_read == 4;
}
@@ -1181,7 +1157,7 @@
AddLiteralCharAdvance();
} else {
if (!ScanLiteralUnicodeEscape()) {
- break;
+ return false;
}
Advance();
}
=======================================
--- /branches/bleeding_edge/test/cctest/test-parsing.cc Tue Oct 28 13:23:54 2014 UTC
+++ /branches/bleeding_edge/test/cctest/test-parsing.cc Fri Oct 31 13:03:15 2014 UTC
@@ -4253,3 +4253,17 @@
RunParserSyncTest(context_data, data, kError, NULL, 0, always_flags,
arraysize(always_flags));
}
+
+
+TEST(InvalidUnicodeEscapes) {
+ const char* context_data[][2] = {{"", ""},
+ {"'use strict';", ""},
+ {NULL, NULL}};
+ const char* data[] = {
+ "var foob\\u123r = 0;",
+ "var \\u123roo = 0;",
+ "\"foob\\u123rr\"",
+ "/regex/g\\u123r",
+ NULL};
+ RunParserSyncTest(context_data, data, kError);
+}
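
For readers who want to try the new escape-scanning logic outside V8, the
following is a rough standalone model of the rewritten
ScanLiteralUnicodeEscape. EscapeScanner, Current() and the std::string literal
buffer are assumptions made for illustration; only the control flow follows
the diff above.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical miniature of a scanner with a literal buffer; not V8's class.
struct EscapeScanner {
  std::string source;
  std::size_t pos;      // Index of the current character.
  std::string literal;  // Characters added to the literal so far.

  explicit EscapeScanner(const std::string& src) : source(src), pos(0) {}

  // The current character, or -1 at end of input.
  int Current() const {
    return pos < source.size() ? static_cast<unsigned char>(source[pos]) : -1;
  }

  void Advance() {
    if (pos < source.size()) ++pos;
  }

  // Expects Current() == '\\'. Characters are appended to the literal as
  // they are read; on failure the partial escape stays in the buffer, which
  // is harmless if the caller aborts scanning. On success the cursor rests
  // on the last hex digit and the caller advances past it.
  bool ScanLiteralUnicodeEscape() {
    assert(Current() == '\\');
    literal.push_back('\\');
    Advance();
    int hex_digits_read = 0;
    if (Current() == 'u') {
      literal.push_back('u');
      while (hex_digits_read < 4) {
        Advance();
        int c = Current();
        if (c < 0 || !std::isxdigit(c)) break;
        literal.push_back(static_cast<char>(c));
        ++hex_digits_read;
      }
    }
    return hex_digits_read == 4;
  }
};
```

Under these assumptions, scanning the escape in "\u123r" returns false with
the cursor on 'r', matching the inputs exercised by InvalidUnicodeEscapes.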