Reviewers: rossberg,

Message:
rossberg, ptal

Description:
Scanner: disallow unicode escapes in regexp flags.

The spec explicitly forbids them. V8 never handled them properly either: the
Scanner accepted them (it had code to add them literally to the
LiteralBuffer), and the RegExp constructor rejected them later.

According to the spec, a unicode escape in regexp flags is an early error
("It is a Syntax Error if IdentifierPart contains a Unicode escape sequence.").

Note that the Scanner is still more relaxed about regexp flags than the
spec: it accepts any identifier part (not just a small set of letters) and
doesn't check for duplicates.
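
To illustrate the resulting behavior, here is a minimal standalone sketch of
the flag-scanning rule described above. It is not the V8 code: the names
ScanRegExpFlagsSketch and IsAsciiIdentifierPart are hypothetical, the
identifier-part check is ASCII-only (an assumption made to keep the example
short), and the input is a plain string rather than the Scanner's character
stream.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical stand-in for the scanner's identifier-part predicate.
// Assumption: ASCII only; the real Scanner handles Unicode identifier parts.
bool IsAsciiIdentifierPart(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_' || c == '$';
}

// Scans regexp flags starting at `pos`, the index just past the closing '/'.
// Returns the collected flags, or std::nullopt if a backslash (the start of
// an escape sequence) appears where a flag character is expected.
std::optional<std::string> ScanRegExpFlagsSketch(std::string_view src,
                                                 std::size_t pos) {
  assert(pos <= src.size());
  std::string flags;
  while (pos < src.size()) {
    char c = src[pos];
    if (c == '\\') return std::nullopt;     // No escapes allowed in flags.
    if (!IsAsciiIdentifierPart(c)) break;   // End of the flags.
    flags.push_back(c);                     // Any identifier part is accepted,
    ++pos;                                  // duplicates included.
  }
  return flags;
}
```

With this sketch, "/regex/gi" scanned from index 7 yields "gi", while both
"/regex/\u0069g" and "/regex/\u006g" are rejected regardless of whether the
escape itself is well formed. Duplicates such as "gg" are still accepted,
matching the relaxed behavior noted above.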

Please review this at https://codereview.chromium.org/700373003/

Base URL: https://v8.googlecode.com/svn/branches/bleeding_edge

Affected files (+4, -27 lines):
  M src/scanner.h
  M src/scanner.cc
  M test/cctest/test-parsing.cc


Index: src/scanner.cc
diff --git a/src/scanner.cc b/src/scanner.cc
index e63239d6eb0d3afed82d7af59498cc6c9a1cbfe4..ddcd937584d50fc45fa921a25e2a5d21fcf5697e 100644
--- a/src/scanner.cc
+++ b/src/scanner.cc
@@ -1138,24 +1138,6 @@ bool Scanner::ScanRegExpPattern(bool seen_equal) {
 }


-bool Scanner::ScanLiteralUnicodeEscape() {
-  DCHECK(c0_ == '\\');
-  AddLiteralChar(c0_);
-  Advance();
-  int hex_digits_read = 0;
-  if (c0_ == 'u') {
-    AddLiteralChar(c0_);
-    while (hex_digits_read < 4) {
-      Advance();
-      if (!IsHexDigit(c0_)) break;
-      AddLiteralChar(c0_);
-      ++hex_digits_read;
-    }
-  }
-  return hex_digits_read == 4;
-}
-
-
 bool Scanner::ScanRegExpFlags() {
   // Scan regular expression flags.
   LiteralScope literal(this);
@@ -1163,10 +1145,7 @@ bool Scanner::ScanRegExpFlags() {
     if (c0_ != '\\') {
       AddLiteralCharAdvance();
     } else {
-      if (!ScanLiteralUnicodeEscape()) {
-        return false;
-      }
-      Advance();
+      return false;
     }
   }
   literal.Complete();
Index: src/scanner.h
diff --git a/src/scanner.h b/src/scanner.h
index 387d3319c167c014177499e90698047e2b4364ae..e626f206c74952c856d77d55b708b9e01661b1ac 100644
--- a/src/scanner.h
+++ b/src/scanner.h
@@ -637,10 +637,6 @@ class Scanner {
   // Decodes a Unicode escape-sequence which is part of an identifier.
   // If the escape sequence cannot be decoded the result is kBadChar.
   uc32 ScanIdentifierUnicodeEscape();
-  // Scans a Unicode escape-sequence and adds its characters,
-  // uninterpreted, to the current literal. Used for parsing RegExp
-  // flags.
-  bool ScanLiteralUnicodeEscape();

   // Return the current source position.
   int source_pos() {
Index: test/cctest/test-parsing.cc
diff --git a/test/cctest/test-parsing.cc b/test/cctest/test-parsing.cc
index 1909b3e66b4039c3e2548bad2256e1d06d3b6a84..79d76546e90b117404d191bcf59bfa2f33ec5c20 100644
--- a/test/cctest/test-parsing.cc
+++ b/test/cctest/test-parsing.cc
@@ -4302,7 +4302,9 @@ TEST(InvalidUnicodeEscapes) {
     "var foob\\u123r = 0;",
     "var \\u123roo = 0;",
     "\"foob\\u123rr\"",
-    "/regex/g\\u123r",
+    // No escapes allowed in regexp flags
+    "/regex/\\u0069g",
+    "/regex/\\u006g",
     NULL};
   RunParserSyncTest(context_data, data, kError);
 }

