[v8-dev] Scanner: disallow unicode escapes in regexp flags. (issue 700373003 by [email protected])

marja Thu, 06 Nov 2014 01:11:36 -0800

Reviewers: rossberg,

Message:
rossberg, ptal


Description:
Scanner: disallow unicode escapes in regexp flags.

The spec explicitly forbids them. V8 never handled them properly either; the
Scanner merely accepted them (it had code to add them literally to the
LiteralBuffer), and later on the RegExp constructor rejected them.

According to the spec, unicode escapes in regexp flags should be an early
error ("It is a Syntax Error if IdentifierPart contains a Unicode escape
sequence.").
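
For illustration only, the patched flag-scanning behavior can be sketched as a
standalone function. This is not the V8 code: ScanFlagsSketch and
IsAsciiIdentifierPart are hypothetical names, and the identifier-part test is
ASCII-only, whereas the real Scanner uses a Unicode-aware predicate.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical ASCII-only stand-in for the scanner's identifier-part test.
// The real Scanner uses a Unicode-aware predicate.
static bool IsAsciiIdentifierPart(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_' ||
         c == '$';
}

// Scans regexp flags in `src` starting at `pos` (just past the closing '/').
// Appends the flag characters to `*flags` and returns true on success.
// Returns false as soon as a backslash is seen, mirroring the patched branch.
bool ScanFlagsSketch(const std::string& src, std::size_t pos,
                     std::string* flags) {
  while (pos < src.size()) {
    const char c = src[pos];
    if (c == '\\') return false;           // No escapes allowed in flags.
    if (!IsAsciiIdentifierPart(c)) break;  // End of the flag run.
    flags->push_back(c);
    ++pos;
  }
  return true;
}
```

With this sketch, "/regex/gi" scanned from offset 7 succeeds, while both
"/regex/\u0069g" and "/regex/\u006g" fail at the backslash, whether or not the
escape itself is well formed.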


Note that the Scanner is still more relaxed about regexp flags than the
spec. In particular, it accepts any identifier parts (not just a small set of
letters) and doesn't check for duplicates.
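
As a rough sketch of the two checks the Scanner still skips, a stricter
validator might look like the following. FlagsAreStrictlyValid is a
hypothetical helper, not part of this patch or of V8, and the default allowed
set "gimuy" is an assumption (g, i, m from ES5 plus the y and u flags added in
ES6); adjust it to whatever the targeted spec version permits.

```cpp
#include <string>

// Hypothetical validator: every flag must come from `allowed`, and no flag
// may appear more than once. The default set "gimuy" is an assumption.
bool FlagsAreStrictlyValid(const std::string& flags,
                           const std::string& allowed = "gimuy") {
  std::string seen;
  for (char c : flags) {
    if (allowed.find(c) == std::string::npos) return false;  // Unknown flag.
    if (seen.find(c) != std::string::npos) return false;     // Duplicate.
    seen.push_back(c);
  }
  return true;
}
```

Under these assumptions "gi" passes, while "gg" (duplicate) and "gx" (unknown
letter) are rejected. This patch does not add such a check to the Scanner.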

Please review this at https://codereview.chromium.org/700373003/

Base URL: https://v8.googlecode.com/svn/branches/bleeding_edge

Affected files (+4, -27 lines):
  M src/scanner.h
  M src/scanner.cc
  M test/cctest/test-parsing.cc


Index: src/scanner.cc
diff --git a/src/scanner.cc b/src/scanner.cc

index e63239d6eb0d3afed82d7af59498cc6c9a1cbfe4..ddcd937584d50fc45fa921a25e2a5d21fcf5697e 100644

--- a/src/scanner.cc
+++ b/src/scanner.cc
@@ -1138,24 +1138,6 @@ bool Scanner::ScanRegExpPattern(bool seen_equal) {
 }


-bool Scanner::ScanLiteralUnicodeEscape() {
-  DCHECK(c0_ == '\\');
-  AddLiteralChar(c0_);
-  Advance();
-  int hex_digits_read = 0;
-  if (c0_ == 'u') {
-    AddLiteralChar(c0_);
-    while (hex_digits_read < 4) {
-      Advance();
-      if (!IsHexDigit(c0_)) break;
-      AddLiteralChar(c0_);
-      ++hex_digits_read;
-    }
-  }
-  return hex_digits_read == 4;
-}
-
-
 bool Scanner::ScanRegExpFlags() {
   // Scan regular expression flags.
   LiteralScope literal(this);
@@ -1163,10 +1145,7 @@ bool Scanner::ScanRegExpFlags() {
     if (c0_ != '\\') {
       AddLiteralCharAdvance();
     } else {
-      if (!ScanLiteralUnicodeEscape()) {
-        return false;
-      }
-      Advance();
+      return false;
     }
   }
   literal.Complete();
Index: src/scanner.h
diff --git a/src/scanner.h b/src/scanner.h

index 387d3319c167c014177499e90698047e2b4364ae..e626f206c74952c856d77d55b708b9e01661b1ac 100644

--- a/src/scanner.h
+++ b/src/scanner.h
@@ -637,10 +637,6 @@ class Scanner {
   // Decodes a Unicode escape-sequence which is part of an identifier.
   // If the escape sequence cannot be decoded the result is kBadChar.
   uc32 ScanIdentifierUnicodeEscape();
-  // Scans a Unicode escape-sequence and adds its characters,
-  // uninterpreted, to the current literal. Used for parsing RegExp
-  // flags.
-  bool ScanLiteralUnicodeEscape();

   // Return the current source position.
   int source_pos() {
Index: test/cctest/test-parsing.cc
diff --git a/test/cctest/test-parsing.cc b/test/cctest/test-parsing.cc

index 1909b3e66b4039c3e2548bad2256e1d06d3b6a84..79d76546e90b117404d191bcf59bfa2f33ec5c20 100644

--- a/test/cctest/test-parsing.cc
+++ b/test/cctest/test-parsing.cc
@@ -4302,7 +4302,9 @@ TEST(InvalidUnicodeEscapes) {
     "var foob\\u123r = 0;",
     "var \\u123roo = 0;",
     "\"foob\\u123rr\"",
-    "/regex/g\\u123r",
+    // No escapes allowed in regexp flags
+    "/regex/\\u0069g",
+    "/regex/\\u006g",
     NULL};
   RunParserSyncTest(context_data, data, kError);
 }


--
--
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev

---
You received this message because you are subscribed to the Google Groups "v8-dev" group.

To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.
