[Caja] [google-caja] r5622 committed - fix backslash-newline handling in the JS parser...

google-caja Thu, 31 Oct 2013 15:25:06 -0700

Revision: 5622
Author:   [email protected]
Date:     Thu Oct 31 22:24:13 2013 UTC
Log:      fix backslash-newline handling in the JS parser
https://codereview.appspot.com/20470044


The JS lexer elides backslash-newline sequences at an early stage,
before tokenizing. This seems to be fantasy: it's not in ECMAScript,
and I can't find any JS implementation that treats backslash-newline
as a general line continuation.

That behavior causes unexpected effects when a // comment ends with
a backslash, like in escodegen.js as described in issue 1868.
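
To illustrate the expected behavior, here is a minimal sketch (not part
of this CL). It assumes a standards-conforming JS engine and uses the
built-in Function constructor only as a convenient way to compile a
source string; the variable names are made up for the example.

```javascript
// Source text being compiled:
//   // trailing backslash \
//   return 42;
// If backslash-newline were a continuation, the return statement would be
// swallowed by the comment and the function would return undefined.
const body = '// trailing backslash \\\nreturn 42;';
const fn = new Function(body);

console.log(fn()); // 42
```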

So, this CL eliminates the weirdness.

InputElementSplitter is where the backslash-newline elision happens.
Deleting that is straightforward.

ECMAScript does say that backslash-newline inside a string literal gets
elided, so the rest of this CL is about supporting that.
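
For reference, a small sketch of what an engine does with such a
literal (an illustration only, assuming a standards-conforming engine;
the Function constructor is used just to evaluate a source string):

```javascript
// Source text being compiled:
//   return 'string \
//   continuation';
const value = new Function("return 'string \\\ncontinuation';")();

// The backslash and the newline are both dropped from the string value.
console.log(JSON.stringify(value)); // "string continuation"
```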

Our JS lexical tokens hold the original source code's char sequence,
so at the lexer level nothing needs to be done with the backslash-newline,
except for fixing up the lexer tests to match reality.

At the JS parser level, StringLiteral nodes have the logic to convert
the source code text to an actual value. Everyone defers to that,
and adding handling of backslash-newline there is straightforward.
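
The conversion can be sketched roughly as follows. This is a
hypothetical illustration in JavaScript, not the code in
StringLiteral.java: the names unquote, SIMPLE_ESCAPES and
LINE_TERMINATORS are invented here, and the sketch assumes a
well-formed, quoted literal and ignores octal escapes.

```javascript
// Hypothetical sketch; not Caja's StringLiteral implementation.
const SIMPLE_ESCAPES = { b: '\b', f: '\f', n: '\n', r: '\r', t: '\t', v: '\v' };
const LINE_TERMINATORS = new Set(['\n', '\r', '\r\n', '\u2028', '\u2029']);

function unquote(literal) {
  // Drop the surrounding quotes; assumes the literal is well formed.
  const body = literal.slice(1, -1);
  return body.replace(
    /\\(?:u([0-9a-fA-F]{4})|x([0-9a-fA-F]{2})|(\r\n|[^]))/g,
    (match, unicodeHex, byteHex, ch) => {
      if (unicodeHex !== undefined) {
        return String.fromCharCode(parseInt(unicodeHex, 16));
      }
      if (byteHex !== undefined) {
        return String.fromCharCode(parseInt(byteHex, 16));
      }
      // Backslash followed by a line terminator is elided entirely.
      if (LINE_TERMINATORS.has(ch)) { return ''; }
      // Known single-character escapes; anything else stands for itself.
      return Object.prototype.hasOwnProperty.call(SIMPLE_ESCAPES, ch)
          ? SIMPLE_ESCAPES[ch] : ch;
    });
}

console.log(JSON.stringify(unquote("'foo\\\nbar'"))); // "foobar"
console.log(JSON.stringify(unquote("'foo\\nbar'")));  // "foo\nbar"
```

The first call mirrors the new StringLiteralTest case: the input is the
literal text 'foo, backslash, newline, bar', and the result is foobar.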

ECMAScript is strict about "use strict" directives. A directive that
contains an escape sequence or a backslash-newline is not recognized
as "use strict". Our existing logic handles that fine. I just added
some testcases to verify that.
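
A sketch of how this shows up in an engine (illustrative only, assuming
a standards-conforming engine; the function names are made up, and the
Function constructor is used so the result does not depend on whether
the surrounding code is strict):

```javascript
// In strict code a plain call leaves `this` undefined; in sloppy code
// `this` is the global object, so the comparison is false.
const plain = new Function('"use strict"; return this === undefined;');
const escaped = new Function('"use\\x20strict"; return this === undefined;');
const continued = new Function('"use \\\nstrict"; return this === undefined;');

console.log(plain());     // true:  recognized as a Use Strict Directive
console.log(escaped());   // false: contains an escape sequence
console.log(continued()); // false: contains a backslash-newline
```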

Stray backslashes in JS will become a WORD token or part of a WORD
token, which is consistent with how we handle \u escapes. These will
be rejected at the parser level. I don't see any particular reason
to reject backslashes at the lexer level, so I left that alone.
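
For comparison, a standards-conforming engine also rejects a stray
backslash-newline outside strings and comments. A minimal sketch (the
helper name parses is hypothetical, and the Function constructor is
used only to compile the source without running it):

```javascript
// Returns true if the source compiles as a function body, false on a
// SyntaxError. Any other error is rethrown.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) { return false; }
    throw e;
  }
}

console.log(parses('foo\\\nbar;'));             // false
console.log(parses('throw \\\n new Error()'));  // false
console.log(parses('foo;\nbar;'));              // true
```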

R=kpreid2


http://code.google.com/p/google-caja/source/detail?r=5622

Modified:
 /trunk/src/com/google/caja/lexer/InputElementSplitter.java
 /trunk/src/com/google/caja/parser/js/StringLiteral.java
 /trunk/src/com/google/caja/render/TokenClassification.java
 /trunk/tests/com/google/caja/lexer/lexergolden1.txt
 /trunk/tests/com/google/caja/lexer/lexertest1.js
 /trunk/tests/com/google/caja/parser/js/ParserTest.java
 /trunk/tests/com/google/caja/parser/js/StringLiteralTest.java
 /trunk/tests/com/google/caja/parser/js/parsergolden10.txt
 /trunk/tests/com/google/caja/parser/js/parsertest10.js
 /trunk/tests/com/google/caja/render/JsMinimalPrinterTest.java

=======================================

--- /trunk/src/com/google/caja/lexer/InputElementSplitter.java Fri Oct 9 00:37:37 2009 UTC
+++ /trunk/src/com/google/caja/lexer/InputElementSplitter.java Thu Oct 31 22:24:13 2013 UTC

@@ -24,7 +24,7 @@
  * @author [email protected]
  */
 final class InputElementSplitter extends AbstractTokenStream<JsTokenType> {
-  private final DecodingCharProducer p;
+  private final CharProducer p;
   /**
    * A trie used to split a chunk of text into punctuation tokens and
    * non-punctuation tokens.
@@ -49,7 +49,7 @@

   public InputElementSplitter(CharProducer p, PunctuationTrie<?> punctuation,
                               boolean isQuasiliteral) {
-    this.p = lineContinuingCharProducer(p);
+    this.p = p;
     this.punctuation = punctuation;
     this.isQuasiliteral = isQuasiliteral;
   }
@@ -71,11 +71,6 @@
     if (start < limit && JsLexer.isJsSpace(buf[start])) {
       ++start;
       while (start < limit && JsLexer.isJsSpace(buf[start])) {
-        if (tokenBreak(start)) {
-          p.consumeTo(start);
-          return Token.instance(
-              "\\", JsTokenType.LINE_CONTINUATION, p.getCurrentPosition());
-        }
         ++start;
       }
       p.consumeTo(start);
@@ -96,9 +91,8 @@
           if (ch2 == ch && !escaped) {
             closed = true;
             break;
-          } else if (JsLexer.isJsLineSeparator(ch2)) {
+          } else if (!escaped && JsLexer.isJsLineSeparator(ch2)) {
             // will register as an unterminated string token below
-            break;
           }
           escaped = !escaped && ch2 == '\\';
         }
@@ -135,7 +129,7 @@
                   ++end;
                   break;
                 } else {
-                  star = (ch2 == '*') && !tokenBreak(end);
+                  star = (ch2 == '*');
                 }
               }
               if (!closed) {
@@ -235,7 +229,7 @@
             if (isQuasi && (ch2 == '*' || ch2 == '+' || ch2 == '?')) {
               ++end;
               break;
-            } else if (JsLexer.isJsSpace(ch2) || tokenBreak(end)
+            } else if (JsLexer.isJsSpace(ch2)
                 || '\'' == ch2 || '"' == ch2
                 || punctuation.contains(ch2)) {
               break;
@@ -306,64 +300,10 @@
     while (end < limit) {
       char ch = buf[end];
       PunctuationTrie<?> t2 = t.lookup(ch);
-      if (null == t2 || !t2.isTerminal() || tokenBreak(end)) { break; }
+      if (null == t2 || !t2.isTerminal()) { break; }
       ++end;
       t = t2;
     }
     return end;
   }
-
-  /**
-   * True if the given offset into p fell on a line continuation.
-   * This helps us distinguish between <pre>
-   * a++
-   * b
-   * </pre>
-   * and
-   * <pre>
-   * a+\
-   * +
-   * b
-   * </pre>
-   * where the fist is equivalent to <pre>{ a ++; b; }</pre> and the latter to
-   * <pre>{ (a + (+b)); }</pre>.
-   */
-  private boolean tokenBreak(int offset) {
-    if (offset == p.getLimit()) { return false; }
-    int nUnderlyingChars = (
-        p.getUnderlyingOffset(offset + 1) - p.getUnderlyingOffset(offset));
-    return nUnderlyingChars != 1;
-  }
-
-  DecodingCharProducer lineContinuingCharProducer(CharProducer p) {
-    return DecodingCharProducer.make(new DecodingCharProducer.Decoder() {
-      @Override
-      void decode(char[] chars, int offset, int limit) {
-        int end = offset;
-        while (end + 1 < limit
-               && chars[end] == '\\'
-               && (chars[end + 1] == '\r' || chars[end + 1] == '\n')) {
-          if (chars[end + 1] == '\r'
-              && chars[end + 2] < limit && chars[end + 2] == '\n') {
-            end += 3;
-          } else {
-            end += 2;
-          }
-        }
-        // TODO: can this first clause go away?
-        if (end == offset) {
-          this.end = offset + 1;
-          this.codePoint = chars[offset];
-        } else if (end <= limit) {  // Skipped one or more line continuations
-          this.end = end + 1;
-          this.codePoint = chars[end];
-        } else {
-          this.end = end;
-          // If a run of line escapes runs right up until the end-of-file,
-          // pretend there is a newline at the end of file.
-          this.codePoint = '\n';
-        }
-      }
-    }, p);
-  }
 }
=======================================

--- /trunk/src/com/google/caja/parser/js/StringLiteral.java Wed Jun 13 06:08:32 2012 UTC
+++ /trunk/src/com/google/caja/parser/js/StringLiteral.java Thu Oct 31 22:24:13 2013 UTC

@@ -149,6 +149,7 @@

     StringBuffer sb = new StringBuffer(s.length());
     do {
+      m.appendReplacement(sb, "");
       String g;
       char repl;
       if (null != (g = m.group(1))) {  // unicode escape
@@ -166,10 +167,10 @@
           case 'f': repl = '\f'; break;
           case 't': repl = '\t'; break;
           case 'v': repl = '\u000b'; break;
+          case '\n': continue;      // backslash newline is elided
           default: repl = ch; break;
         }
       }
-      m.appendReplacement(sb, "");
       sb.append(repl);
     } while (m.find());
     m.appendTail(sb);
=======================================

--- /trunk/src/com/google/caja/render/TokenClassification.java Mon Apr 4 22:55:19 2011 UTC
+++ /trunk/src/com/google/caja/render/TokenClassification.java Thu Oct 31 22:24:13 2013 UTC

@@ -51,8 +51,6 @@
             return COMMENT;
           }
           if (ch1 == '/') {
-            // This would escape the following newline.
-            if (chLast == '\\') { throw new IllegalArgumentException(); }
             return COMMENT;
           }
           if (n > 2) {  // /= is 2 characters and / is 1
=======================================

--- /trunk/tests/com/google/caja/lexer/lexergolden1.txt Mon Mar 9 22:39:44 2009 UTC
+++ /trunk/tests/com/google/caja/lexer/lexergolden1.txt Thu Oct 31 22:24:13 2013 UTC

@@ -36,7 +36,9 @@
 PUNC [;]: lexertest1.js:9+20@119 - 21@120
 WORD [s]: lexertest1.js:10+1@121 - 2@122
 PUNC [=]: lexertest1.js:10+3@123 - 4@124
-STRI ['a string that spans multiple physical lines but not logical ones']: lexertest1.js:10+5@125 - 12+22@195
+STRI ['a string that \
+spans multiple physical lines \
+but not logical ones']: lexertest1.js:10+5@125 - 12+22@195
 PUNC [;]: lexertest1.js:12+22@195 - 23@196
 STRI ['a string with "double quotes" inside and ']: lexertest1.js:13+1@197 - 44@240
 PUNC [+]: lexertest1.js:13+45@241 - 46@242
@@ -49,98 +51,96 @@
 KEYW [var]: lexertest1.js:16+1@339 - 4@342
 WORD [s]: lexertest1.js:16+5@343 - 6@344
 PUNC [=]: lexertest1.js:16+7@345 - 8@346
-STRI ["double quotes work inside strings too.pretty well actually"]: lexertest1.js:16+9@347 - 17+22@409
+STRI ["double quotes work inside strings too.\
+pretty well actually"]: lexertest1.js:16+9@347 - 17+22@409
 PUNC [;]: lexertest1.js:17+22@409 - 23@410
-COMM [// a line comment that oddly spans multiple physical lines]: lexertest1.js:19+1@412 - 20+30@472
-COMM [/* multiline comments have
-   no need for such silliness */]: lexertest1.js:22+1@474 - 23+33@533
+COMM [// a line comment that ends with a \]: lexertest1.js:19+1@412 - 37@448
+STRI ['does not absorb the next line']: lexertest1.js:20+1@449 - 32@480
+PUNC [;]: lexertest1.js:20+32@480 - 33@481
+COMM [/* multiline comments also *\
+/ "don't care about backslash newline" */]: lexertest1.js:22+1@483 - 23+42@554
 COMM [/*/ try and confuse the lexer
     with a star-slash before
     the end of the comment.
- */]: lexertest1.js:25+1@535 - 28+4@625
-COMM [/* comments can have embedded "strings" */]: lexertest1.js:30+1@627 - 43@669
-STRI ["and /*vice-versa*/ "]: lexertest1.js:31+1@670 - 22@691
-WORD [we]: lexertest1.js:33+1@693 - 3@695
-PUNC [(]: lexertest1.js:33+3@695 - 4@696
-WORD [need]: lexertest1.js:33+4@696 - 8@700
-PUNC [-]: lexertest1.js:33+9@701 - 10@702
-WORD [to]: lexertest1.js:33+11@703 - 13@705
-PUNC [+]: lexertest1.js:33+14@706 - 15@707
-PUNC [{]: lexertest1.js:33+16@708 - 17@709
-PUNC [{]: lexertest1.js:33+17@709 - 18@710
-PUNC [{]: lexertest1.js:33+18@710 - 19@711
-WORD [test]: lexertest1.js:33+19@711 - 23@715
-WORD [punctuation]: lexertest1.js:33+24@716 - 35@727
-WORD [thoroughly]: lexertest1.js:33+36@728 - 46@738
-PUNC [}]: lexertest1.js:33+46@738 - 47@739
-PUNC [}]: lexertest1.js:33+47@739 - 48@740
-PUNC [}]: lexertest1.js:33+48@740 - 49@741
-PUNC [)]: lexertest1.js:33+49@741 - 50@742
-PUNC [;]: lexertest1.js:33+50@742 - 51@743
-WORD [left]: lexertest1.js:35+1@745 - 5@749
-PUNC [<<=]: lexertest1.js:35+6@750 - 9@753
-WORD [shift_amount]: lexertest1.js:35+10@754 - 22@766
-PUNC [;]: lexertest1.js:35+22@766 - 23@767
-FLOA [14.0005e-6]: lexertest1.js:37+1@769 - 11@779
-WORD [is]: lexertest1.js:37+12@780 - 14@782
-WORD [one]: lexertest1.js:37+15@783 - 18@786
-WORD [token]: lexertest1.js:37+19@787 - 24@792
-PUNC [?]: lexertest1.js:37+24@792 - 25@793
-COMM [// check that exponentials with signs extracted properly during splitting]: lexertest1.js:39+1@795 - 74@868
-KEYW [var]: lexertest1.js:40+1@869 - 4@872
-WORD [num]: lexertest1.js:40+5@873 - 8@876
-PUNC [=]: lexertest1.js:40+9@877 - 10@878
-INTE [1000]: lexertest1.js:40+11@879 - 15@883
-PUNC [-]: lexertest1.js:40+15@883 - 16@884
-FLOA [1e+2]: lexertest1.js:40+16@884 - 20@888
-PUNC [*]: lexertest1.js:40+20@888 - 21@889
-INTE [2]: lexertest1.js:40+21@889 - 22@890
-PUNC [;]: lexertest1.js:40+22@890 - 23@891
-COMM [// check that dotted identifiers split, but decimal numbers not.]: lexertest1.js:42+1@893 - 65@957
-WORD [foo]: lexertest1.js:43+1@958 - 4@961
-PUNC [.]: lexertest1.js:43+4@961 - 5@962
-WORD [bar]: lexertest1.js:43+5@962 - 8@965
-PUNC [=]: lexertest1.js:43+9@966 - 10@967
-FLOA [4.0]: lexertest1.js:43+11@968 - 14@971
-PUNC [;]: lexertest1.js:43+14@971 - 15@972
-WORD [foo2]: lexertest1.js:44+1@973 - 5@977
-PUNC [.]: lexertest1.js:44+5@977 - 6@978
-WORD [bar]: lexertest1.js:44+6@978 - 9@981
-PUNC [=]: lexertest1.js:44+10@982 - 11@983
-WORD [baz]: lexertest1.js:44+12@984 - 15@987
-PUNC [;]: lexertest1.js:44+15@987 - 16@988
-FLOA [.5]: lexertest1.js:46+1@990 - 3@992
-COMM [// a numeric token]: lexertest1.js:46+5@994 - 23@1012
-COMM [// test how line continuations affect punctuation]: lexertest1.js:48+1@1014 - 50@1063
-INTE [1]: lexertest1.js:49+1@1064 - 2@1065
-PUNC [+]: lexertest1.js:49+2@1065 - 3@1066
-PUNC [+]: lexertest1.js:49+3@1066 - 50+2@1069
-INTE [2]: lexertest1.js:50+2@1069 - 3@1070
-PUNC [;]: lexertest1.js:50+3@1070 - 4@1071
-COMM [// should parse as 1 + + 2, not 1 ++ 2;]: lexertest1.js:51+1@1072 - 40@1111
-WORD [foo]: lexertest1.js:52+1@1112 - 4@1115
-WORD [bar]: lexertest1.js:52+4@1115 - 53+4@1120
-PUNC [;]: lexertest1.js:53+4@1120 - 5@1121
-WORD [elipsis]: lexertest1.js:55+1@1123 - 8@1130
-PUNC [...]: lexertest1.js:55+8@1130 - 11@1133
-PUNC [;]: lexertest1.js:55+11@1133 - 12@1134
-COMM [/* and extending the example at line 30 " interleaved */]: lexertest1.js:57+1@1136 - 57@1192
-STRI [" */"]: lexertest1.js:57+58@1193 - 58+2@1200
-WORD [also]: lexertest1.js:58+2@1200 - 6@1204
-COMM [/* " /* */]: lexertest1.js:58+7@1205 - 17@1215
-COMM [// Backslashes in character sets do not end regexs.]: lexertest1.js:60+1@1217 - 52@1268
-WORD [r]: lexertest1.js:61+1@1269 - 2@1270
-PUNC [=]: lexertest1.js:61+3@1271 - 4@1272
-REGE [/./]: lexertest1.js:61+5@1273 - 8@1276
-PUNC [,]: lexertest1.js:61+8@1276 - 9@1277
-REGE [/\//]: lexertest1.js:61+10@1278 - 14@1282
-PUNC [,]: lexertest1.js:61+14@1282 - 15@1283
-REGE [/[/]/]: lexertest1.js:61+16@1284 - 21@1289
-PUNC [,]: lexertest1.js:61+21@1289 - 22@1290
-REGE [/[\/]\//]: lexertest1.js:61+23@1291 - 31@1299
-WORD [isNaN]: lexertest1.js:63+1@1301 - 6@1306
-PUNC [(]: lexertest1.js:63+6@1306 - 7@1307
-WORD [NaN]: lexertest1.js:63+7@1307 - 10@1310
-PUNC [)]: lexertest1.js:63+10@1310 - 11@1311
-PUNC [;]: lexertest1.js:63+11@1311 - 12@1312
-COMM [// leave some whitespace at the end of this file ]: lexertest1.js:65+1@1314 - 51@1364
+ */]: lexertest1.js:25+1@556 - 28+4@646
+COMM [/* comments can have embedded "strings" */]: lexertest1.js:30+1@648 - 43@690
+STRI ["and /*vice-versa*/ "]: lexertest1.js:31+1@691 - 22@712
+WORD [we]: lexertest1.js:33+1@714 - 3@716
+PUNC [(]: lexertest1.js:33+3@716 - 4@717
+WORD [need]: lexertest1.js:33+4@717 - 8@721
+PUNC [-]: lexertest1.js:33+9@722 - 10@723
+WORD [to]: lexertest1.js:33+11@724 - 13@726
+PUNC [+]: lexertest1.js:33+14@727 - 15@728
+PUNC [{]: lexertest1.js:33+16@729 - 17@730
+PUNC [{]: lexertest1.js:33+17@730 - 18@731
+PUNC [{]: lexertest1.js:33+18@731 - 19@732
+WORD [test]: lexertest1.js:33+19@732 - 23@736
+WORD [punctuation]: lexertest1.js:33+24@737 - 35@748
+WORD [thoroughly]: lexertest1.js:33+36@749 - 46@759
+PUNC [}]: lexertest1.js:33+46@759 - 47@760
+PUNC [}]: lexertest1.js:33+47@760 - 48@761
+PUNC [}]: lexertest1.js:33+48@761 - 49@762
+PUNC [)]: lexertest1.js:33+49@762 - 50@763
+PUNC [;]: lexertest1.js:33+50@763 - 51@764
+WORD [left]: lexertest1.js:35+1@766 - 5@770
+PUNC [<<=]: lexertest1.js:35+6@771 - 9@774
+WORD [shift_amount]: lexertest1.js:35+10@775 - 22@787
+PUNC [;]: lexertest1.js:35+22@787 - 23@788
+FLOA [14.0005e-6]: lexertest1.js:37+1@790 - 11@800
+WORD [is]: lexertest1.js:37+12@801 - 14@803
+WORD [one]: lexertest1.js:37+15@804 - 18@807
+WORD [token]: lexertest1.js:37+19@808 - 24@813
+PUNC [?]: lexertest1.js:37+24@813 - 25@814
+COMM [// check that exponentials with signs extracted properly during splitting]: lexertest1.js:39+1@816 - 74@889
+KEYW [var]: lexertest1.js:40+1@890 - 4@893
+WORD [num]: lexertest1.js:40+5@894 - 8@897
+PUNC [=]: lexertest1.js:40+9@898 - 10@899
+INTE [1000]: lexertest1.js:40+11@900 - 15@904
+PUNC [-]: lexertest1.js:40+15@904 - 16@905
+FLOA [1e+2]: lexertest1.js:40+16@905 - 20@909
+PUNC [*]: lexertest1.js:40+20@909 - 21@910
+INTE [2]: lexertest1.js:40+21@910 - 22@911
+PUNC [;]: lexertest1.js:40+22@911 - 23@912
+COMM [// check that dotted identifiers split, but decimal numbers not.]: lexertest1.js:42+1@914 - 65@978
+WORD [foo]: lexertest1.js:43+1@979 - 4@982
+PUNC [.]: lexertest1.js:43+4@982 - 5@983
+WORD [bar]: lexertest1.js:43+5@983 - 8@986
+PUNC [=]: lexertest1.js:43+9@987 - 10@988
+FLOA [4.0]: lexertest1.js:43+11@989 - 14@992
+PUNC [;]: lexertest1.js:43+14@992 - 15@993
+WORD [foo2]: lexertest1.js:44+1@994 - 5@998
+PUNC [.]: lexertest1.js:44+5@998 - 6@999
+WORD [bar]: lexertest1.js:44+6@999 - 9@1002
+PUNC [=]: lexertest1.js:44+10@1003 - 11@1004
+WORD [baz]: lexertest1.js:44+12@1005 - 15@1008
+PUNC [;]: lexertest1.js:44+15@1008 - 16@1009
+FLOA [.5]: lexertest1.js:46+1@1011 - 3@1013
+COMM [// a numeric token]: lexertest1.js:46+5@1015 - 23@1033
+COMM [// javascript does not have line continuations.]: lexertest1.js:48+1@1035 - 48@1082
+WORD [foo\]: lexertest1.js:49+1@1083 - 5@1087
+WORD [bar]: lexertest1.js:50+1@1088 - 4@1091
+PUNC [;]: lexertest1.js:50+4@1091 - 5@1092
+WORD [ellipsis]: lexertest1.js:52+1@1094 - 9@1102
+PUNC [...]: lexertest1.js:52+9@1102 - 12@1105
+PUNC [;]: lexertest1.js:52+12@1105 - 13@1106
+COMM [/* and extending the example at line 30 " interleaved */]: lexertest1.js:54+1@1108 - 57@1164
+STRI [" */\
+"]: lexertest1.js:54+58@1165 - 55+2@1172
+WORD [also]: lexertest1.js:55+2@1172 - 6@1176
+COMM [/* " /* */]: lexertest1.js:55+7@1177 - 17@1187
+COMM [// Backslashes in character sets do not end regexs.]: lexertest1.js:57+1@1189 - 52@1240
+WORD [r]: lexertest1.js:58+1@1241 - 2@1242
+PUNC [=]: lexertest1.js:58+3@1243 - 4@1244
+REGE [/./]: lexertest1.js:58+5@1245 - 8@1248
+PUNC [,]: lexertest1.js:58+8@1248 - 9@1249
+REGE [/\//]: lexertest1.js:58+10@1250 - 14@1254
+PUNC [,]: lexertest1.js:58+14@1254 - 15@1255
+REGE [/[/]/]: lexertest1.js:58+16@1256 - 21@1261
+PUNC [,]: lexertest1.js:58+21@1261 - 22@1262
+REGE [/[\/]\//]: lexertest1.js:58+23@1263 - 31@1271
+WORD [isNaN]: lexertest1.js:60+1@1273 - 6@1278
+PUNC [(]: lexertest1.js:60+6@1278 - 7@1279
+WORD [NaN]: lexertest1.js:60+7@1279 - 10@1282
+PUNC [)]: lexertest1.js:60+10@1282 - 11@1283
+PUNC [;]: lexertest1.js:60+11@1283 - 12@1284
+COMM [// leave some whitespace at the end of this file ]: lexertest1.js:62+1@1286 - 51@1336
=======================================

--- /trunk/tests/com/google/caja/lexer/lexertest1.js Thu Mar 13 21:49:35 2008 UTC
+++ /trunk/tests/com/google/caja/lexer/lexertest1.js Thu Oct 31 22:24:13 2013 UTC

@@ -16,11 +16,11 @@
 var s = "double quotes work inside strings too.\
 pretty well actually";

-// a line comment that oddly \
-spans multiple physical lines
+// a line comment that ends with a \
+'does not absorb the next line';

-/* multiline comments have
-   no need for such silliness */
+/* multiline comments also *\
+/ "don't care about backslash newline" */

 /*/ try and confuse the lexer
     with a star-slash before
@@ -45,14 +45,11 @@

 .5  // a numeric token

-// test how line continuations affect punctuation
-1+\
-+2;
-// should parse as 1 + + 2, not 1 ++ 2;
+// javascript does not have line continuations.
 foo\
 bar;

-elipsis...;
+ellipsis...;

 /* and extending the example at line 30 " interleaved */ " */\
 "also /* " /* */
=======================================

--- /trunk/tests/com/google/caja/parser/js/ParserTest.java Wed Aug 28 18:04:38 2013 UTC
+++ /trunk/tests/com/google/caja/parser/js/ParserTest.java Thu Oct 31 22:24:13 2013 UTC

@@ -149,6 +149,16 @@
         MessageType.UNRECOGNIZED_DIRECTIVE_IN_PROLOGUE,
         "parsertest10.js:52+3 - 15",
         MessagePart.Factory.valueOf("bogusburps"));
+    assertNextMessage(
+        msgs,
+        MessageType.UNRECOGNIZED_DIRECTIVE_IN_PROLOGUE,
+        "parsertest10.js:56+3 - 18",
+        MessagePart.Factory.valueOf("use\\x20strict"));
+    assertNextMessage(
+        msgs,
+        MessageType.UNRECOGNIZED_DIRECTIVE_IN_PROLOGUE,
+        "parsertest10.js:60+3 - 61+8",
+        MessagePart.Factory.valueOf("use \\\nstrict"));

     assertFalse(msgs.hasNext());
   }
@@ -211,9 +221,8 @@
       assertEquals(MessageType.EXPECTED_TOKEN,
                    ex.getCajaMessage().getMessageType());
     }
-    // But it should pass if there is a line-continuation
-    js(fromString("throw \\\n new Error()"));
   }
+
   public final void testCommaOperatorInReturn() throws Exception {
     Block bl = js(fromString("return 1  \n  , 2;"));
     assertTrue("" + mq.getMessages(), mq.getMessages().isEmpty());
=======================================

--- /trunk/tests/com/google/caja/parser/js/StringLiteralTest.java Thu Aug 8 19:36:28 2013 UTC
+++ /trunk/tests/com/google/caja/parser/js/StringLiteralTest.java Thu Oct 31 22:24:13 2013 UTC

@@ -33,6 +33,7 @@
     assertEquals("\"\"", StringLiteral.getUnquotedValueOf("'\\\"\\\"'"));
     assertEquals("foo\bar", StringLiteral.getUnquotedValueOf("'foo\\bar'"));
     assertEquals("foo\nbar", StringLiteral.getUnquotedValueOf("'foo\\nbar'"));
+    assertEquals("foobar", StringLiteral.getUnquotedValueOf("'foo\\\nbar'"));
     assertEquals("foo\\bar\\baz",
         StringLiteral.getUnquotedValueOf("'foo\\\\bar\\\\baz'"));
     assertEquals(
=======================================

--- /trunk/tests/com/google/caja/parser/js/parsergolden10.txt Sun Jun 3 21:36:01 2012 UTC
+++ /trunk/tests/com/google/caja/parser/js/parsergolden10.txt Thu Oct 31 22:24:13 2013 UTC

@@ -89,3 +89,18 @@
       Block
         DirectivePrologue
           Directive : bogusburps
+  FunctionDeclaration
+    Identifier : directiveCannotHaveEscape
+    FunctionConstructor
+      Identifier : directiveCannotHaveEscape
+      Block
+        DirectivePrologue
+          Directive : use\x20strict
+  FunctionDeclaration
+    Identifier : directiveCannotHaveContinuation
+    FunctionConstructor
+      Identifier : directiveCannotHaveContinuation
+      Block
+        DirectivePrologue
+          Directive : use \
+strict
=======================================

--- /trunk/tests/com/google/caja/parser/js/parsertest10.js Sun Jun 3 21:36:01 2012 UTC
+++ /trunk/tests/com/google/caja/parser/js/parsertest10.js Thu Oct 31 22:24:13 2013 UTC

@@ -51,3 +51,12 @@
 function malformedOkayWithWarning() {
   "bogusburps";
 }
+
+function directiveCannotHaveEscape() {
+  "use\x20strict";
+}
+
+function directiveCannotHaveContinuation() {
+  "use \
+strict";
+}
=======================================

--- /trunk/tests/com/google/caja/render/JsMinimalPrinterTest.java Thu Aug 8 19:36:28 2013 UTC
+++ /trunk/tests/com/google/caja/render/JsMinimalPrinterTest.java Thu Oct 31 22:24:13 2013 UTC

@@ -116,6 +116,22 @@
         + "a +  // Line comment\n"
         + "b;");
   }
+
+  public final void testBackslashNewline() throws Exception {
+    // Backslash newline is elided in strings.
+    assertRendered(
+        "{var x='string continuation'}",
+        "var x = 'string \\\ncontinuation'");
+    // Backslash newline is not special in comments.
+    assertRendered(
+        "{var x='noncontinuation'}",
+        "var x = // comment \\\n'noncontinuation'");
+    // Backslash newline elsewhere is a syntax error, and is rendered
+    // in a way that is still a syntax error.
+    assertLexed(
+        "illegal\\ continuation",
+        "illegal\\\ncontinuation");
+  }

   public final void testDivisionByRegex() throws Exception {
     assertLexed("3/ /foo/", "3 / /foo/;");
