Re: Raw string literals and Unicode escapes

2018-02-14 Thread John Rose
On Feb 14, 2018, at 2:42 PM, Alex Buckley wrote: > > Also, the inclusion of RawSP makes the lexing of RawStringLiteral ambiguous, > since RawStringBody allows opening and closing whitespace. No doubt this can > be fixed with rules involving "If the first character after RawSP is a > backtick .

Re: Raw string literals and Unicode escapes

2018-02-14 Thread Alex Buckley
On 2/14/2018 1:48 PM, John Rose wrote: P.S. I posted another version that takes a slightly different tack on the restriction of "cannot begin with a backquote". It basically lifts the whole design of Markdown code quotes. http://cr.openjdk.java.net/~jrose/jls/raw-string-pages-v5.pdf The inclus

Re: Raw string literals and Unicode escapes

2018-02-14 Thread John Rose
On Feb 14, 2018, at 1:43 PM, Alex Buckley wrote: > > Strictly speaking, the semantic rule is unnecessary because InputCharacter is > DEFINED to exclude the CR and LF line terminators! But the semantic rule > makes the intent very very clear. Writing rules in this form also prevents > the spec

Re: Raw string literals and Unicode escapes

2018-02-14 Thread Alex Buckley
On 2/14/2018 12:42 PM, John Rose wrote: On Feb 14, 2018, at 12:24 PM, Alex Buckley wrote: There is plenty of precedent for semantic rules In my draft version this is done with "where" clauses on the grammar rules: RawStringLiteral: RawQuote RawStringBody

Re: Raw string literals and Unicode escapes

2018-02-14 Thread John Rose
On Feb 14, 2018, at 12:24 PM, Alex Buckley wrote: > > There is plenty of precedent for semantic rules In my draft version this is done with "where" clauses on the grammar rules: > > RawStringLiteral: > > RawQuote RawStringBody RawQuote > where the two raw-quotes are constrained to be ide

Re: Raw string literals and Unicode escapes

2018-02-14 Thread Alex Buckley
On 2/13/2018 2:19 PM, Jim Laskey wrote: 10a. String s = `abc`; 10b. String s = \u0060abc`; ... So, change the scanner to A) Peek back to make sure the first open backtick was exactly a backtick. B) Turn off Unicode escapes immediately so that only backtick characters can be part of the delimiter
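
The two scanner changes quoted above, (A) and (B), can be illustrated with a minimal sketch. It is not the javac scanner and not necessarily what the thread settled on. It assumes the delimiter is a run of one or more literal backticks and that the closing run must have the same length as the opening run (the "identical raw-quotes" constraint mentioned elsewhere in the thread). The class name `RawStringSketch` and method `scanRawString` are hypothetical.

```java
import java.util.Optional;

/** Hypothetical sketch of raw-string delimiter scanning; not the javac scanner. */
public final class RawStringSketch {
    private RawStringSketch() {}

    /**
     * Tries to read a raw string literal that starts at index 0 of src.
     * The input is treated as raw characters: no Unicode escape translation
     * is applied anywhere in this method.
     *
     * @return the literal's body, or empty if src does not open with a
     *         literal backtick or the literal is never closed
     */
    public static Optional<String> scanRawString(String src) {
        // (A) The opening delimiter must consist of actual backtick characters.
        // An escaped form of U+0060 begins with a backslash, so it fails here.
        int open = 0;
        while (open < src.length() && src.charAt(open) == '`') {
            open++;
        }
        if (open == 0) {
            return Optional.empty();
        }

        // (B) From here on only real backticks can form the closing delimiter;
        // escape sequences in the body are left exactly as written.
        int i = open;
        while (i < src.length()) {
            if (src.charAt(i) != '`') {
                i++;
                continue;
            }
            int runStart = i;
            while (i < src.length() && src.charAt(i) == '`') {
                i++;
            }
            // Assumption: the closing run must match the opening run's length.
            if (i - runStart == open) {
                return Optional.of(src.substring(open, runStart));
            }
        }
        return Optional.empty(); // unterminated literal
    }
}
```

Under these assumptions, case 10a (a literal opening backtick) yields the body `abc`, while case 10b (opening with the escape \u0060) is not recognized as a raw string literal at all. An escape inside the body is kept verbatim as six characters.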

Re: Raw string literals and Unicode escapes

2018-02-14 Thread Alex Buckley
On 2/13/2018 2:11 PM, John Rose wrote: On Feb 13, 2018, at 9:58 AM, Alex Buckley wrote: I suspect the trickiest part of specifying raw string literals will be the lexer's modal behavior for Unicode escapes. As such, I am going to put the behavior under the micro