Re: LR(1) grammar/parser and lookahead-restrictions

Isiah Meadows Thu, 12 Jan 2017 12:55:53 -0800

Okay. I see now. Lots of backtracking in the case of arrow functions and
ternary expressions vs type annotations.


On Thu, Jan 12, 2017, 00:45 Dmitry Soshnikov <[email protected]>
wrote:

> The "early errors" are just parser errors which are enforced not by
> grammar rules, but by additional validation algorithms which follow the
> parsed node. E.g. `eval` cannot be assigned in strict mode: it's not a
> grammar issue (grammar allows `eval` on LHS), and it's not a runtime issue,
> it's an "early error" enforced by static semantics, which still issues it
> as a `ParseError`. Example in handling it in a semantic action for a
> production (on a simplified `AssignmentExpression`):
>
> ```
> %%
>
> AssignmentExpression
>   : IDENTIFIER '=' Expression {
>       const id = $1;
>
>       // Static semantics: early parser error:
>
>       if (mode.isStrict && (id === 'eval' || id == 'arguments')) {
>         throw new ParseError(
>           '`eval` and `arguments` cannot be assigned in strict mode.'
>         );
>       }
>
>       return {
>         type: 'AssigmentExpression',
>         left: $1,
>         right: $3,
>       };
>     }
> ```
>
> Similarly other "early errors".
>
> BTW, here is the example I mentioned above of how Flow has to do
> backtracking in parsing in some edge cases, making the grammar completely
> not LR(1) anymore:
> https://github.com/babel/babel/issues/1991#issuecomment-121708242
>
> Dmitry
>
>
> On Wed, Jan 11, 2017 at 7:13 PM, Isiah Meadows <[email protected]>
> wrote:
>
> Okay. I see now. How do early errors factor into this?
>
> On Wed, Jan 11, 2017, 21:49 Dmitry Soshnikov <[email protected]>
> wrote:
>
> Well, it's still context-free (contex-sensitive may have terminals on LHS
> too, JS cannot). It requires extra cover grammar, and additional static
> semantics, but still potentially can be parsed by a LALR/CLR parser - yes,
> you'll have to do parsed nodes reinterpretation in semantics handlers (e.g.
> validate and transform parsed ObjectLiteral into an ObjectPattern when
> parsing destructuring), but should be doable in general. It won't be doable
> if lookaheafs won't be enough, and only a backtracking can recover (like
> the cases with Flow parser).
>
> Dmitry
>
>
> On Wed, Jan 11, 2017 at 6:09 PM Isiah Meadows <[email protected]>
> wrote:
>
> Heuristically, I doubt it's even context-free at this point, considering
> the concept and widespread prevalence of early errors now. I suspect it's
> mildly context-sensitive (maybe tree-adjoining?), but I'm no formal
> language expert here.
>
>
>
> On Wed, Jan 11, 2017, 14:23 Dmitry Soshnikov <[email protected]>
> wrote:
>
> One yet has to check the pure ES6+ grammar to be LR(1) compatible (the
> Cover grammar with nodes reinterpretations makes it harder), but for a very
> practical use-case, e.g. combined with Flow, it'll not be easy to have an
> automated parser (since Flow's parser uses backtracking for some edge cases
> which is already not LR-parser).
>
> I have some subset of ES6 working as an example language for Syntax
> tool[1] (since I just took a bunch of productions from JS), but I haven't
> tried the full ES6 grammar (not having time to work on this project, since
> it'll be not a straightforward process of just matching the grammar 1:1 to
> the parser generator -- based on the above reasons).
>
> Long time ago I also had a brief discussion with authors of another
> popular parser generators for JS. Both didn't try it as well, but mentioned
> that even for ES5 it was hard to parse e.g. regexp, etc.
>
> Someone (with enough time for the project) should just sit, and spend some
> time actually porting the grammar from the spec on some powerful parsers
> generator.
>
> [1] Syntax: https://github.com/DmitrySoshnikov/syntax
> [2] https://twitter.com/DmitrySoshnikov/status/666327209959751680
>
> On Wed, Jan 11, 2017 at 10:28 AM, Michael Dyck <[email protected]> wrote:
>
> In the past, it has been said (usually by Brendan Eich) that TC39 intends
> that the ECMAScript grammar be LR(1). Is that still the case? (I'm not so
> much asking about the "1", but more about the "LR".)
>
>
>
>
>
> If so, I'm wondering how lookahead-restrictions (e.g., [lookahead <!
> terminals]) fit into the LR approach. It seems like there are three
> possibilities:
>
>
>   - modify the grammar (before the parser is constructed),
>
>
>   - modify the construction of the parser, and/or
>
>
>   - modify the operation of the parser.
>
>
> It's not clear to me that any of these (or any combination of these) will
> work.
>
>
>
>
>
> Has anyone algorithmically generated an LR parser from the spec grammar?
> If so, how did you you handle lookahead-restrictions?
>
>
>
>
>
> -Michael
>
>
> _______________________________________________
>
>
> es-discuss mailing list
>
>
> [email protected]
>
>
> https://mail.mozilla.org/listinfo/es-discuss
>
>
>
>
>
> _______________________________________________
>
>
> es-discuss mailing list
>
>
> [email protected]
>
>
> https://mail.mozilla.org/listinfo/es-discuss
>
>
>
>
>
>

_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss

Re: LR(1) grammar/parser and lookahead-restrictions

Reply via email to