Re: LR(1) grammar/parser and lookahead-restrictions

Michael Dyck Mon, 30 Jan 2017 14:23:33 -0800

On 17-01-30 03:00 PM, Dmitry Soshnikov wrote:


On Mon, Jan 30, 2017 at 9:26 AM, Michael Dyck <[email protected]
<mailto:[email protected]>> wrote:

    On 17-01-30 12:15 AM, Dmitry Soshnikov wrote:


        As mentioned above, there are two ways to handle lookahead
        restrictions in
        LR-parser:

        1. Using "No<Lookahead>" productions.

    So, e.g., one could take the production
        IterationStatement : ...  [lookahead != 'let'] LHSExpr ...
    and transform it into
        IterationStatement : ... LHSExpr-that-does-not-start-with-let ...
    and then generate productions for the new nonterminal, e.g.
        LHSExpr-that-does-not-start-with-let :
            New-Expression-that-does-not-start-with-let
            Call-Expression-that-does-not-start-with-let
    etc, (eventually bottoming out when a RHS of the original grammar will
    either always or never start with 'let').

Correct, and this is how ES spec itself handles all those "No<lookahead>"
duplicated productions.

If you're talking about the 'NoIn' productions in ES5, that seems onlymarginally relevant, since they don't address a lookahead problem (i.e., aproblem that could have been solved via a lookahead-restriction).

    The problem with this approach is that, in general, a
    lookahead-restriction is not a restriction on the nonterminal that
    immediately follows it in the production, it's a restriction on the next
    few input elements, which might be generated by more than just that
    nonterminal.


Well, in this case it's not a "lookahead" anymore.

It isn't a "lookahead" in the resulting grammar, but it is in the original,and the two grammars should define the same language. What I'm saying isthat it's incorrect to assume that the effect of the lookahead (in theoriginal) can be confined to the nonterminal that immediately follows, whichis what the suggested transformation does.

You could have k=2,3,...N for lookahead in parser generators, but it's
not like parse the whole non-terminal, and then see a lookahead after it.
This "parse whole non-terminal till the end" might be parse the whole
program.


I don't really follow what you're suggesting there.

In this case a cover grammar is simpler, and that's why it's used
starting since ES6 spec.

But note that none of the ES5 lookahead-restrictions was replaced with acover grammar in ES6. They're all still lookahead-restrictions.


    I'm not saying this is insurmountable, but it does seem to get messy
    fairly quickly.

    And it's unclear how this approach would handle a lookahead-restriction
    at the end of a production.


Yes, it adds a bunch of duplicated productions to the grammar, which on
practice may increases number of parsing states simnifically. As a real
practical example: ES5 spec in LALR(1) mode would increase from ~350 states
to 470.

By "messy", I wasn't talking about the duplicated productions, but ratherthe process of generating them.

        2. Another approach is to use Cover grammar.


    Is there a way to automatically (a) construct the cover grammar and (b)
    reconstruct the parse tree as if parsed by the original grammar, or do
    you code it manually?


Usually it's a manual process of designing a grammar, and defining
post-processing static semantics.

Okay, I'm not really interested then. I'm looking for an algorithm (i.e., amodification of the LR algorithm) that starts with agrammar-with-lookahead-restrictions and ends with an LR parser.

        P.S.: there is actually option #3 -- to use "S-attributed" parser
        actions,
        it's when you may execute handling action not only at the end of
        production
        (i.e. not only on "reduce"), but after each element on RHS:


    I'm doubtful this would help. I wouldn't expect the LR-parser-generator
    to 'understand' the content of actions, so it can't use them to alter
    the generation of the parser, and so the LR automaton will have lots of
    conflicts (because it can't 'separate' the alternatives that the
    lookahead-restrictions are there to separate).


Oh, it'll work for sure, depending on how powerful a parser generator is.
Yes, technically it'll be a "reduce-reduce" conflict in the table,

So when your generated parser has conflicts, is it easy to tell which youdon't have to worry about and which you do?

but parser will never enter this state based on the semantic action executed
*before* entering some non-terminal (not after). In this "before" action you
can communicated to lexer, and check the lookahead.

It seems to me that the parser *will* enter such a state, but always viaonly one of the conflicting 'paths' (the other(s) being prevented atparse-time by the semantic actions). So only one of the conflicting actionswill be valid, but you've still got to figure out which one that is.


-Michael
_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss

Re: LR(1) grammar/parser and lookahead-restrictions

Reply via email to