On 02/03/2017 08:17, Michael Dyck wrote:
On 17-02-02 06:23 PM, Waldemar Horwat wrote:
Lookahead restrictions fit very well into an LR(1) engine
Again: Great, but how? E.g., do you pre-process the grammar, modify the
construction of the automaton, and/or modify the operation of the parser?
For each state × token combination, the automaton describes what happens when
you're in state S and see a token T. The lookahead restrictions remove
possible transitions; without them there would be ambiguities where a given
state × token combination would want to do two incompatible things.
That's different from parametrized rules, which simply macro-expand into lots
of individual rules.
as long as they're limited to only one token, and that's what I've
implemented in the validator.
So when the grammar has a prohibited lookahead-sequence with more than one
token (in ExpressionStatement, IterationStatement, and ExportDeclaration), does
the validator just use the first token?
My LR(1) validator can't actually handle the case of multiple tokens of
lookahead, and I didn't feel like doing an LR(2) grammar. I had to check these
by hand.
You need to be very careful with them if looking more than one token
ahead because lexing of the tokens can vary based on context. For
example, if the next few characters in front of the cursor are )/abc/i+,
then what is the second token? What is the third token? It's actually
context-dependent.
But the context-dependentness of lexing is a parse-time problem, not a
validation-time problem, right?
No.
The syntactic level can just assume a stream of (correctly lexed) input
elements.
No! It's impossible to create a stream of correctly lexed input elements
without doing syntactic level parsing. See the example I gave above.
Or do you validate deeper than the syntactic grammar?
Yes. The syntactic grammar controls the lexer. The validator looks for
problems such as the syntactic grammar giving the lexer contradictory
instructions. An example would be any syntactic grammar automaton state where
one outgoing syntactic transition would swallow a regexp token and another
outgoing syntactic transition from the same state would swallow a / token. If
any such state exists, the grammar is broken.
(And it seems to me that multi-token lookahead-restrictions would be hard for a
validator to handle even if lexing *wasn't* context-dependent, but maybe that
depends on how you handle them.)
-Michael
_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss