Hi John!
On Fri, 2010-11-26 at 21:38 -0500, John B. Brodie wrote:
> i suggest something like this (untested):
>
> ID : ID_HEAD ( ' '* ID_TAIL )* ;
> fragment ID_HEAD : LETTER ;
> fragment ID_TAIL : LETTER | DIGIT | '_' ;
That is what I tried. I just reduced it to the bare minimum to
demonstrate the problem.
> > - Why is the lexer for test2 only using a 1 character lookahead?
>
> because that is all that is needed to disambiguate the situation. recall
> that the lexer operates without any knowledge of parsing context. so, to
> the lexer, (assuming a rule like ID:LETTER(' '|LETTER)*) "a " is clearly
> an ID and not an 'a' followed by ' '.
I tend to disagree: For the input
iden tifier =
the first space is a continuation of the ID, but the second space
is just whitespace to be ignored. To distinguish between the two cases
the lexer would have to look past the spaces.
I know that the LL(*) parser / lexer engine is capable of doing that;
however, ANTLR chooses to generate only a 1-character lookahead here.
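To make the difference concrete, here is a small hand-written Python sketch. It is not ANTLR output and does not reproduce the generated code; the function names are made up for illustration. The first function models the rule from your explanation, ID:LETTER(' '|LETTER)*, deciding on one character at a time. The second only accepts a run of spaces if a letter follows it.

```python
def scan_id_one_char(text, pos=0):
    """Model of ID : LETTER (' ' | LETTER)* decided one character at a time.

    Hypothetical helper: every space or letter is consumed as soon as it
    is seen, so trailing spaces end up inside the identifier.
    """
    if pos >= len(text) or not text[pos].isalpha():
        return None
    end = pos + 1
    while end < len(text) and (text[end] == ' ' or text[end].isalpha()):
        end += 1
    return text[pos:end]


def scan_id_look_past_spaces(text, pos=0):
    """Consume a run of spaces only if a letter follows the run.

    Hypothetical helper: this is the "look past the spaces" behaviour,
    written by hand rather than derived from any generated lexer.
    """
    if pos >= len(text) or not text[pos].isalpha():
        return None
    end = pos + 1
    while end < len(text):
        if text[end].isalpha():
            end += 1
            continue
        # Probe past the spaces before committing to them.
        probe = end
        while probe < len(text) and text[probe] == ' ':
            probe += 1
        if probe > end and probe < len(text) and text[probe].isalpha():
            end = probe + 1  # the spaces were inside the identifier
        else:
            break  # trailing spaces (or another character) end the ID
    return text[pos:end]


source = "iden tifier ="
print(repr(scan_id_one_char(source)))
print(repr(scan_id_look_past_spaces(source)))
```

For the input above, the first scan keeps the trailing space in the identifier, while the second stops right after "tifier" and leaves that space to be skipped as whitespace.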
I have temporarily worked around the problem by manually editing
the lookahead logic in the generated code.
Michael
PS: Your explanation of lexer rule priorities was helpful.