[il-antlr-interest: 30682] Re: [antlr-interest] Identifiers with Spaces

Michael Bosch Mon, 29 Nov 2010 14:24:32 -0800

Hi John!

On Fri, 2010-11-26 at 21:38 -0500, John B. Brodie wrote:
> i suggest something like this (untested):
> 
> ID : ID_HEAD ( ' '* ID_TAIL )* ;
> fragment ID_HEAD : LETTER ;
> fragment ID_TAIL : LETTER | DIGIT | '_' ;


That is what I tried.  I just reduced it to the bare minimum to
demonstrate the problem.

> > - Why is the lexer for test2 only using a 1 character lookahead?
> 
> because that is all that is needed to disambiguate the situation. recall
> that the lexer operates without any knowledge of parsing context. so, to
> the lexer, (assuming a rule like ID:LETTER(' '|LETTER)*) "a " is clearly
> an ID and not an 'a' followed by ' '.

I tend to disagree. For the input
  iden tifier =
the first space is a continuation of the ID, but the second space
(the one before the '=') is just whitespace to be ignored.  To distinguish
between the two cases the lexer would have to look past the spaces.
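
As an illustration only, here is a minimal hand-written Python sketch of the
scan described above. It is not ANTLR-generated code and not the patched
lookahead mentioned in this message. The function name scan_identifier and the
ASCII-only definitions of LETTER and DIGIT are assumptions made for the
example. A run of spaces is absorbed into the identifier only if the character
after the run can continue it (the ID_TAIL set from the quoted grammar).

```python
import string

# Assumed character classes; the thread does not define LETTER or DIGIT.
LETTERS = set(string.ascii_letters)
TAIL_CHARS = LETTERS | set(string.digits) | {"_"}


def scan_identifier(text, pos=0):
    """Scan one identifier starting at pos.

    Returns (identifier, end_index), or (None, pos) if no identifier
    starts at pos. Spaces are kept inside the identifier only when a
    tail character follows the run of spaces.
    """
    n = len(text)
    if pos >= n or text[pos] not in LETTERS:
        return None, pos

    end = pos + 1
    while end < n:
        ch = text[end]
        if ch in TAIL_CHARS:
            end += 1
            continue
        if ch == " ":
            # Arbitrary lookahead: skip the whole run of spaces and
            # inspect the first character after it.
            k = end
            while k < n and text[k] == " ":
                k += 1
            if k < n and text[k] in TAIL_CHARS:
                # The spaces are interior to the identifier.
                end = k + 1
                continue
        # Anything else, including trailing spaces, ends the identifier.
        break
    return text[pos:end], end


if __name__ == "__main__":
    print(scan_identifier("iden tifier ="))
```

With one character of lookahead, the decision at a space cannot depend on what
follows the run of spaces, which is the behaviour I am objecting to.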

I know that the LL(*) parser / lexer engine is capable of doing that;
however, ANTLR chooses to generate only a 1-character lookahead here.
I have temporarily worked around the problem by manually changing
the lookahead code in the generated code.

Michael

PS: Your explanation of lexer rule priorities was helpful.


