[il-antlr-interest: 27268] Re: [antlr-interest] Lexer code not generated as expected?

Jim Idle Tue, 15 Dec 2009 11:58:45 -0800

Backtracking is for parsing. However it is really only useful when you can 
guarantee that the input will not have syntax errors or when the incoming 
language is so ambiguous that you really don't have any choice. Generally you 
want to left factor your grammar and remove ambiguities.


The difference between ANTLR and say flex is that flex is a set of patterns 
which it tries to match one after the other (simplistically speaking) so it 
will try the first pattern that might match, that fails so it moves on to the 
next pattern that might match and so on. ANTLR however lexes using LL(n) 
grammar and in its analysis it distinguishes the tokens and then generates 
enough code to send the match down the predicted path. It does not generate 
code that goes all the way to the 'end' of the token then if it matches, 
selects that token. So, with your rules it sees the first few characters and 
says 'that has to be this token'. If you want the match or move on type 
behavior, then you explicitly do that with the predicates. Ter has talked about 
changing the lexers to behave a bit more like 'people expect' but that is not 
there right now. It is to do with seeing beyond the end of the token. 

So when you first look you might feel it is complicated, but you are soon used 
to looking at it and it begins to make sense. Generally I recommend that you 
don't just take the easy path when writing recognizers. Counter intuitively, 
turning on things like global backtracking requires that you know MORE about 
constructing grammars rather than less as you are effectively turning off all 
the warnings about ambiguities that ANTLR would give you. This is fine in some 
senses but you are also turning off warnings about things that you didn’t 
realize were there/possible/encoded. Being a bit more explicit in the way you 
specify things usually results in a simpler/better grammar, in my experience.

Jim

> -----Original Message-----
> From: Peter Boughton [mailto:[email protected]]
> Sent: Tuesday, December 15, 2009 11:25 AM
> To: Jim Idle
> Cc: [email protected]
> Subject: Re: [antlr-interest] Lexer code not generated as expected?
> 
> > You have to be more specific with the lexer here if you want that
> kind of behavior
> 
> I don't get why?
> 
> 
> The ANTLR book mentions auto-backtracking (which seems to be what is
> *not* happening here), and which can be turned on with "options
> {backtrack=true;}"
> 
> Would that not solve this problem?
> If so, are there any issues (aside from performance) that might be a
> reason not to have this turned on?
> 
> It certainly seems a general approach of "take the easy path, then
> optimise proven slow parts, as required" is more favourable than
> adding this complexity.




List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: 
http://www.antlr.org/mailman/options/antlr-interest/your-email-address

--

You received this message because you are subscribed to the Google Groups 
"il-antlr-interest" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/il-antlr-interest?hl=en.

[il-antlr-interest: 27268] Re: [antlr-interest] Lexer code not generated as expected?

Reply via email to