Hello,
I have run into a strange problem using ANTLR and I wonder whether it is a bug
or not.
Here is part of my grammar:
WS
    : ' ' {$channel=HIDDEN;}
    ;
CUTLINE
    : ('\n' ' '* '+') {$channel=HIDDEN;}
    ;
NEWLINE
    : '\n'
    ;
And here is what ANTLR generates in the function mTokens:
static void
mTokens(pAntlrTestbenchLexer ctx)
{
    {
        // antlr/AntlrTestbench.g:1:8: ( T__10 | WS | CUTLINE | NEWLINE | ID | INT )
        ANTLR3_UINT32 alt4;
        alt4=6;
        switch ( LA(1) )
        {
        ...
        case '\n':
            {
                switch ( LA(2) )
                {
                case ' ':
                case '+':
                    {
                        alt4=3; //CUTLINE
                    }
                    break;
                default:
                    alt4=4; //NEWLINE
                }
            }
            break;
        ...
This is not what I want: when the lexer input is "\n " (a newline followed by
a space), I would expect it to recognize the lexemes NEWLINE and WS, but with
the code above it will try to recognize the lexeme CUTLINE and fail.
Once a '\n' has been recognized, the lexer should look ahead to the first
character that is not a ' '. If that character is a '+', the correct
alternative is the CUTLINE rule; otherwise, and only then, the correct
alternative is the NEWLINE rule.
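To make the expected decision concrete, here is a minimal hand-written sketch in C of the lookahead I have in mind. It is not ANTLR output and does not use the ANTLR C runtime; the names TokenKind and predict_after_newline are hypothetical, and the input is assumed to be a NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds, named after the lexer rules in the grammar. */
typedef enum { TOK_CUTLINE, TOK_NEWLINE } TokenKind;

/*
 * Decide which rule should match when input[pos] is '\n'.
 * Skips the run of spaces that follows the newline, then checks
 * whether the next character is '+'.
 * Assumption: input is NUL-terminated, so the scan stops at the end.
 */
static TokenKind
predict_after_newline(const char *input, size_t pos)
{
    size_t i;

    assert(input[pos] == '\n');   /* caller must be positioned on the newline */

    i = pos + 1;
    while (input[i] == ' ')       /* unbounded lookahead over ' '* */
        i++;

    /* Only a '+' after the optional spaces makes this a CUTLINE. */
    return input[i] == '+' ? TOK_CUTLINE : TOK_NEWLINE;
}
```

With this decision, the input "\n " is predicted as NEWLINE (leaving the space for WS), while "\n  +" is predicted as CUTLINE.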
The workaround I have found is to change the grammar this way:
NEWLINE
    : '\n' ' '*
    ;
With this change it works as I want, but I find it strange to have to resolve
the ambiguity this way.
So is the C code generated by ANTLR correct, or is this a bug?
Thanks,
Yann