Thanks for your answers, I now understand the strategy of lexers. The left factoring you propose does not work any better: because the identifier following the minus sign begins with the letter 'F', the problem remains the same in the example '-FOO -FIN-'!
~/Soft/Antlr/LexJava: java Main test
line 1:2 mismatched character 'O' expecting 'I'
--> [...@-1,3:3='O',<6>,1:3]
--> [...@-1,4:4='\n',<7>,channel=99,1:4]
--> [...@-1,5:9='-FIN-',<5>,2:0]
--> [...@-1,10:30=' \n',<7>,channel=99,2:5]

Jean-Claude Durand
LIG, équipe GETALP
385, rue de la Bibliothèque
BP 53
38041 Grenoble cedex 9
France
[email protected]
tel: +33 (0)4 76 51 43 81
fax: +33 (0)4 76 63 56 86

On 14 Dec 2009 at 19:35, John B. Brodie wrote:

> Greetings!
> On Mon, 2009-12-14 at 19:18 +0100, Jean-Claude Durand wrote:
>> My lexical grammar (I use antlr v3.2):
>>
>> lexer grammar Lex;
>> options
>> { language=Java; }
>>
>> WS: ( ' ' | '\t' | '\n' )+ { $channel=HIDDEN; } ;
>>
>> FIN : '-FIN-' ;
>> Moins : '-' ;
>>
>> // Identifiers:
>> Idf : ('A'..'Z')+ ;
>>
>> I want to enumerate the tokens for the following example (Main.java
>> is in the archive):
>>
>> VLEG-XLEG-FCINFZU
>>
>> And the output is:
>>
>> ~/Soft/Antlr/LexJava: java Main test
>> --> [...@-1,0:3='VLEG',<7>,1:0]
>> --> [...@-1,4:4='-',<6>,1:4]
>> --> [...@-1,5:8='XLEG',<7>,1:5]
>> line 1:11 mismatched character 'C' expecting 'I'
>> --> [...@-1,12:16='INFZU',<7>,1:12]
>> --> [...@-1,17:36=' ',<4>,channel=99,1:17]
>> ~/Soft/Antlr/LexJava:
>>
>> The lexer is looking for the keyword -FIN- and not for minus sign
>> followed by an identifier (which begins with an F).
>
> This is a well-known "feature" of ANTLR lexers. that once it sees the
> left prefix of a token it commits itself to only that token and will
> not backup and consider other possibilities.
>
> you need to left factor your FIN and Moins rules. Something like the
> following (off the top of my head, untested, but gives the general
> idea):
>
> lexer grammar Lex;
> options { language=Java; }
> tokens { FIN; }
>
> WS: ( ' ' | '\t' | '\n' )+ { $channel=HIDDEN; } ;
>
> Moins : '-' ( 'FIN-' { $type = FIN; } )?;
>
> // Identifiers:
> Idf : ('A'..'Z')+ ;
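
To make the behaviour I am after concrete, here is a small hand-written Java sketch. It is not ANTLR-generated code, and the class name LexSketch, the method tokenize and the "TYPE(text)" rendering of tokens are all made up for this illustration. The point it shows is only this: the lexer should commit to FIN after it has seen the whole of '-FIN-', and otherwise fall back to a plain minus sign followed by an identifier.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written sketch, NOT what ANTLR generates.
// It mirrors the token names of the grammar: WS (skipped), FIN, Moins, Idf.
final class LexSketch {
    private static final String FIN = "-FIN-";

    // Returns the tokens rendered as "TYPE(text)"; whitespace is dropped,
    // in the same spirit as sending WS to the hidden channel.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == ' ' || c == '\t' || c == '\n') {
                i++; // skip whitespace
            } else if (input.startsWith(FIN, i)) {
                // Commit to FIN only once the complete keyword is present.
                tokens.add("FIN(" + FIN + ")");
                i += FIN.length();
            } else if (c == '-') {
                // Not the keyword: a plain minus sign.
                tokens.add("Moins(-)");
                i++;
            } else if (c >= 'A' && c <= 'Z') {
                int start = i;
                while (i < input.length()
                        && input.charAt(i) >= 'A' && input.charAt(i) <= 'Z') {
                    i++;
                }
                tokens.add("Idf(" + input.substring(start, i) + ")");
            } else {
                throw new IllegalArgumentException(
                        "unexpected character '" + c + "' at offset " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [Moins(-), Idf(FOO), FIN(-FIN-)]
        System.out.println(tokenize("-FOO\n-FIN-"));
        // prints [Idf(VLEG), Moins(-), Idf(XLEG), Moins(-), Idf(FCINFZU)]
        System.out.println(tokenize("VLEG-XLEG-FCINFZU"));
    }
}
```

The check on the full keyword before committing is the whole trick. I have not verified how best to express the same lookahead in an ANTLR 3.2 grammar.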
