Both the DAR book and the Javadoc (http://www.antlr.org/api/ActionScript/org/antlr/runtime/Lexer.html#emitToken()) mention that if you want to emit multiple tokens for a single lexer rule, you need to override emit() or emitToken(). Does anyone have any examples of doing that?
I assume nextToken() would also need to be overridden.

In case I have an XY Problem (http://www.perlmonks.org/index.pl?node_id=542341), my use case is to parse as in the following examples:

  23      -> DIGITS
  23,     -> DIGITS PUNC
  23,450  -> NUMERIC
  23,450, -> NUMERIC PUNC

To do that, I'm using a lexer rule that consumes all the digits and any punctuation permitted inside a number, and then I fix the result up afterwards:

-----------------------
token : ...
      | DIGITS
      | NUMERIC -> {fixNum($text)}
      | PUNC
      ;

PUNC : '-' | ',' | '.' ;

fragment DIGIT : '0'..'9' ;

NUMERIC : DIGIT (DIGIT | PUNC)*
          {if ($text.matches("^[0-9]+$")) {$type=DIGITS;}}
        ;
-----------------------

My fixNum() method tries to fix things up at the parser level, but I really want to do it in the lexer.

An alternative solution might be to "push back" any trailing punctuation onto the input stream. I'm not sure whether that's possible.

--
Ken Williams
Sr. Research Scientist
Thomson Reuters
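
For illustration, the idea behind overriding emit() and nextToken() can be sketched without the ANTLR runtime: emit() appends to a queue instead of overwriting a single pending token, and nextToken() drains that queue before matching more input. The following is a minimal stand-alone sketch under that assumption, not ANTLR's own API. The class QueueingLexerSketch, the Tok type, and the tokenize() helper are hypothetical names, and the hand-written matching loop only mimics the NUMERIC and PUNC rules from the post.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical stand-in for a lexer; it does not use or extend the ANTLR runtime. */
class QueueingLexerSketch {
    /** Minimal token: a type name plus the matched text. */
    static final class Tok {
        final String type;
        final String text;
        Tok(String type, String text) { this.type = type; this.text = text; }
        @Override public String toString() { return type + "(" + text + ")"; }
    }

    private final Deque<Tok> pending = new ArrayDeque<>(); // tokens emitted but not yet returned
    private final String input;
    private int pos = 0;

    QueueingLexerSketch(String input) { this.input = input; }

    private static boolean isDigit(char c) { return c >= '0' && c <= '9'; }
    private static boolean isPunc(char c) { return c == '-' || c == ',' || c == '.'; }

    /** Plays the role of an overridden emit(): queue the token instead of replacing the last one. */
    private void emit(String type, String text) { pending.add(new Tok(type, text)); }

    /** Plays the role of an overridden nextToken(): return queued tokens before matching again. */
    Tok nextToken() {
        while (pending.isEmpty()) {
            if (pos >= input.length()) return null; // end of input in this sketch
            matchOne();
        }
        return pending.poll();
    }

    /** Matches one "rule" at the current position; a numeric run may emit several tokens. */
    private void matchOne() {
        char c = input.charAt(pos);
        if (isDigit(c)) {
            int start = pos;
            while (pos < input.length()
                    && (isDigit(input.charAt(pos)) || isPunc(input.charAt(pos)))) {
                pos++;
            }
            int end = pos;
            // Back up over trailing punctuation; the run starts with a digit, so this stops.
            while (!isDigit(input.charAt(end - 1))) end--;
            String core = input.substring(start, end);
            emit(core.matches("[0-9]+") ? "DIGITS" : "NUMERIC", core);
            // Each trailing punctuation character becomes its own PUNC token.
            for (int i = end; i < pos; i++) emit("PUNC", String.valueOf(input.charAt(i)));
        } else if (isPunc(c)) {
            emit("PUNC", String.valueOf(c));
            pos++;
        } else {
            pos++; // anything else (e.g. whitespace) is skipped in this sketch
        }
    }

    /** Convenience helper: collect all tokens of a string as printable strings. */
    static List<String> tokenize(String s) {
        QueueingLexerSketch lexer = new QueueingLexerSketch(s);
        List<String> out = new ArrayList<>();
        for (Tok t = lexer.nextToken(); t != null; t = lexer.nextToken()) out.add(t.toString());
        return out;
    }
}
```

With this sketch, QueueingLexerSketch.tokenize("23,450,") yields NUMERIC(23,450) followed by PUNC(,), which matches the last example in the post. In a real ANTLR lexer the same queueing would presumably live in the lexer's members section, with the splitting done in the NUMERIC rule's action.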
