Yeah, I should probably be using parser rules. I was trying to keep things
"simple" by presenting everything to the Java caller as a linear stream of
tokens, while still having high-level constructs like DATE.
Perhaps what I really want is something like this:
-------------------
options {
    backtrack=true;
    memoize=true;
    output=AST;
}
tokens {
    DATE;
}
cite  : token+ EOF ;
token : date -> DATE | SLASH | DIGITS ;
date  : DIGITS SLASH DIGITS SLASH DIGITS ;
SLASH  : '/' ;
DIGITS : ('0'..'9')+ ;
WS     : ( ' ' | '\t' | '\f' | '\n' | '\r' ) {skip();} ;
-------------------
The only thing missing now is the character data for DATE, i.e. the text
that the 'date' rule actually matched. Is there a way to change that 'token'
rule to something like this?
token : date -> {new CommonToken(DATE, $text)} | SLASH | DIGITS;
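My untested guess, from what I remember of the ANTLR 3 rewrite syntax, is
that an imaginary token can be given its own text with DATE[...], and that
$date.text holds the text matched by the 'date' rule. I haven't confirmed
that the two combine the way I want here, but it would look something like
this:
-------------------
token : date -> DATE[$date.text]   // assumption: DATE node carrying the matched date text
      | SLASH
      | DIGITS
      ;
-------------------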
I appreciate all the help.
On 6/2/10 4:41 PM, "Jim Idle" <[email protected]> wrote:
> This isn't left factored, it is doing the lookahead that you require to
> distinguish the two. In v4 this will be different, but for now, this is what
> you will need to do.
>
> Or, don't try to do it in the lexer at all and construct parser rules for it.
--
Ken Williams
Sr. Research Scientist
Thomson Reuters
Phone: 651-848-7712
[email protected]