Yeah, I should probably be using parser rules.  I was trying to keep things
"simple" by presenting everything to the Java caller as a linear stream of
tokens, while still having high-level constructs like DATE.

Perhaps what I really want is something like this:

-------------------
options {
    backtrack=true;
    memoize=true;
    output=AST;
}

tokens {
    DATE;
}

cite    :    token+ EOF ;
token   :    date -> DATE | SLASH | DIGITS;
date    :    DIGITS SLASH DIGITS SLASH DIGITS ;

SLASH   :    '/' ;
DIGITS  :    ('0'..'9')+ ;
WS      :    ( ' ' | '\t' | '\f' | '\n' | '\r' ) {skip();} ;
-------------------


The only thing missing now is the character data for DATE: the rewrite
produces a DATE node, but not the text that was matched.  Is there a way to
change that 'token' rule to something like this?

token   :    date -> {new CommonToken(DATE, $text)} | SLASH | DIGITS;
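
For comparison, ANTLR 3 rewrite rules can attach text to an imaginary token
using the TOKEN[...] form, so a sketch along these lines may be closer to what
is needed.  It assumes output=AST as in the grammar above and that $date.text
yields the matched input; it is untested against this grammar:

-------------------
// Sketch only: DATE[...] asks the tree adaptor for a DATE node whose text is
// whatever the 'date' rule matched; SLASH and DIGITS pass through unchanged.
token   :    date -> DATE[$date.text]
        |    SLASH
        |    DIGITS
        ;
-------------------

A variant, DATE[$date.start, $date.text], should also copy the line and
position information from the first token matched by 'date'.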


I appreciate all the help.



On 6/2/10 4:41 PM, "Jim Idle" <[email protected]> wrote:

> This isn't left factored, it is doing the lookahead that you require to
> distinguish the two. In v4 this will be different, but for now, this is what
> you will need to do.
> 
> Or, don't try to do it in the lexer at all and construct parser rules for it.

-- 
Ken Williams
Sr. Research Scientist
Thomson Reuters
Phone: 651-848-7712
[email protected]


