[il-antlr-interest: 28508] Re: [antlr-interest] Ambiguous lexing task

Cliff Hudson Fri, 02 Apr 2010 13:59:45 -0700

I've played around with it a bit, and I modified NAMECHAR to be:

fragment NAMECHAR
    : LETTER
    | DIGIT
    | '_'
    | {input.LA(2) != '>'}?=> '-'
    ;


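This seems to do the trick: the gated predicate lets NAMECHAR accept a '-'
only when the character after it is not '>', so ID stops just before '->'.
However, I'm concerned this is not a best practice for this kind of
situation.  Could I get a suggestion as to the "correct" way to go about this?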
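
To make the intent of the predicate concrete, here is a minimal hand-written
sketch of the same one-character lookahead outside ANTLR.  It is only an
illustration, not generated lexer code: the function name tokenize, the
LETTERS/DIGITS sets, and the whitespace skipping are all assumptions made for
the example.

```python
import string

# Hypothetical stand-ins for the LETTER and DIGIT fragments in the grammar.
LETTERS = set(string.ascii_letters)
DIGITS = set(string.digits)


def tokenize(text):
    """Split text into ('ID', ...) and ('OP_TRANSFORM', '->') tokens."""
    tokens = []
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if text.startswith("->", i):
            tokens.append(("OP_TRANSFORM", "->"))
            i += 2
        elif ch in LETTERS or ch == "_":
            start = i
            i += 1
            while i < n:
                c = text[i]
                if c in LETTERS or c in DIGITS or c == "_":
                    i += 1
                elif c == "-" and text[i + 1:i + 2] != ">":
                    # Same idea as {input.LA(2) != '>'}?=> : accept '-' only
                    # when the character after it is not '>'.
                    i += 1
                else:
                    break
            tokens.append(("ID", text[start:i]))
        elif ch.isspace():
            # Whitespace skipping is an assumption; the grammar shown has no WS rule.
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r} at offset {i}")
    return tokens


print(tokenize("my-identifier->foo"))
# [('ID', 'my-identifier'), ('OP_TRANSFORM', '->'), ('ID', 'foo')]
```

As in the predicated grammar, a '-' that is not followed by '>' (including a
trailing '-' at end of input) still ends up inside the ID.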

On Fri, Apr 2, 2010 at 1:48 PM, Cliff Hudson <[email protected]> wrote:

> I have a string which I need to parse for IDs and operators.  This is
> normally pretty easy, but there is one case where a character in the ID can
> also match one character in the operator.  The tokens are:
>
> OP_TRANSFORM : '->' ;
>
> ID : (LETTER | '_') (options { greedy=true; } : NAMECHAR)* ;
>
> fragment NAMECHAR : LETTER | DIGIT | '_' | '-' ;
>
> LETTER : 'a'..'z' | 'A'..'Z' ;
> DIGIT : '0'..'9' ;
>
>
> The issue is in parsing the following string:
>
> my-identifier->foo
>
> The ID token of course matches 'my-identifier-', and then I am left with an
> extraneous '>'.  Is there a way to construct a set of lexing rules, possibly
> with actions, that would correctly separate the '->' from the ID?  That is,
> whenever a '-' is immediately followed by '>', I want the pair to match
> OP_TRANSFORM rather than have the '-' absorbed into the ID.
>
> Thanks.
>
