Refs:
http://www.antlr.org:8888/browse/ANTLR-225
http://www.antlr.org:8888/browse/ANTLR-233
If you have
NUM : INT | FLOAT ;
you probably want the INT or FLOAT type, but if you have
STRING : '"' (ESC|.)* '"' ;
you don't want the ESC type; you want STRING. Currently, calling ESC and
setting $type only sets a local (_type). At the end of each rule I do:
state.type = _type;
so STRING resets state.type to STRING at the end, even though ESC
temporarily sets it. We have to decide how this should behave. Jim
is proposing that $type is really
( ruleNestingLevel == 0 ? state.type : _type )
I.e., if we're in an invoked rule, don't set the global token type. Makes
sense to me, but it requires a tweak in your runtime templates. :(
For example, my template looks like:
lexerRulePropertyRef_type(scope,attr) ::= "_type"
in contrast to $channel:
lexerRulePropertyRef_channel(scope,attr) ::= "state.channel"
I get bitten by this. Call WS from a rule:
X : ID WS? '=' ID ;
WS : ' '+ {$channel = HIDDEN; } ;
and then X is HIDDEN! :(
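To make the leak concrete, here is a rough Java simulation of what happens when $channel expands to the shared state field. This is only a sketch, not generated or runtime code: the class name ChannelLeakDemo, the helper channelOfX, the sawWhitespace flag, and the constant values are all made up for illustration, and character matching is elided.

```java
// Hypothetical simulation; not ANTLR runtime or generated code.
class ChannelLeakDemo {
    // Placeholder channel numbers chosen for this sketch.
    static final int DEFAULT_CHANNEL = 0;
    static final int HIDDEN = 99;

    // Stand-in for the shared recognizer state; only the channel is modeled.
    static int stateChannel = DEFAULT_CHANNEL;

    // WS : ' '+ {$channel = HIDDEN; } ;
    // $channel expands to the shared field, so the write is global.
    static void mWS() {
        stateChannel = HIDDEN;
    }

    // X : ID WS? '=' ID ;
    // Matching is elided; sawWhitespace says whether the optional WS was taken.
    static void mX(boolean sawWhitespace) {
        if (sawWhitespace) {
            mWS();
        }
    }

    // Resets the shared state, runs X as the top-level rule, and returns
    // the channel the emitted token would carry.
    static int channelOfX(boolean sawWhitespace) {
        stateChannel = DEFAULT_CHANNEL;
        mX(sawWhitespace);
        return stateChannel;
    }

    public static void main(String[] args) {
        System.out.println(channelOfX(false)); // prints 0
        System.out.println(channelOfX(true));  // prints 99
    }
}
```

As soon as the optional WS is matched, the X token inherits the hidden channel, which is the surprise described above.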
But if we do the rule nesting thing, this doesn't work:
NUM : INT | FLOAT ;
You always get NUM. You would have to do:
NUM : INT {$type=INT;} | FLOAT {$type=FLOAT;} ;
If we use:
( ruleNestingLevel == 0 ? state.type : _type )
then we need to detect assignments and call a
setLexerRulePropertyRef_type template rather than
lexerRulePropertyRef_type. We already have this idea:
ruleSetPropertyRef_tree(scope,attr,expr) ::= "retval.tree =<expr>;"
So, we kind of need to fix this. I think we should use the rule nesting level.
Valid examples then:
X : ID WS? '=' ID ; // result is X on normal channel
WS : ' '+ {$channel = HIDDEN; } ;
NUM : INT {$type=INT;} | FLOAT {$type=FLOAT;} ;
STRING : '"' (ESC|.)* '"' ; // result is STRING not ESC
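A rough Java simulation of how these four rules might behave under the nesting-level idea follows. It rests on several assumptions that are mine, not settled design: the rule picked by nextToken runs at nesting level 0, writes to $type and $channel made at a deeper level land in a discarded local, and the top-level rule seeds the token type on entry. The class name NestingLevelDemo, the helpers enterRule, setType, setChannel, call, and emit, and the token type numbers are all hypothetical, and matching is elided.

```java
// Hypothetical simulation of the proposal; not ANTLR runtime or generated code.
class NestingLevelDemo {
    // Placeholder channel and token type numbers chosen for this sketch.
    static final int DEFAULT_CHANNEL = 0, HIDDEN = 99;
    static final int X = 4, WS = 5, NUM = 6, INT = 7, FLOAT = 8, STRING = 9, ESC = 10;

    // Stand-ins for state.type, state.channel, and the nesting counter.
    static int stateType, stateChannel, ruleNestingLevel;

    // Assumed rule prologue: only the top-level rule seeds the token type.
    static void enterRule(int ruleType) {
        if (ruleNestingLevel == 0) stateType = ruleType;
    }

    // Assumed expansions of "$type = t" and "$channel = c": at level 0 the
    // shared state is written; deeper, the write would hit a local and is dropped.
    static void setType(int t) {
        if (ruleNestingLevel == 0) stateType = t;
    }

    static void setChannel(int c) {
        if (ruleNestingLevel == 0) stateChannel = c;
    }

    // Invokes another lexer rule one nesting level deeper.
    static void call(Runnable rule) {
        ruleNestingLevel++;
        try {
            rule.run();
        } finally {
            ruleNestingLevel--;
        }
    }

    static void mWS()    { enterRule(WS); setChannel(HIDDEN); }
    static void mESC()   { enterRule(ESC); }
    static void mINT()   { enterRule(INT); }
    static void mFLOAT() { enterRule(FLOAT); }

    // X : ID WS? '=' ID ;   (only the WS call is modeled)
    static void mX() { enterRule(X); call(NestingLevelDemo::mWS); }

    // STRING : '"' (ESC|.)* '"' ;   (only the ESC call is modeled)
    static void mSTRING() { enterRule(STRING); call(NestingLevelDemo::mESC); }

    // NUM : INT {$type=INT;} | FLOAT {$type=FLOAT;} ;
    static void mNUM(boolean isFloat) {
        enterRule(NUM);
        if (isFloat) {
            call(NestingLevelDemo::mFLOAT);
            setType(FLOAT);
        } else {
            call(NestingLevelDemo::mINT);
            setType(INT);
        }
    }

    // Runs one rule as the top-level rule and returns {type, channel}.
    static int[] emit(Runnable topLevelRule) {
        stateType = 0;
        stateChannel = DEFAULT_CHANNEL;
        ruleNestingLevel = 0;
        topLevelRule.run();
        return new int[] { stateType, stateChannel };
    }

    public static void main(String[] args) {
        int[] x = emit(NestingLevelDemo::mX);
        System.out.println("X:      type=" + x[0] + " channel=" + x[1]);
        int[] s = emit(NestingLevelDemo::mSTRING);
        System.out.println("STRING: type=" + s[0] + " channel=" + s[1]);
        int[] n = emit(() -> mNUM(true));
        System.out.println("NUM:    type=" + n[0] + " channel=" + n[1]);
    }
}
```

In this model X stays on the normal channel, WS is still hidden when matched on its own, STRING keeps its own type, and NUM only becomes INT or FLOAT because of the explicit actions.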
All in favor? Many/most of the lexerRulePropertyRef_XXX templates
would need to change.
Ter