Terence Parr wrote:
> Ref
> http://www.antlr.org:8888/browse/ANTLR-225
> http://www.antlr.org:8888/browse/ANTLR-233
> 
> if you have
> 
> NUM : INT | FLOAT ;
> 
> you probably want INT or FLOAT type but if you have
> 
> STRING : '"' (ESC|.)* '"' ;
> 
> you don't want ESC type.  You want STRING.  Currently, calling ESC and
> setting $type only sets a local.  At the end of each rule I do:
> 
>     state.type = _type;
> 
> so STRING resets state.type to STRING at end even though ESC  
> temporarily sets it.  We have to decide how this should behave.  Jim  
> is proposing that $type is really
> 
> ( ruleNestingLevel == 0 ? state.type : _type )
> 
> I.e., if we're in an invoked rule, don't set the global token type.  Makes
> sense to me, but requires a tweak in your runtime templates. :(
> 
> For example, my template looks like:
> 
> lexerRulePropertyRef_type(scope,attr) ::= "_type"
> 
> in contrast to $channel:
> 
> lexerRulePropertyRef_channel(scope,attr) ::= "state.channel"
> 
> I get bitten by this.  Call WS from a rule:
> 
> X : ID WS? '=' ID ;
> WS : ' '+ {$channel = HIDDEN; } ;
> 
> and then X is HIDDEN! :(
> 
> But if we do the rule nesting thing, this doesn't work:
> 
> NUM : INT | FLOAT ;
> 
> You always get NUM.  You would have to do:
> 
> NUM : INT {$type=INT;} | FLOAT {$type=FLOAT;} ;
> 
> If we use:
> 
> ( ruleNestingLevel == 0 ? state.type : _type )
> 
> then we need to detect assignments and call a  
> setLexerRulePropertyRef_type template rather than  
> lexerRulePropertyRef_type.  We already have this idea:
> 
> ruleSetPropertyRef_tree(scope,attr,expr) ::= "retval.tree =<expr>;"
> 
> So, we kind of need to fix this.  I think we use the rule nesting.   
> Valid examples then:
> 
> X : ID WS? '=' ID ;  // result is X on normal channel
> WS : ' '+ {$channel = HIDDEN; } ;
> 
> NUM : INT {$type=INT;} | FLOAT {$type=FLOAT;} ;
> 
> STRING : '"' (ESC|.)* '"' ;  // result is STRING not ESC
> 
> All in favor? Many/most of the lexerRulePropertyRef_XXX templates
> would need to change.
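
To check my reading of the proposal above, here is a minimal Java sketch of the rule-nesting idea. It is not the generated lexer code: the class, the helper methods (call, finishRule), and the token type numbers are hypothetical, and it assumes ruleNestingLevel is 0 only while the outermost rule is running.

```java
// Hypothetical model of the proposal: only the outermost rule publishes
// its token type; invoked rules keep their type in a local.
class RuleNestingSketch {
    // Token type numbers are made up for illustration.
    static final int NUM = 4, INT = 5, FLOAT = 6, STRING = 7, ESC = 8;

    int stateType;         // stands in for state.type
    int ruleNestingLevel;  // assumed to be 0 while the outermost rule runs

    // End-of-rule step: publish the local _type only at nesting level 0.
    void finishRule(int _type) {
        if (ruleNestingLevel == 0) {
            stateType = _type;
        }
    }

    // Invoke another lexer rule one nesting level deeper.
    void call(Runnable rule) {
        ruleNestingLevel++;
        try {
            rule.run();
        } finally {
            ruleNestingLevel--;
        }
    }

    void mINT()   { finishRule(INT); }
    void mFLOAT() { finishRule(FLOAT); }
    void mESC()   { finishRule(ESC); }

    // NUM : INT | FLOAT ;   (only the INT alternative is modelled)
    void mNUM() {
        int _type = NUM;
        call(this::mINT);
        finishRule(_type);
    }

    // NUM : INT {$type=INT;} | FLOAT {$type=FLOAT;} ;
    void mNUMExplicit(boolean isFloat) {
        int _type = NUM;
        if (isFloat) {
            call(this::mFLOAT);
            _type = FLOAT;   // the {$type=FLOAT;} action
        } else {
            call(this::mINT);
            _type = INT;     // the {$type=INT;} action
        }
        finishRule(_type);
    }

    // STRING : '"' (ESC|.)* '"' ;   (a single ESC call is modelled)
    void mSTRING() {
        int _type = STRING;
        call(this::mESC);
        finishRule(_type);
    }

    public static void main(String[] args) {
        RuleNestingSketch lexer = new RuleNestingSketch();
        lexer.mNUM();
        System.out.println("NUM without actions: " + lexer.stateType);
        lexer.mNUMExplicit(true);
        System.out.println("NUM with actions:    " + lexer.stateType);
        lexer.mSTRING();
        System.out.println("STRING:              " + lexer.stateType);
    }
}
```

In this model the plain NUM rule always ends up as NUM, the version with explicit actions ends up as INT or FLOAT, and STRING stays STRING even though ESC ran in between.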

Do I understand this correctly? Currently, if one lexer rule calls
another, all attributes set by the called rule are carried over to the
calling rule. With the fix, called rules would no longer do this, and
you would have to set those attributes in the calling rule instead... Hmm.
That would concentrate all of the behaviour in one place, which makes
sense from a pure design point of view, although I have no idea whether
it works only in theory. One thing does confuse me:

X : ID WS? '=' ID ;

If ID and WS aren't fragment rules, do they generate tokens? Or are those
tokens somehow subsumed into the X token? How does the order of the rules
change the lexing behaviour?

Johannes
_______________________________________________
antlr-dev mailing list
http://www.antlr.org:8080/mailman/listinfo/antlr-dev