At 13:37 12/12/2009, David-Sarah Hopwood wrote:
>I thought lexer rules were supposed to find the longest match?
>How can they do that if they're unable to handle common left
>prefixes?
>
>(I have the impression that "longest match" may not be quite
>accurate, but if so, I've never seen the actual behaviour
>documented precisely.)

In v3, for fixed-length input (e.g. keywords), usually the longest match wins, yes. (Though as Jim said, the order of rules also plays a role.)

When loops are involved things get a bit more murky. I've heard mixed stories on how well it copes with that; I think it might depend on whether it decides to generate a DFA or stick with lookahead conditions. Even if it does manage to make something that'll work, it'll certainly cause extra processing at both compile time and runtime, so it's definitely something to be avoided.

In v2, it's impossible to deal with. v2 lexers operate with completely fixed lookahead; for example, if k=3 then it'll look ahead, see "123" and find that this matches both rules; it can't look any further ahead to disambiguate. So it'll correctly produce a FLOAT if the prefix prior to the decimal point or exponent marker is zero to (k-1) digits long, but any more than that and it'll probably make an INT, since that rule was listed first.

(And you always want k to be minimal, to reduce overhead and improve performance. So increasing k is not the right answer. Refactoring the rules is.)
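
To make the k=3 case concrete, here is a rough toy model in Python. It is not what ANTLR generates, just a sketch of the behaviour described above. The function names (predict_fixed_k, lex_number) are made up for illustration, exponents are left out for brevity, and the input is assumed to be a numeric literal.

```python
DIGITS = "0123456789"


def predict_fixed_k(text, k):
    """Toy model of a fixed-lookahead choice between INT and FLOAT.

    Only the first k characters are inspected. INT is assumed to be
    listed first, so it wins whenever the window cannot tell the two
    rules apart.
    """
    for ch in text[:k]:
        if ch == ".":
            return "FLOAT"  # decimal point is visible inside the window
        if ch not in DIGITS:
            break
    return "INT"  # window is all digits (ambiguous), so the first rule wins


def lex_number(text):
    """Refactored single rule: consume the common digit prefix once,
    then pick the token type from whatever follows it."""
    i = 0
    while i < len(text) and text[i] in DIGITS:
        i += 1
    if i < len(text) and text[i] == ".":
        j = i + 1
        while j < len(text) and text[j] in DIGITS:
            j += 1
        if j > i + 1:  # at least one digit after the point
            return ("FLOAT", text[:j])
    if i == 0:
        raise ValueError("not a number: %r" % text)
    return ("INT", text[:i])


if __name__ == "__main__":
    for sample in ("12.5", "123.5"):
        print(sample, predict_fixed_k(sample, 3), lex_number(sample))
```

With k=3 the toy predictor gets "12.5" right but falls back to INT for "123.5", while the single merged rule does not care how long the digit prefix is. In a real grammar the same idea would be one rule that matches the digits and then sets the token type depending on what follows.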
