One way to handle keywords is with a zero-width look-ahead assertion. You append it to each keyword production as a test that passes only if the next character is non-alphanumeric, and that leaves that character on the input stream.
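A minimal sketch of the idea, using Python regular expressions rather than ANTLR syntax. The keyword list, the names `KEYWORD_RE`, `ID_RE`, and `next_token`, and the choice to treat underscore as an identifier character are all assumptions made for illustration. It uses a negative look-ahead ("next character is not alphanumeric") so that a keyword at the very end of the input still matches.

```python
import re

# Hypothetical keyword list, for illustration only.
KEYWORDS = ("for", "return")

# A keyword matches only when the next character is NOT alphanumeric
# (or underscore, an assumption here). The look-ahead (?!...) consumes
# nothing, so the following character stays on the input.
KEYWORD_RE = re.compile(r"(?:%s)(?![A-Za-z0-9_])" % "|".join(KEYWORDS))
ID_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def next_token(text, pos=0):
    """Return (kind, lexeme, new_pos) for the token starting at pos."""
    m = KEYWORD_RE.match(text, pos)
    if m:
        return ("KEYWORD", m.group(), m.end())
    m = ID_RE.match(text, pos)
    if m:
        return ("ID", m.group(), m.end())
    raise ValueError("no token at position %d" % pos)
```

With this, `next_token("for(")` yields a keyword and stops before the `(`, while `next_token("fore")` falls through to the identifier rule because the look-ahead sees the `e`.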
Best regards,
Jason

On Mon, Apr 11, 2011 at 3:25 PM, Terence Parr wrote:
> I see in an early 2004 workshop that I intended to handle Context-sensitive
> lexing:
>
> http://www.antlr.org/workshop/ANTLR2004/proceedings/ANTLR-3.0-Features.pdf
>
>   Each parser decision point generates special rule
>   in lexer with possible choices: e.g., (ID|INT)
>   Difficulties
>     “for”, find “fore” must say “missing for, found ID”
>     whitespace
>   The C++ template vs ">>" token problem simply
>   disappears; i.e., when lexing
>     List<List<int>> a;
>   nested template has ">>" in it. Lexer, without context,
>   cannot know which to pick. Only the parser knows that
>   it expects ">" followed by ">" not ">>" token
>
> Scott Stanchfield also has some thoughts along these lines
>
> http://javadude.com/articles/antlr-context-sensitive-scanner.html
>
> I'm glad I wrote that slide because I couldn't remember what the
> difficulties were with context-sensitive Lexing. keywords are an issue as
> is white space. If I remember correctly Rats has a predicate in its
> identifier rule that makes it fail if it finds the id is also a keyword
> (yep, just checked). For whitespace, it simply scarfs whitespace I think in
> between rule references maybe.
>
> Instead of forcing context-sensitive entry points into the lexer, I think a
> scannerless parser is simpler to understand conceptually. Rats is very good
> at combining grammars and it might be fun to come up with a scannerless
> version of ANTLR. It can be done easily right now by simply passing in
> characters as tokens and turning on backtracking with memoization. Perhaps
> I'll try that out.
>
> stat : 'return' e ';' | id '=' e ';' {String s = $id.text;} ;
>
> id : 'a' | 'b' | 'c' | ... ;
> e : int ;
> int : '0' | '1' | '2' ... ;
>
> yep, that should work even with that action. There is no notion of a token
> really. hhm...cool.
>
> Ter
> PS oh crap...I should be preparing to teach in 30 minutes!
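For reference, the quoted idea of passing characters as tokens with backtracking and memoization can be sketched roughly as below. This is plain Python, not ANTLR output, and it makes several assumptions: `parse_stat` and the helper names are hypothetical, `id` is limited to the three letters the quoted grammar actually lists, `int` is taken to be a single decimal digit, and there is no whitespace handling. Each rule returns the position after a match or `None`, and `lru_cache` plays the role of the memo table.

```python
from functools import lru_cache


def parse_stat(text):
    """Parse one `stat` from raw characters; return a result tuple or None."""

    def lit(s):
        # Build a rule matching the literal s, one "token" per character.
        return lambda pos: pos + len(s) if text.startswith(s, pos) else None

    @lru_cache(maxsize=None)
    def id_(pos):
        # Only 'a' | 'b' | 'c', as listed in the quoted grammar.
        return pos + 1 if pos < len(text) and text[pos] in "abc" else None

    @lru_cache(maxsize=None)
    def int_(pos):
        # Assumption: a single decimal digit.
        return pos + 1 if pos < len(text) and text[pos] in "0123456789" else None

    @lru_cache(maxsize=None)
    def e(pos):
        return int_(pos)

    def seq(pos, *rules):
        # Apply rules in order; stop as soon as one fails.
        for rule in rules:
            if pos is None:
                return None
            pos = rule(pos)
        return pos

    # Alternative 1: 'return' e ';'
    end = seq(0, lit("return"), e, lit(";"))
    if end is not None:
        return ("return", end)

    # Backtrack to position 0 and try alternative 2: id '=' e ';'
    id_end = id_(0)
    end = seq(id_end, lit("="), e, lit(";"))
    if end is not None:
        # text[0:id_end] corresponds to $id.text in the quoted action.
        return ("assign", text[0:id_end], end)
    return None
```

For example, `parse_stat("return1;")` takes the first alternative, while `parse_stat("a=2;")` fails on `'return'`, backtracks, and matches the assignment with `"a"` as the identifier text.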
--Jason Doege

List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
