I have created an obnoxious grammar and need help lexing it. Basically, a left-bracket plus a string represents an open tag, and there's a matching close tag with a right bracket. If you really want a bracket, you type the bracket twice.
To be concrete,

    [/ this is text in a tag /]

should lex as

    L_TAG(text="[/") ... tokens representing "this is text in a tag" ... R_TAG(text="/]")

The problem comes when I want to explain this grammar using the grammar.

    To put stuff in a tag, type [[/ stuff /]]

should lex as

    ... lots of tokens ... L_BRACKET(text="[[") ... tokens representing "/ stuff /" ... R_BRACKET(text="]]")

Unfortunately, I can't figure out how to keep the lexer from matching "/]" as an R_TAG and then having an extra "]" left over.

Conceptually, what I'd like to say is that R_TAG matches a character of the appropriate type followed by ']', as long as there's no ']' immediately after it. If there are two right brackets after the character, the lexer should make those an R_BRACKET token and make the first character a simple text token.

Does this make any sense? Is there some way to deal with it?

Thanks,
Todd
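
P.S. In case it helps pin down the behavior I'm after, here is a minimal hand-written sketch in Python (not ANTLR). The TEXT token name, the lex and add_text helpers, and treating '/' as the only tag character are assumptions made up for illustration; the point is just the extra one-character check after the ']'.

```python
# Assumption: '/' is the only "character of the appropriate type".
TAG_CHARS = "/"


def add_text(tokens, ch):
    """Append ch to a trailing TEXT token, or start a new TEXT token."""
    if tokens and tokens[-1][0] == "TEXT":
        tokens[-1] = ("TEXT", tokens[-1][1] + ch)
    else:
        tokens.append(("TEXT", ch))


def lex(src):
    """Return a list of (token_name, text) pairs for src."""
    tokens = []
    i = 0
    n = len(src)
    while i < n:
        two = src[i:i + 2]
        if two == "[[":
            # Doubled brackets are tried first, so they win over tags.
            tokens.append(("L_BRACKET", two))
            i += 2
        elif two == "]]":
            tokens.append(("R_BRACKET", two))
            i += 2
        elif src[i] == "[" and i + 1 < n and src[i + 1] in TAG_CHARS:
            tokens.append(("L_TAG", two))
            i += 2
        elif (src[i] in TAG_CHARS and i + 1 < n and src[i + 1] == "]"
              and src[i + 2:i + 3] != "]"):
            # Tag character plus ']' counts as R_TAG only when the
            # character after that ']' is not another ']'.
            tokens.append(("R_TAG", two))
            i += 2
        else:
            # Anything else, including a tag character in front of "]]",
            # is plain text.
            add_text(tokens, src[i])
            i += 1
    return tokens
```

With this, lex("[[/ stuff /]]") gives L_BRACKET, one TEXT token for "/ stuff /", and R_BRACKET, rather than an R_TAG for "/]" with a stray "]" left over.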
