I have created an obnoxious grammar and need help lexing it.
Basically, a left-bracket plus a string represents an open tag, and
there's a matching close tag with a right bracket. If you really want
a bracket, you type the bracket twice.

To be concrete,

[/ this is text in a tag /]
should lex as
L_TAG(text="[/") ... tokens representing "this is text in a tag" ...
R_TAG(text="/]")

The problem comes when I want to explain this grammar using the grammar.

To put stuff in a tag, type [[/ stuff /]]
should lex as
... lots of tokens ... L_BRACKET(text="[[") ... tokens representing "/
stuff /" ... R_BRACKET(text="]]")

Unfortunately, I can't figure out how to keep the lexer from matching
"/]" as an R_TAG and then having the extra "]" left over.

Conceptually, what I'd like to do is say that R_TAG matches a
character of the appropriate type followed by ']', as long as there's
no ']' immediately after. If there are two right brackets after the
character, the lexer should make those an R_BRACKET token and make the
first character a simple text token.
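
To pin down the rule, here is a minimal sketch of that lookahead written as a hand-rolled Python scanner rather than as ANTLR lexer rules. It assumes '/' is the only tag character and that doubled brackets always win over tags; the names lex, TAG_CHARS and TEXT are made up for illustration and are not part of my real grammar.

```python
TAG_CHARS = "/"  # assumption: '/' is the only character that can mark a tag


def lex(src):
    """Split src into (kind, text) tokens using one character of extra lookahead."""
    tokens = []
    text = []  # plain-text characters not yet emitted as a TEXT token

    def flush():
        if text:
            tokens.append(("TEXT", "".join(text)))
            text.clear()

    i = 0
    while i < len(src):
        two = src[i:i + 2]
        after = src[i + 2:i + 3]  # character following the pair, "" at end of input
        if two == "[[":
            kind = "L_BRACKET"
        elif two == "]]":
            kind = "R_BRACKET"
        elif len(two) == 2 and two[0] == "[" and two[1] in TAG_CHARS:
            kind = "L_TAG"
        elif (len(two) == 2 and two[0] in TAG_CHARS and two[1] == "]"
              and after != "]"):
            # Close tag only when no second ']' follows; otherwise the tag
            # character falls through to plain text and "]]" is matched next.
            kind = "R_TAG"
        else:
            text.append(src[i])
            i += 1
            continue
        flush()
        tokens.append((kind, two))
        i += 2
    flush()
    return tokens


print(lex("[/ this is text in a tag /]"))
print(lex("To put stuff in a tag, type [[/ stuff /]]"))
```

With these assumptions the first input comes out as L_TAG, TEXT, R_TAG, and the second as TEXT, L_BRACKET, TEXT("/ stuff /"), R_BRACKET. What I can't work out is how to express the "after != ']'" check in the ANTLR lexer itself.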

Does this make any sense? Is there some way to deal with it?

Thanks,
Todd
