I'm not much of an expert, and I'm not really sure what you are attempting to accomplish. So it is really hard to advise you on how to resolve the problem in the cleanest way. Hence my comment about parroting common advice.
One item about ANTLR to remember is that the lexer runs from start to finish before the parser does anything. Even if that isn't always technically true, it is effectively true. There is no backtracking from the parser to get the lexer to "retry" and realize that the token given is a NAME vs. a VALUE. You can run the lexer without a parser, and I'd highly recommend you do so: feed the lexer every kind of input you can think of, and make sure you know exactly what the lexer will return before doing much of anything else.

My standard advice to everyone is that I got a lot more traction once I ruthlessly enforced two rules:

1. Always separate the lexer rules from the parser rules. It really helps me keep in mind that the lexer runs, then the parser runs. The two really aren't connected, and there is absolutely no feedback loop between the two.

2. Never use an inline token. The example in yours is the '=' sign. Because it appears as a literal inside a parser rule, it creates a token that isn't explicitly part of your lexer, so debugging problems can be harder: there is a rule you can't see while reading the lexer source.

While the book does this all the time, it really only works out in trivial grammars from what I can tell. So while it is great for quickly hacking together a calculator where all the work is done in the parser, it is harder to do if you are planning on writing a more complex example that involves multiple passes.

I know why ANTLR and the book always show the combined grammars and why they use the inline tokens, but until you grok what is really going on, they seem to cause more headaches than they solve. If you're the guy that wrote the tool, and can see the generated code, I'm sure it's really handy. For most mere mortals, it just screws you up (at least it did me; I had to set ANTLR down for a year and come back to it before I could get over that mental hurdle).

Finally, every time you have a problem, run the lexer and print out the token stream.
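
To make the "no feedback loop" point concrete, here is a rough sketch in Python. It is not ANTLR-generated code and not how ANTLR's lexer is implemented; it is a hand-rolled stand-in that assumes a simple longest-match scanner where the earlier rule wins a tie. The rule names (EQUALS, NAME, VALUE, WS) and their patterns are hypothetical, picked only to resemble the grammar under discussion.

```python
import re

# Hypothetical token rules, tried in order. When two rules match the same
# amount of text, the one listed first wins (an assumption for this sketch).
TOKEN_RULES = [
    ("EQUALS", r"="),
    ("NAME", r"[A-Za-z][A-Za-z0-9]*"),
    ("VALUE", r"[A-Za-z0-9]+"),
    ("WS", r"\s+"),
]


def tokenize(text):
    """Return (type, text) pairs for the whole input; no parser is involved."""
    tokens = []
    pos = 0
    while pos < len(text):
        best = None
        for name, pattern in TOKEN_RULES:
            m = re.compile(pattern).match(text, pos)
            # Keep the longest match; a strict '>' lets the earlier rule win ties.
            if m and (best is None or m.end() > best[1].end()):
                best = (name, m)
        if best is None:
            raise ValueError(f"no rule matches at position {pos}: {text[pos]!r}")
        name, m = best
        if name != "WS":  # whitespace is skipped, everything else is kept
            tokens.append((name, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    # Dump the token stream before involving any parser.
    for tok in tokenize("color = red"):
        print(tok)
```

With these rules, "red" comes out as a NAME even though it sits on the right-hand side of the '=', because the scanner never knows what the parser is hoping for. Dumping the tokens like this is the quickest way I know to spot that kind of surprise.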
Make really, really sure you are getting exactly the tokens you think you should be, in exactly the order you think you should be. I've spent a ton of time tracking down a parser problem, only to realize the lexer was not doing what I expected it to. Once you are sure the lexer is doing exactly what you expect, then move on to validating the parser.

Anyway, those are the lessons I learned the hard way. I'd recommend reading the list archives; several folks post pretty sound advice, Jim Idle especially so. Read the FAQ entries, as they cover a lot of useful ground. Then go look at all the grammars that are out there, and try to identify a grammar that is close to the language you want to parse. So if you want to parse a C-like language, study the C parser, and maybe the Java parsers. Figuring out how to lex C/C++ comments or Java strings yourself is reasonably difficult when you are first starting out.

Go look at the grammars of languages you understand that are similar to yours, to get clues about how to structure the core pieces. If you screw up the core pieces, you'll find yourself patching up corner cases everywhere. When you are doing that, it is generally a clue that you've just used the wrong approach.

Best of luck,
Kirby

On Sun, Apr 10, 2011 at 1:43 AM, Kazuki <[email protected]> wrote:
> If I understood well, the lexer will check for all tokens (NAME and VALUE)
> even if I tell him to look only for VALUE. That look like weird ^^'
>
> I'm a little confused about tokens... Your answer is very clear : I have not
> to do such low level validation because, with the context, I can tell
> exactly where the error is in the tree parser.
>
> At this point of view, I should make an only token that match all other
> tokens :
> TOKEN : ('a'..'z'|'A'..'Z'|'0'..'9')+
>
> What can of tokens (expressions, conditions, etc...) I can use to correctly
> distinguish them ?
>
> --
> View this message in context:
> http://antlr.1301665.n2.nabble.com/Problem-with-MismatchedTokenException-tp6257670p6258339.html
> Sent from the ANTLR mailing list archive at Nabble.com.

List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
