This test grammar was called "crap" by Jim Idle. I am willing to eat humble
pie and admit where I am an ANTLR novice or don't know something about
grammars, but I am just not seeing it in this simple case:

    grammar testerrors;

    options
    {
        language='C';
    }

    NAME : ( 'a'..'z' | 'A'..'Z' | '0'..'9' )+ ;

    WS : ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN; } ;

    parse:
        decl ( options { greedy = true; }: ',' decl )* ','? EOF
        ;

    decl:
        NAME ':' type
        ;

    type:
        'int' | 'float'
        ;

The start symbol is a comma-delimited list of simple '<name> : <type>'
declarations and allows the list to optionally end in a comma as is done in
some languages (Python, etc.). This is a pretty common way to structure it. In
JavaCC, for example, you'd use a local LOOKAHEAD(2) inside the ()* to
disambiguate the choice between matching one more decl or ending the list.
Without it and with the default k=1, JavaCC emits an ambiguity warning at
parser generation time. In ANTLR's case, the ambiguity can be dealt with
similarly, with a local k=2 option or the way done above (which I borrowed from
http://www.antlr.org/grammar/1200715779785/Python.g). Without either, ANTLR
also emits a warning at parser generation time. All of this seems to work as
expected.
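
To make the two-token decision concrete, here is a minimal hand-written sketch in plain C of what the loop has to decide: after a decl, a ',' followed by a NAME means "one more decl", while a ',' followed by anything else means "trailing comma, leave the loop". This is not ANTLR-generated code and does not use the ANTLR runtime; the names Tok, Lexer, next_tok, peek, parse_decl and parse_list are all made up for illustration, and the lexer is simplified to the tokens in the grammar above.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token kinds for the toy grammar; illustrative names, not ANTLR's. */
typedef enum { T_NAME, T_COLON, T_COMMA, T_INT, T_FLOAT, T_EOF, T_BAD } Tok;

typedef struct { const char *p; } Lexer;

/* Return the next token kind and advance the lexer past it. */
static Tok next_tok(Lexer *lx) {
    while (*lx->p == ' ' || *lx->p == '\t' || *lx->p == '\r' || *lx->p == '\n')
        lx->p++;                                  /* skip whitespace */
    if (*lx->p == '\0') return T_EOF;             /* EOF is never consumed */
    if (*lx->p == ':') { lx->p++; return T_COLON; }
    if (*lx->p == ',') { lx->p++; return T_COMMA; }
    if (isalnum((unsigned char)*lx->p)) {
        const char *s = lx->p;
        while (isalnum((unsigned char)*lx->p)) lx->p++;
        size_t n = (size_t)(lx->p - s);
        if (n == 3 && strncmp(s, "int", 3) == 0) return T_INT;
        if (n == 5 && strncmp(s, "float", 5) == 0) return T_FLOAT;
        return T_NAME;
    }
    lx->p++;
    return T_BAD;
}

/* Look ahead k tokens (k >= 1) without consuming any input. */
static Tok peek(const Lexer *lx, int k) {
    Lexer copy = *lx;
    Tok t = T_EOF;
    while (k-- > 0) t = next_tok(&copy);
    return t;
}

/* decl : NAME ':' type ;   type : 'int' | 'float' ; */
static int parse_decl(Lexer *lx) {
    if (next_tok(lx) != T_NAME) return 0;
    if (next_tok(lx) != T_COLON) return 0;
    Tok t = next_tok(lx);
    return t == T_INT || t == T_FLOAT;
}

/* parse : decl ( ',' decl )* ','? EOF ;   returns 1 on success, 0 on error */
int parse_list(const char *input) {
    Lexer lx = { input };
    if (!parse_decl(&lx)) return 0;
    /* The two-token decision: stay in the loop only if ',' is followed by NAME. */
    while (peek(&lx, 1) == T_COMMA && peek(&lx, 2) == T_NAME) {
        next_tok(&lx);                            /* consume ',' */
        if (!parse_decl(&lx)) return 0;
    }
    if (peek(&lx, 1) == T_COMMA) next_tok(&lx);   /* optional trailing ',' */
    return next_tok(&lx) == T_EOF;
}
```

With only one token of lookahead, the loop test would see just the ',' and could not tell the two cases apart, which is the ambiguity both tools warn about.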
So, what is so obviously wrong with the grammar snippet that deserves the
"crap" moniker? I am learning ANTLR because I want to add a multi-target parser
generator tool to my skill set. For Java work, JavaCC is still out there and
generates fast parsers, has good error handling, and can build ASTs/visitors.
In C++, I would normally do a simple case like this via boost.spirit but it's a
bit of a template metaprogramming monster. With ANTLR I am successfully
compiling my C parser within a larger C++ codebase and the only learning curve
issues are odd error messages on relatively trivial input errors, where ANTLR
can't seem to identify the token it is expecting. E.g., the input "name : bad"
results in:

    -memory-(1) : error 10 : Unexpected token, at offset 6
        near [Index: 0 (Start: 0-Stop: 0) ='<missing <invalid>>', type<0> Line: 1 LinePos:6]
         : Missing <invalid>

I would be happy to get specific pointers to docs and articles on how to
improve error handling in ANTLR *C* parsers. At least being able to modify the
stock error display function to tackle the common case of misspelling a token
name would be great.
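
To illustrate the kind of message I have in mind, here is a minimal sketch of a "did you mean" helper in plain C. It does not call into the ANTLR runtime at all: it takes the text of the offending token and picks the closest literal from a fixed list by Levenshtein edit distance. The names edit_distance, suggest_token and suggest_type are hypothetical, and the keyword list is simply the two literals from the type rule above. How to wire this into the C runtime is not shown; my understanding is that the hook is the displayRecognitionError function pointer on the base recognizer, but treat that as an assumption to check against the runtime headers.

```c
#include <stddef.h>
#include <string.h>

/* Levenshtein edit distance between a and b, using a single rolling row.
   If b is 64 chars or longer it is simply treated as far away. */
static size_t edit_distance(const char *a, const char *b) {
    size_t la = strlen(a), lb = strlen(b);
    size_t row[64];
    if (lb >= 64) return la > lb ? la : lb;
    for (size_t j = 0; j <= lb; j++) row[j] = j;
    for (size_t i = 1; i <= la; i++) {
        size_t diag = row[0];                 /* cell (i-1, j-1) */
        row[0] = i;
        for (size_t j = 1; j <= lb; j++) {
            size_t up = row[j];               /* cell (i-1, j) */
            size_t best = diag + (a[i - 1] == b[j - 1] ? 0 : 1);
            if (up + 1 < best) best = up + 1;
            if (row[j - 1] + 1 < best) best = row[j - 1] + 1;
            row[j] = best;
            diag = up;
        }
    }
    return row[lb];
}

/* Return the candidate closest to text within max_dist edits, or NULL. */
const char *suggest_token(const char *text, const char *const *candidates,
                          size_t n, size_t max_dist) {
    const char *best = NULL;
    size_t best_d = max_dist + 1;
    for (size_t i = 0; i < n; i++) {
        size_t d = edit_distance(text, candidates[i]);
        if (d < best_d) { best_d = d; best = candidates[i]; }
    }
    return best;
}

/* Convenience wrapper for the literals of the 'type' rule (assumed list). */
const char *suggest_type(const char *text) {
    static const char *const kTypes[] = { "int", "float" };
    return suggest_token(text, kTypes, 2, 2);
}
```

A custom display function could call suggest_type() on the offending token's text and append "did you mean 'float'?" when the result is non-NULL. The cutoff of two edits is an arbitrary choice: it accepts a swapped pair of letters such as "flaot" but rejects unrelated words such as "bad".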
Thank you,
Vlad