> -----Original Message-----
> From: [email protected] [mailto:antlr-interest-
> [email protected]] On Behalf Of André Næss
> Sent: Saturday, December 12, 2009 8:23 AM
> To: [email protected]
> Subject: [antlr-interest] Struggling with recursion error
>
> Hi
>
> I've been trying for quite some time now to put together a minimal toy
> parser, but I'm completely stuck on this recursion error that I don't
> understand.
>
> The parser I'm trying to write is a simple parser for a language that
> has only let-bindings, if-then-else, integers, variables, and function
> application. The problem is the latter. A function application
> typically looks like this:
>
> foo a b 10
>
> And I need this to parse as (foo a b 10), not (foo (a (b 10))) (I want
> to be able to pass functions as values).
>
> After some back and forth I figured that by making the topmost expr
> match either a function application or any other expression, where
> function applications all begin with ID, I should get the correct
> parse. And indeed, I do, with backtrack on. If I turn off backtrack,
> the whole thing fails with
Yes - this isn't the way to do it, you need to combine your ID and function
rules in some way. I think that what you are lacking is understanding of LL(n)
parsing and how the ANTLR rule layout relates to this. The things in your top
level expression have the lowest precedence and the things down at the end of
the chain (your atoms) have the highest. When seeing an ID your current rules
will pick the higher precedence, which is your atom. So, you need to left
factor those as they are ambiguous.
However a bigger problem is that you have a completely ambiguous language,
which would only work if this was a runtime interpreted script as without
parenthesis to delimit the function arguments, you cannot know what is an ID
and what is a function. So, we can make it parse as foo a b 10, but once you
see ID, you will consume everything else as arguments to your function up to
the ';'. Are you trying to design a language or write something to parse an
existing language? The nearest I can see to something like this is VBScript,
which was horrendous to write a parser for and the only way to do it was using
global backtracking. However, even VBScript is not this ambiguous.
Designing a language is a difficult thing and you will need a lot more
experience before being able to do this well. However, as there are so many
horrible languages out there already now, then why not one that is completely
logically invalid ;-)
So, if we rework your example then you will find you get:
[10:21:11] error(211): T.g:25:12: [fatal] rule atomExpr has non-LL(*) decision
due to recursive rule invocations reachable from alts 1,2. Resolve by
left-factoring or using syntactic predicates or using backtrack=true option.
[10:21:11] error(211): T.g:25:8: [fatal] rule atomExpr has non-LL(*) decision
due to recursive rule invocations reachable from alts 1,2. Resolve by
left-factoring or using syntactic predicates or using backtrack=true option.
Because there is no way to decide between continuing to consume arguments or
ending the list of them. Generally the dynamic languages that think it is good
to lose things like ',' and '(' ')' just create huge ambiguities. I have never
really understood why people think it such a pain to type ';' at the end of the
statement - productivity gained through compiler errors and warnings far
outweighs any 'pain' of typing ';'.
So, we can make your grammar work if we delimit the function arguments with '('
')' and use a single token predicate. However, unless you are trying to parse
something that already exists and have no choice, then I would abandon this
avenue and first work your way through the book and use the examples downloaded
from the download page to see how things are solved for languages you know such
as Java or C etc. I would also recommend that you don't use 'literals' in your
parser but create lexer rules as the literals can lead you astray when learning
and are a pain to deal with when processing error messages.
grammar T;
options {
output=AST;
}
tokens {
FUNC_APPL;
IFELSE;
LET;
VAR_REF;
}
program
: expr (';' expr)*
;
expr : ifElseExpr
| letExpr
| atomExpr
;
atomExpr
: INT
| ID ( ('(')=>'(' expr* ')'
-> ^(FUNC_APPL ID expr*)
| -> ^(VAR_REF ID)
)
| '('! expr^ ')'!
;
letExpr
: 'let' letBinding+ (',' letBinding)* 'in' expr -> ^(LET
letBinding+ expr)
;
letBinding
: (ID^ '='! expr)
;
ifElseExpr
: 'if' cond=expr 'then' thenPart=expr elsePart='else' expr
-> ^(IFELSE $cond $thenPart $elsePart)
;
INT : ('0'..'9')+ ;
ID : ('a'..'z'|'A'..'Z')+ ;
Jim
List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe:
http://www.antlr.org/mailman/options/antlr-interest/your-email-address
--
You received this message because you are subscribed to the Google Groups
"il-antlr-interest" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/il-antlr-interest?hl=en.