> -----Original Message-----
> From: [email protected] [mailto:antlr-interest-
> [email protected]] On Behalf Of André Næss
> Sent: Saturday, December 12, 2009 8:23 AM
> To: [email protected]
> Subject: [antlr-interest] Struggling with recursion error
> 
> Hi
> 
> I've been trying for quite some time now to put together a minimal toy
> parser, but I'm completely stuck on this recursion error that I don't
> understand.
> 
> The parser I'm trying to write is a simple parser for a language that
> has only let-bindings, if-then-else, integers, variables, and function
> application. The problem is the latter. A function application
> typically looks like this:
> 
>    foo a b 10
> 
> And I need this to parse as (foo a b 10), not (foo (a (b 10))) (I want
> to be able to pass functions as values).
> 
> After some back and forth I figured that by making the topmost expr
> match either a function application or any other expression, where
> function applications all begin with ID, I should get the correct
> parse. And indeed, I do, with backtrack on. If I turn off backtrack,
> the whole thing fails with

Yes - this isn't the way to do it, you need to combine your ID and function 
rules in some way. I think that what you are lacking is understanding of LL(n) 
parsing and how the ANTLR rule layout relates to this. The things in your top 
level expression have the lowest precedence and the things down at the end of 
the chain (your atoms) have the highest. When seeing an ID your current rules 
will pick the higher precedence, which is your atom. So, you need to left 
factor those as they are ambiguous. 

However a bigger problem is that you have a completely ambiguous language, 
which would only work if this was a runtime interpreted script as without 
parenthesis to delimit the function arguments, you cannot know what is an ID 
and what is a function. So, we can make it parse as foo a b 10, but once you 
see ID, you will consume everything else as arguments to your function up to 
the ';'. Are you trying to design a language or write something to parse an 
existing language? The nearest I can see to something like this is VBScript, 
which was horrendous to write a parser for and the only way to do it was using 
global backtracking. However, even VBScript is not this ambiguous. 

Designing a language is a difficult thing and you will need a lot more 
experience before being able to do this well. However, as there are so many 
horrible languages out there already now, then why not one that is completely 
logically invalid ;-)

So, if we rework your example then you will find you get:

[10:21:11] error(211): T.g:25:12: [fatal] rule atomExpr has non-LL(*) decision 
due to recursive rule invocations reachable from alts 1,2.  Resolve by 
left-factoring or using syntactic predicates or using backtrack=true option.
[10:21:11] error(211): T.g:25:8: [fatal] rule atomExpr has non-LL(*) decision 
due to recursive rule invocations reachable from alts 1,2.  Resolve by 
left-factoring or using syntactic predicates or using backtrack=true option.

Because there is no way to decide between continuing to consume arguments or 
ending the list of them. Generally the dynamic languages that think it is good 
to lose things like ',' and '(' ')' just create huge ambiguities. I have never 
really understood why people think it such a pain to type ';' at the end of the 
statement - productivity gained through compiler errors and warnings far 
outweighs any 'pain' of typing ';'.

So, we can make your grammar work if we delimit the function arguments with '(' 
')' and use a single token predicate. However, unless you are trying to parse 
something that already exists and have no choice, then I would abandon this 
avenue and first work your way through the book and use the examples downloaded 
from the download page to see how things are solved for languages you know such 
as Java or C etc. I would also recommend that you don't use 'literals' in your 
parser but create lexer rules as the literals can lead you astray when learning 
and are a pain to deal with when processing error messages.


grammar T;
 
options {
  output=AST;
}
 
tokens {
FUNC_APPL;
IFELSE;
LET;
VAR_REF;
}

program
        :       expr (';' expr)*
        ;
 
expr    :       ifElseExpr
        |       letExpr
        |       atomExpr
        ;
 
atomExpr 
        :       INT
        |       ID (    ('(')=>'(' expr* ')' 
                       -> ^(FUNC_APPL ID expr*)
                     | -> ^(VAR_REF ID)
                   )
        |       '('! expr^ ')'!
        ;
 
letExpr
        :       'let' letBinding+ (',' letBinding)* 'in' expr -> ^(LET 
letBinding+ expr)
        ;
 
letBinding
        :       (ID^ '='! expr)
        ;
 
ifElseExpr
        :       'if' cond=expr 'then' thenPart=expr elsePart='else' expr 
                
                -> ^(IFELSE $cond $thenPart $elsePart)
        ;

INT : ('0'..'9')+ ;
ID : ('a'..'z'|'A'..'Z')+ ;


Jim






List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: 
http://www.antlr.org/mailman/options/antlr-interest/your-email-address

--

You received this message because you are subscribed to the Google Groups 
"il-antlr-interest" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/il-antlr-interest?hl=en.


Reply via email to