I'm having a strange problem with ANTLR. I'm building a grammar for a 
language with a huge number (hundreds) of non-reserved keywords. I'm 
using the approach of having the lexer return a different token type for 
each keyword, and then having a parser rule of the form:

    id : ( ID | QUOTED_ID | KW_A | KW_B | ... | KW_ZZZ );

This was working great until today. In fact, ANTLR 3.2 generates 
surprisingly clever code for this: all the keywords are assigned 
consecutive token numbers, and the generated code just says:

    if ( (input.LA(1)>=KW_A && input.LA(1)<=KW_ZZZ) ||
         (input.LA(1)>=ID && input.LA(1)<=QUOTED_ID) ) {
        input.consume();
        ...

This works all the way up to 631 keywords: ANTLR runs in about 20 
seconds and never uses more than 269MB of memory. When I add a 632nd 
keyword (it doesn't matter which keyword) and change nothing else, 
ANTLR runs for 2 minutes and then runs out of heap space. I kept raising 
the maximum heap size, but even going to 2GB doesn't make any difference.

What's really interesting is that I was using ANTLR 3.1 until now. When 
I ran into this I upgraded to 3.2, but both of them fail at exactly the 
same spot, 632 keywords. Not surprisingly, the stack trace varies from 
one run to the next, depending on the exact point it runs out of memory, 
but it always has deeply nested calls to these and other methods:

    
    org.antlr.stringtemplate.language.ASTExpr.writeTemplate(ASTExpr.java:750)
    org.antlr.stringtemplate.language.ASTExpr.write(ASTExpr.java:680)
    org.antlr.stringtemplate.language.ASTExpr.writeAttribute(ASTExpr.java:660)
    org.antlr.stringtemplate.language.ActionEvaluator.action(ActionEvaluator.java:86)
    org.antlr.stringtemplate.language.ASTExpr.write(ASTExpr.java:149)
    org.antlr.stringtemplate.StringTemplate.write(StringTemplate.java:705)

I don't know if it makes a difference, but I'm using backtracking 
(otherwise, this approach to non-reserved keywords doesn't work without 
a lot of synpreds), and outputting ASTs.

Since this is size related, it's hard to narrow it down to a simple 
example. I could try to duplicate it with just the id rule and nothing else.
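
One way to build such a stripped-down test case is to generate the grammar rather than write it by hand. The following is only a sketch under several assumptions: the grammar name KwTest, the keyword token names KW_1..KW_n, their literal texts 'kw1'..'kwn', and the ID, QUOTED_ID and WS lexer rules are all made up for illustration and are not taken from my real grammar. I have not confirmed that a grammar this small shows the same behaviour.

```java
/**
 * Sketch: emits a minimal ANTLR 3 grammar containing only the "id" rule
 * and n keyword tokens, so the keyword count can be varied around 631/632.
 * All rule names and keyword spellings below are hypothetical.
 */
final class KeywordGrammarGenerator {

    /** Returns the text of a grammar with keyword tokens KW_1 .. KW_n. */
    static String generate(int n) {
        StringBuilder sb = new StringBuilder();
        sb.append("grammar KwTest;\n\n");
        // Same options as described in the post: backtracking and AST output.
        sb.append("options { backtrack=true; output=AST; }\n\n");

        // Parser rule: an identifier is ID, QUOTED_ID, or any keyword token.
        sb.append("id : ( ID | QUOTED_ID");
        for (int i = 1; i <= n; i++) {
            sb.append(" | KW_").append(i);
        }
        sb.append(" );\n\n");

        // Lexer rules: keywords are listed before ID (assumed layout).
        for (int i = 1; i <= n; i++) {
            sb.append("KW_").append(i).append(" : 'kw").append(i).append("';\n");
        }
        sb.append("\nID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;\n");
        sb.append("QUOTED_ID : '\"' (~'\"')* '\"' ;\n");
        sb.append("WS : (' '|'\\t'|'\\r'|'\\n')+ { $channel=HIDDEN; } ;\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Keyword count comes from the first argument; 632 is the failing count.
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 632;
        System.out.print(generate(n));
    }
}
```

Redirecting the output for 631 and for 632 keywords into two KwTest.g files and running ANTLR on each would show whether the threshold depends on the keyword count alone or on the rest of the grammar.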

Any ideas what might be happening here, and whether a fix might be possible?

Thanks,
Ron

-- 
Ron Hunter-Duvar | Software Developer V | 403-272-6580
Oracle Service Engineering
Gulf Canada Square 401 - 9th Avenue S.W., Calgary, AB, Canada T2P 3C5

All opinions expressed here are mine, and do not necessarily represent
those of my employer.

