You might try the following:

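// '.*' is non-greedy in ANTLR 3, so the comment ends at the first '\n'.
COMMENT :   '!' .* '\n' {skip();}
        ;

// '+' matches a whole run of whitespace; skip() discards it without
// creating a token.
WS      :   ( ' '
        |     '\t'
        |     '\r'
        |     '\n' )+ {skip();}
        ;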

1. Use skip() instead of $channel=HIDDEN to prevent the token from ever
   being created. Setting the channel still creates the token; it just
   hides it from the parser.

2. Use a + (one or more) in the WS rule to match runs of whitespace instead
   of individual characters.

3. Your COMMENT rule already requires a '\n' to end a comment, so it never
   handled old-style Mac line endings (a carriage return '\r' by itself).
   You can therefore simplify the rule with a wildcard that runs to the
   next '\n'.
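
To get a feel for why points 1 and 2 matter for heap use, here is a rough
sketch in plain Java. It does not use the ANTLR runtime: the class
WhitespaceTokenCount, its Mode enum, and the sample line are all made up for
illustration, and it simply treats every whitespace-separated chunk as one
real token. It only counts how many token objects a lexer would create under
each whitespace strategy.

```java
// Hypothetical model, not ANTLR code: counts the tokens a lexer would create.
final class WhitespaceTokenCount {

    // The three whitespace strategies discussed above.
    enum Mode { HIDDEN_PER_CHAR, HIDDEN_PER_RUN, SKIP }

    static boolean isWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    // Assumption: each maximal run of non-whitespace characters is one token.
    static int countTokens(String input, Mode mode) {
        int count = 0;
        int i = 0;
        while (i < input.length()) {
            if (isWhitespace(input.charAt(i))) {
                int start = i;
                while (i < input.length() && isWhitespace(input.charAt(i))) {
                    i++;
                }
                switch (mode) {
                    case HIDDEN_PER_CHAR:
                        count += i - start; // one hidden token per character
                        break;
                    case HIDDEN_PER_RUN:
                        count += 1;         // one hidden token per run
                        break;
                    case SKIP:
                        break;              // no token is created at all
                }
            } else {
                while (i < input.length() && !isWhitespace(input.charAt(i))) {
                    i++;
                }
                count++;                    // one real token
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String line = "\"a\"   x   42 ;\n"; // made-up sample input
        for (Mode mode : Mode.values()) {
            System.out.println(mode + ": " + countTokens(line, mode));
        }
    }
}
```

For the sample line the three strategies give 12, 8 and 4 tokens. Since
CommonTokenStream buffers every token the lexer emits, that ratio is roughly
what you would expect to see in memory use on a whitespace-heavy file.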

 

From: [email protected] [mailto:[email protected]] On Behalf Of Mahesh R. Seshan
Sent: Tuesday, July 13, 2010 4:34 PM
To: [email protected]
Subject: [antlr-dev] Java - Out of heap space when parsing huge file

Greetings,

I am trying to use an ANTLR parser to parse a huge file, but it runs into a
Java error indicating that it is out of heap space. The grammar (below) is
relatively simple. After going over some posts, I do not believe that
UnbufferedTokenStream is an option, because whitespace is to be ignored in
the input file. Also, UnbufferedTokenStream is not available in ANTLR v3.2,
which is what I am using. Any advice is greatly appreciated.

file    :    line+
        ;
line    :    STRING ID data ';'
        ;
data    :    primitive | sequence
        ;    
primitive 
        :    INTEGER
        ;    
sequence 
        :    '{' data? (',' data?)* '}'
        ;
    
INTEGER :    ('0'..'9')+
        ;
                
STRING  :    '"' ~('"')* '"'
        ;
            
ID      :       ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9'|'-')*
        ;

COMMENT :    '!' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
        ;

WS      :   ( ' '
        |       '\t'
        |       '\r'
        |       '\n' ) {$channel=HIDDEN;}
        ;

Following is the Java error:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at org.antlr.runtime.Lexer.emit(Lexer.java:151)
        at org.antlr.runtime.Lexer.nextToken(Lexer.java:86)
        at org.antlr.runtime.CommonTokenStream.fillBuffer(CommonTokenStream.java:119)
        at org.antlr.runtime.CommonTokenStream.LT(CommonTokenStream.java:238)
        at org.antlr.runtime.CommonTokenStream.LA(CommonTokenStream.java:300)
        at parser.FileImportParser.file(FileImportParser.java:56)
        at test.FileImport.main(FileImport.java:42)

-mahesh

_______________________________________________
antlr-dev mailing list
[email protected]
http://www.antlr.org/mailman/listinfo/antlr-dev
