Re: [antlr-dev] Java - Out of heap space when parsing huge file

Terence Parr Wed, 14 Jul 2010 12:59:18 -0700

Note that the char streams still buffer; only the token buffers toss stuff out.
Ter
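
To illustrate the point about char streams: the stack trace from ANTLRReaderStream.load in the quoted message below suggests the whole input is read into memory before lexing starts, so heap use grows with file size no matter which token stream is used. A minimal sketch of the alternative idea, a character source that keeps only a small fixed lookahead window, might look like the following. SlidingCharWindow is a hypothetical, stdlib-only class and not part of the ANTLR runtime. The LA and consume names are borrowed from ANTLR for familiarity only; a real replacement would also have to provide the mark/rewind and substring operations that ANTLR's CharStream interface requires.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical helper (not an ANTLR class): bounded-lookahead char source. */
class SlidingCharWindow {
    private final Reader in;
    private final char[] window; // fixed-size ring buffer holding the lookahead
    private int head = 0;        // ring index of the next char to consume
    private int size = 0;        // number of chars currently buffered
    private boolean eof = false;

    SlidingCharWindow(Reader in, int capacity) {
        this.in = in;
        this.window = new char[capacity];
    }

    /** Returns the i-th char of lookahead (1-based), or -1 at end of input. */
    int LA(int i) {
        if (i < 1 || i > window.length) {
            throw new IllegalArgumentException("lookahead out of range: " + i);
        }
        try {
            // Pull in only as many chars as this lookahead request needs.
            while (size < i && !eof) {
                int c = in.read();
                if (c < 0) {
                    eof = true;
                } else {
                    window[(head + size) % window.length] = (char) c;
                    size++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return i <= size ? window[(head + i - 1) % window.length] : -1;
    }

    /** Discards the current char so its slot can be reused. */
    void consume() {
        if (LA(1) < 0) {
            return; // nothing left to consume
        }
        head = (head + 1) % window.length;
        size--;
    }
}
```

With a window of a few characters, memory use stays constant regardless of input length. When the file only needs to fit in memory once, raising the JVM heap limit with -Xmx is the simpler workaround.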
On Jul 14, 2010, at 11:11 AM, Mahesh R. Seshan wrote:


> Sam,
> 
> Thank you very much for taking the time to respond.
> 
> I made the changes you suggested (very valuable and helpful) but still ran 
> into the Java error "Out of heap space". I then tried 
> UnbufferedTokenStream, and that approach worked for a file with 300,000 lines. 
> However, when I attempt to parse a file with 1 million lines, I get the 
> out-of-heap-space error from ANTLRReaderStream.
> 
> Any suggestions?
> 
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>         at org.antlr.runtime.ANTLRReaderStream.load(ANTLRReaderStream.java:78)
>         at org.antlr.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:68)
>         at org.antlr.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:52)
>         at org.antlr.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:48)
>         at org.antlr.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:40)
>         at test.FileImport.main(FileImport.java:23)
> 
> -mahesh
> 
> On 7/13/2010 6:16 PM, Sam Harwell wrote:
>> You might try the following:
>>  
>> COMMENT :    '!' .* '\n' {skip();}
>>         ;
>> 
>> WS      :   ( ' '
>>         |       '\t'
>>         |       '\r'
>>         |       '\n' )+ {skip();}
>>         ;
>>  
>> 1. Use skip() instead of $channel=HIDDEN to prevent the token from ever 
>> being created. Setting the channel still creates the token; it just hides 
>> it from the parser.
>> 2. Use a + (1 or more) in the WS rule to match whitespace runs instead of 
>> individual characters.
>> 3. Since your code doesn’t handle old-style Mac line endings (carriage 
>> return '\r' by itself), simplify the COMMENT rule using a wildcard.
>>  
>> From: [email protected] [mailto:[email protected]] On 
>> Behalf Of Mahesh R. Seshan
>> Sent: Tuesday, July 13, 2010 4:34 PM
>> To: [email protected]
>> Subject: [antlr-dev] Java - Out of heap space when parsing huge file
>>  
>> Greetings,
>> 
>> I am trying to use an ANTLR parser to parse a huge file but run into a Java 
>> error indicating out of heap space. The grammar (as follows) is 
>> relatively simple. After going over some posts, I do not believe that 
>> UnbufferedTokenStream is an option because white-space is to be ignored in 
>> the input file. Also, UnbufferedTokenStream is not available in ANTLR v3.2, 
>> which is what I am using. Any advice is greatly appreciated.
>> 
>> file    :    line+
>>         ;
>> line    :    STRING ID data ';'
>>         ;
>> data    :    primitive | sequence
>>         ;    
>> primitive 
>>         :    INTEGER
>>         ;    
>> sequence 
>>         :    '{' data? (',' data?)* '}'
>>         ;
>>     
>> INTEGER :    ('0'..'9')+
>>         ;
>>                 
>> STRING  :    '"' ~('"')* '"'
>>         ;
>>             
>> ID      :       ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9'|'-')*
>>         ;
>> 
>> COMMENT :    '!' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
>>         ;
>> 
>> WS      :   ( ' '
>>         |       '\t'
>>         |       '\r'
>>         |       '\n' ) {$channel=HIDDEN;}
>>         ;
>> 
>> Following is the Java error:
>> 
>> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>>         at org.antlr.runtime.Lexer.emit(Lexer.java:151)
>>         at org.antlr.runtime.Lexer.nextToken(Lexer.java:86)
>>         at org.antlr.runtime.CommonTokenStream.fillBuffer(CommonTokenStream.java:119)
>>         at org.antlr.runtime.CommonTokenStream.LT(CommonTokenStream.java:238)
>>         at org.antlr.runtime.CommonTokenStream.LA(CommonTokenStream.java:300)
>>         at parser.FileImportParser.file(FileImportParser.java:56)
>>         at test.FileImport.main(FileImport.java:42)
>> 
>> -mahesh
> _______________________________________________
> antlr-dev mailing list
> [email protected]
> http://www.antlr.org/mailman/listinfo/antlr-dev
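
The first two suggestions in Sam's quoted reply (skip whitespace instead of hiding it, and match whitespace in runs) both come down to how many token objects get allocated. The sketch below is a rough illustration of that effect only. WhitespacePolicyDemo and its Tok class are hypothetical and do not use the ANTLR runtime, and the "lexer" simply treats any run of non-whitespace characters as one token, which is a simplification of the grammar in this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo (not ANTLR code): counts tokens under two whitespace policies. */
class WhitespacePolicyDemo {

    /** Minimal stand-in for a token: its text and whether the parser would see it. */
    static final class Tok {
        final String text;
        final boolean hidden;

        Tok(String text, boolean hidden) {
            this.text = text;
            this.hidden = hidden;
        }
    }

    /**
     * skipWhitespace == false: each whitespace char becomes its own hidden token,
     * like the original single-character WS rule with $channel=HIDDEN.
     * skipWhitespace == true: whitespace runs are consumed and no token is created,
     * like the suggested ( ... )+ rule with skip().
     */
    static List<Tok> lex(String input, boolean skipWhitespace) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            int start = i;
            if (Character.isWhitespace(input.charAt(i))) {
                if (skipWhitespace) {
                    // Consume the whole run; nothing is allocated.
                    while (i < input.length() && Character.isWhitespace(input.charAt(i))) {
                        i++;
                    }
                } else {
                    // One hidden token per whitespace character.
                    i++;
                    out.add(new Tok(input.substring(start, i), true));
                }
            } else {
                // Simplification: any non-whitespace run is one visible token.
                while (i < input.length() && !Character.isWhitespace(input.charAt(i))) {
                    i++;
                }
                out.add(new Tok(input.substring(start, i), false));
            }
        }
        return out;
    }
}
```

For an input that is mostly indentation and line breaks, the hidden-channel policy can allocate several times as many tokens as the skip policy, and a fully buffered token stream keeps all of them alive.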
