Hi Jim,

You're correct, I'm new to ANTLR. Below is my CSV grammar. This is what I'm running:

    CharStream lex = new ANTLRFileStream("Dealsall3.csv");
    DealsAll3Lexer csv3Lexer = new DealsAll3Lexer(lex);
    csv3Lexer.setBacktrackingLevel(0);
    CommonTokenStream tokens = new CommonTokenStream(csv3Lexer);
    tokens.discardOffChannelTokens(true);
    DealsAll3Parser csv3Parser = new DealsAll3Parser(tokens);
    csv3Parser.file();
    System.out.println(csv3Parser.getNumberOfSyntaxErrors());

You're right, I could fix the above by not using ANTLRFileStream and just using an ANTLRStringStream, chunking the file myself outside of ANTLR. But my general issue is that not all my data is a simple CSV file; some will be multi-line records. Hence I didn't want to keep a record of the tokens. Any ideas?

By the way, thanks for your reply.

Cheers
Kumaap0

    grammar DealsAll3 ;

    file : header ( detail )* EOF ;

    SEP : WS? ( ',') WS? ;

    header : 'IdentID,FGamma Tot,FutDeltaTot,FutGamma Tot,Barrier2,BarrierLevel,Cmp_CP,Cmp_Delivery,Cmp_Expiry,Cmp_Strike' NL ;

    detail
        : f_IdentID
          SEP ( f_FGamma_Tot )?
          SEP ( f_FutDeltaTot )?
          SEP ( f_FutGamma_Tot )?
          SEP ( f_Barrier2 )?
          SEP ( f_Barrier_Level )?
          SEP ( f_Cmp_CP )?
          SEP ( f_Cmp_Delivery )?
          SEP ( f_Cmp_Expiry )?
          SEP ( f_Cmp_Strike )?
          NL
        ;

    f_IdentID       : NUMBER ;
    f_FGamma_Tot    : NUMBER ;
    f_FutDeltaTot   : NUMBER ;
    f_FutGamma_Tot  : NUMBER ;
    f_Barrier2      : STRING ;
    f_Barrier_Level : STRING ;
    f_Cmp_CP        : STRING ;
    f_Cmp_Delivery  : STRING ;
    f_Cmp_Expiry    : STRING ;
    f_Cmp_Strike    : STRING ;

    DATETIME : DATE ( SP | 'T' ) TIME ;

    DATE
        : ( ( ( ( '0' | '1' | '2' ) '0'..'9' ) | ( '3' ( '0' | '1' ) ) )
            ( '-' | '/' )
            ( ( '01' | '02' | '03' | '04' | '05' | '06' | '07' | '08' | '09' | '10' | '11' | '12' )
            | ( 'JAN' | 'FEB' | 'MAR' | 'APR' | 'MAY' | 'JUN' | 'JUL' | 'SEP' | 'OCT' | 'NOV' | 'DEC' )
            | ( 'Jan' | 'Feb' | 'Mar' | 'Apr' | 'May' | 'Jun' | 'Jul' | 'Sep' | 'Oct' | 'Nov' | 'Dec' )
            )
            ( '-' | '/' )
            ( ( '0'..'9' '0'..'9' )? '0'..'9' '0'..'9' )
          )
        | ( ( '0'..'9' '0'..'9' '0'..'9' '0'..'9' )
            ( '-' | '/' )
            ( '01' | '02' | '03' | '04' | '05' | '06' | '07' | '08' | '09' | '10' | '11' | '12' )
            ( '-' | '/' )
            ( ( ( '0' | '1' | '2' ) '0'..'9' ) | ( '3' ( '0' | '1' ) ) )
          )
        ;

    MONTH_YEAR
        : ( ( '01' | '02' | '03' | '04' | '05' | '06' | '07' | '08' | '09' | '10' | '11' | '12' )
          | ( 'JAN' | 'FEB' | 'MAR' | 'APR' | 'MAY' | 'JUN' | 'JUL' | 'SEP' | 'OCT' | 'NOV' | 'DEC' )
          | ( 'Jan' | 'Feb' | 'Mar' | 'Apr' | 'May' | 'Jun' | 'Jul' | 'Sep' | 'Oct' | 'Nov' | 'Dec' )
          )
          '-'
          ( ( '0'..'9' '0'..'9' )? '0'..'9' '0'..'9' )
        ;

    MONTH_DAY
        : ( ( ( '0' | '1' | '2' ) '0'..'9' ) | ( '3' ( '0' | '1' ) ) )
          '-'
          ( ( '01' | '02' | '03' | '04' | '05' | '06' | '07' | '08' | '09' | '10' | '11' | '12' )
          | ( 'JAN' | 'FEB' | 'MAR' | 'APR' | 'MAY' | 'JUN' | 'JUL' | 'SEP' | 'OCT' | 'NOV' | 'DEC' )
          | ( 'Jan' | 'Feb' | 'Mar' | 'Apr' | 'May' | 'Jun' | 'Jul' | 'Sep' | 'Oct' | 'Nov' | 'Dec' )
          )
        ;

    TIME
        : ( ( '0'..'1' '0'..'9' ) | ('2' '0'..'4' ) )  // '00' to '24'
          ':' ( '0'..'5' '0'..'9' )                    // '00' to '59'
          ':' ( '0'..'5' '0'..'9' )                    // '00' to '59'
          ( ( 'Z'                                      // UTC
            | ( '+' | '-' ) '00' ( (':' | ' ' ) '00' )?
            ) ?
          )
        ;

    NUMBER
        : ( '+' | '-' )?                    // It may be signed
          ( ( '0'..'9' )+ '.' ( '0'..'9' )* // Decimal point with leading and trailing digits
          | '.' ( '0'..'9' )+               // or it may be just a mantissa
          | '0'..'9'+                       // or it may be an integer
          )
        ;

    STRING
        : ('"') VALID_CHAR+ ('"')  // Must have quotes at both ends
        | VALID_CHAR+              // or no quote at either end
        ;

    fragment VALID_CHAR
        : ( 'a'..'z' | 'A'..'Z' | '0'..'9' // the alphanumeric characters
          | ' '  // x20 = SPACE
          | '!'  // x21 = EXCLAMATION MARK
          | '#'  // x23 = NUMBER SIGN
          | '$'  // x24 = DOLLAR SIGN
          | '%'  // x25 = PERCENT SIGN
          | '&'  // x26 = AMPERSAND
          | '('  // x28 = LEFT PARENTHESIS
          | ')'  // x29 = RIGHT PARENTHESIS
          | '*'  // x2a = ASTERISK
          | '+'  // x2b = PLUS SIGN
          // SEP char ',' // x2c = COMMA
          | '-'  // x2d = HYPHEN-MINUS
          | '.'  // x2e = FULL STOP
          | '/'  // x2f = SOLIDUS
          | ':'  // x3a = COLON
          | ';'  // x3b = SEMICOLON
          | '<'  // x3c = LESS-THAN SIGN
          | '='  // x3d = EQUALS SIGN
          | '>'  // x3e = GREATER-THAN SIGN
          | '?'  // x3f = QUESTION MARK
          | '@'  // x40 = COMMERCIAL AT
          | '['  // x5b = LEFT SQUARE BRACKET
          | ']'  // x5d = RIGHT SQUARE BRACKET
          | '^'  // x5e = CIRCUMFLEX ACCENT
          | '_'  // x5f = LOW LINE
          | '`'  // x60 = GRAVE ACCENT
          | '{'  // x7b = LEFT CURLY BRACKET
          | '|'  // x7c = VERTICAL LINE
          | '}'  // x7d = RIGHT CURLY BRACKET
          | '~'  // x7e = TILDE
          )
        ;

-----Original Message-----
From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-boun...@antlr.org] On Behalf Of Jim Idle
Sent: 01 April 2010 14:58
To: antlr-interest@antlr.org
Subject: Re: [antlr-interest] Parsing Large Files

The other possibility is of course that you are trying to parse a massive file in one lump. You probably just want to reinvoke the parser for each deal record (break it up in the string stream).

Jim

> -----Original Message-----
> From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-boun...@antlr.org] On Behalf Of Kumar, Amitesh
> Sent: Thursday, April 01, 2010 2:13 AM
> To: antlr-interest@antlr.org
> Subject: [antlr-interest] Parsing Large Files
>
> Hi guys, what we are looking for is just parsing the file and recording
> the errors; we don't need to keep track of any tokens or an AST.
> I'm getting:
>
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>     at java.util.Arrays.copyOf(Arrays.java:2760)
>     at java.util.Arrays.copyOf(Arrays.java:2734)
>     at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
>     at java.util.ArrayList.add(ArrayList.java:351)
>     at org.antlr.runtime.CommonTokenStream.fillBuffer(CommonTokenStream.java:116)
>     at org.antlr.runtime.CommonTokenStream.LT(CommonTokenStream.java:238)
>     at org.antlr.runtime.Parser.getCurrentInputSymbol(Parser.java:54)
>     at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:104)
>     at DealsAll2Parser.header(DealsAll2Parser.java:123)
>     at DealsAll2Parser.file(DealsAll2Parser.java:67)
>     at AntlrMain.main(AntlrMain.java:53)
>
> I see where the error is coming from: the CommonTokenStream is keeping
> track of all past tokens. How can I make it so it doesn't? Do I have to
> create my own token stream, or is there an easy way?
>
> Cheers
> Kumaap0

List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
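
Jim's suggestion in the thread above (break the file up and reinvoke the parser per deal record) could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested ANTLR setup: the class name RecordChunker and the method forEachRecord are hypothetical, and the sketch assumes one record per line, which does not cover the multi-line records mentioned in the reply. The ANTLR calls appear only in comments, so the block runs on the JDK alone.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Hypothetical helper: streams a CSV source one record at a time so that
// no single token stream ever has to buffer the whole file.
class RecordChunker {

    /**
     * Skips the first (header) line, then passes every following non-empty
     * line to the handler. Returns the number of records handed over.
     * Assumption: one record per line.
     */
    static int forEachRecord(Reader source, Consumer<String> handler) {
        int count = 0;
        try (BufferedReader in = new BufferedReader(source)) {
            String line = in.readLine(); // header line, not passed to the handler
            while ((line = in.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // ignore blank lines
                }
                // Put the newline back so the record still ends with a line terminator.
                handler.accept(line + "\n");
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "IdentID,FGamma Tot\n1,2.5\n\n2,\n";
        int n = forEachRecord(new StringReader(sample), record -> {
            // In a real run, one might build a fresh lexer, CommonTokenStream and
            // parser here from new ANTLRStringStream(record) and call the
            // record-level rule (detail in the grammar above), then add up
            // getNumberOfSyntaxErrors(). That wiring is an assumption, not shown.
            System.out.print("record: " + record);
        });
        System.out.println(n + " records");
    }
}
```

Because each record gets its own short-lived token stream, the tokens of earlier records become garbage-collectable as soon as the handler returns. For a real file, a java.io.FileReader would take the place of the StringReader used in main.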