[il-antlr-interest: 32365] [antlr-interest] Source code level of ANTLRWorks
Hi

Is there a way to set the source code level of ANTLRWorks? I've checked the preferences pane and launchers to no avail. I need to set it to 1.5, since I'm using Java generics:

[15:47:58] 102. ERROR in /home/bcorne/Downloads/at2-parser-3/grammar/output/ATGrammar3Lexer.java (at line 114)
[15:47:58] Stack<String> paraphrase = new Stack<String>();
[15:47:58] ^^
[15:47:58] Syntax error, parameterized types are only available if source level is 1.5

Regards
Ben Corne

List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
--
You received this message because you are subscribed to the Google Groups il-antlr-interest group. To post to this group, send email to il-antlr-inter...@googlegroups.com. To unsubscribe from this group, send email to il-antlr-interest+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/il-antlr-interest?hl=en.
[il-antlr-interest: 32366] Re: [antlr-interest] Inserting missing nodes
No one can help me with this? :S

Let me know if something is not clear. I need to fix this issue as soon as I can.

Thanks

-----Original Message-----
From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-boun...@antlr.org] On Behalf Of Jean-Sebastien Vachon
Sent: April-28-11 4:07 PM
To: antlr-interest@antlr.org
Subject: [antlr-interest] Inserting missing nodes

Hi All,

First, I'd like to make it clear that I'm new to ANTLR, so please be kind to me ;)

Second, my main problem right now is that I'm building a grammar that will let me validate and parse a Boolean query with some special features. I got 90% of my parser working, but I'm stuck on the last required feature.

Basically, I need to be able to insert missing operators (AND/OR) where required. Considering the following query:

software engineer java

I need to build a tree representing the query as if it was

software AND engineer AND java

but I also need to be able to change the inserted operator 'AND' to something else.

My first thought was to push a new type of node (let's say DEFAULT_OP) into my tree using a rewrite rule, and then rewrite that node to the proper operator using a tree walker and/or translator. I made a few tries and got it working in some situations, but I can't get it to parse everything I'm throwing at it.

My best try so far is shown in the listing below. I did not include the lexer as it is pretty straightforward.

All hints and comments are welcome.

Thanks for your help

===

grammar MyGrammar;

options { language = Java; output = AST; ASTLabelType = CommonTree; }

query : and_expr+ EOF! ;

and_expr : (expr expr+) => default_op
         | (u1=or_expr (AND^ u2=or_expr)*)
         ;

or_expr : u1=expr (OR^ u2=expr)* ;

default_op : (e1=or_expr e2=or_expr) -> ^(DEFAULT_OP $e1 $e2) ;

expr : (NOT^)? (operand) ;

operand : (FIELD^)(operand)
        | PREFIX
        | WORD
        | SENTENCE
        | WORDLIST
        | NEGATIVE(w=PREFIX|w=WORD|w=SENTENCE|w=WORDLIST) -> ^(NOT $w)
        | MUST
        | LPAREN! expr RPAREN!
        ;
[il-antlr-interest: 32367] Re: [antlr-interest] Inserting missing nodes
On Wed, May 4, 2011 at 4:12 PM, Jean-Sebastien Vachon jean-sebastien.vac...@wantedtech.com wrote:

> No one can help me with this? :S Let me know if something is not clear.
> I need to fix this issue as soon as I can. Thanks

The fact that you didn't provide the lexer rules (although they might be straightforward, as you mentioned), and you didn't mention what input you're specifically having problems parsing (the following is a bit vague: *... but I can't get it to parse everything I'm throwing at it ...*), might be some reasons why you haven't been answered.

Regards,

Bart.
[il-antlr-interest: 32368] Re: [antlr-interest] Memory requirements of C runtime when backtracking
You should not be using backtrack=true if you are short on memory, but without seeing your grammar I cannot comment on the RAM usage. It might be that your grammar causes the generation of huge DFA tables. Backtracking itself does not cost lots of memory though.

Jim

-----Original Message-----
From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-boun...@antlr.org] On Behalf Of Ivan Brezina
Sent: Wednesday, May 04, 2011 5:20 AM
To: antlr-interest Interest
Subject: [antlr-interest] Memory requirements of C runtime when backtracking

Hi all

While unit testing an Oracle SQL grammar I'm facing problems with memory requirements when parsing input that has many parentheses. In general the grammar has three types of statements which can be enclosed in parens.

1. A value expression like (A), (1+2). For an sql_expression there is a set of rules using numerical operators:
   sql_expression -> expr_add -> expr_mul -> expr_sign -> expr_pow -> expr_paren

2. A logical expression like (A 2) or (A is null). For a condition expression there is a set of rules using logical operators:
   sql_condition -> condition_or -> condition_and -> condition_not -> condition_paren

3. A subquery. It is an SQL statement enclosed in parentheses, like (SELECT * FROM dual).

These 3 types can be combined in many ways. For example:

((SELECT count(1) from dual) 1)

While testing I found queries - probably generated by some evil sick robot - which require more than 8B of RAM to parse even though they are quite short. For example:

SELECT * FROM TABLE1, TABLE2 WHERE ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( TABLE2.DT = '2' ) OR ( TABLE2.DT = '3' ) ) AND ( TABLE2.CODE 9 ) ) AND ( TABLE2.WH = 'XXX' ) ) AND ( TABLE1.ID = '001' ) ) AND ( ( TABLE1.ATTR_1 IS NULL ) OR ( TABLE1._ATTR1 = '*' ) ) ) AND ( ( TABLE1.ATTR_2 IS NULL ) OR ( TABLE1._ATTR2 = '*' ) ) ) AND ( ( TABLE1.ATTR_3 IS NULL ) OR ( TABLE1._ATTR3 = '*' ) ) ) AND ( ( TABLE1.ATTR_4 IS NULL ) OR ( TABLE1._ATTR4 = '*' ) ) ) AND ( ( TABLE1.ATTR_5 IS NULL ) OR ( TABLE1._ATTR5 = '*' ) ) ) AND ( ( TABLE2.TYPE IS NULL ) OR ( TABLE2.TYPE = '*' ) ) ) AND ( ( TABLE2.NBR IS NULL ) OR ( TABLE2.NBR = '*' ) ) ) AND ( ( TABLE2.STAT = '01' ) OR ( TABLE2.STAT = '*' ) ) ) AND ( ( TABLE2.ORGN IS NULL ) OR ( TABLE2.ORGN = '*' ) ) ) AND ( TABLE2.NBR = '' ) ) AND ( TABLE2.PO IS NULL ) )

Both the value expression and condition expression rules do backtracking. In the example above every level of parenthesis nesting doubles the memory requirements.

Is there any way to reduce or monitor the memory requirements? What exactly is remembered when backtracking? I tried to add some syntactic predicates to the value/conditional expression rules, but that usually led to other tests failing.

thx
Ivan
[il-antlr-interest: 32369] Re: [antlr-interest] Source code level of ANTLRWorks
That just means you are calling a compiler that has a default level less than 1.5. So, change the javac command in the preferences, or remove the old compiler and install 1.6, or make sure that 1.6 is the first on the command line. ANTLRWorks does not influence the Java compiler - it only calls the one that you tell it to.

Jim
[il-antlr-interest: 32370] Re: [antlr-interest] Inserting missing nodes
Thanks for your input. So here is the whole thing with two use cases that are not giving me the expected results... (Sorry for the long post)

INPUT = abc def zyx toto
RESULT = (DEFAULT_OP abc def) (DEFAULT_OP zyx toto)
EXPECTED = (DEFAULT_OP (DEFAULT_OP abc def) (DEFAULT_OP zyx toto))

INPUT = software engineer OR java programmer
RESULT = (DEFAULT_OP software (OR engineer java)) programmer
EXPECTED = (DEFAULT_OP (DEFAULT_OP software (OR engineer java)) programmer)

I'm also having some trouble using the Interpreter within Eclipse. The same expressions are not working in the interpreter. It fails to generate the tree with a NoViableAltException at input 'abc' (for the first case). I don't think this is related to my other problem, since I can't get it to generate any tree.

Thanks again for your time and comments

--- Grammar (validation by building a tree and trying to insert missing operators) ---

grammar MyGrammar;

options { language = Java; output = AST; ASTLabelType = CommonTree; }

// Rules to build the tree representation of our expression...
query : and_expr+ EOF! ;

// Each AND expression can contain OR expressions...
and_expr : (expr expr+) => default_op
         | (u1=or_expr (AND^ u2=or_expr)*)
         ;

// An OR expression contains one or more expressions
or_expr : u1=expr (OR^ u2=expr)* ;

default_op : (e1=or_expr e2=or_expr) -> ^(DEFAULT_OP $e1 $e2) ;

expr : (NOT^)? (operand) ;

// The leaves of the tree: words, sentences and so on...
// Note that an expression such as '-word' is rewritten in its 'NOT word' form
operand : (f=FIELD^)(o=operand)
        | PREFIX
        | WORD
        | SENTENCE
        | WORDLIST
        | NEGATIVE(w=PREFIX|w=WORD|w=SENTENCE|w=WORDLIST) -> ^(NOT $w)
        | MUST
        | LPAREN! and_expr RPAREN!
        ;

// Lexer ...
NEGATIVE : '-';
LPAREN : '(' ;
RPAREN : ')' ;
DOUBLEQUOTE : '"';
STAR : '*';
AND : 'AND' | '+';
OR : 'OR';
NOT : 'NOT';
DEFAULT_OP : 'DEF_OP';
FIELD : ('title'|'TITLE'|'Title')(FIELDSEPARATOR);
WS : (WSCHAR)+ { $channel=HIDDEN; };
PREFIX : WORDCHAR+(STAR);
WORD : WORDCHAR+(('-'|'+')WORDCHAR*)*;
SENTENCE : ((DOUBLEQUOTE)(~(DOUBLEQUOTE))*(DOUBLEQUOTE));
WORDLIST : ((PREFIX | WORD | SENTENCE)(','(WS)* (PREFIX | WORD | SENTENCE))+);
MUST : '+'(PREFIX|WORD|SENTENCE|WORDLIST);
fragment WORDCHAR : (~( WSCHAR | LPAREN | RPAREN | '-' |':' | '+' | ',' | STAR | DOUBLEQUOTE) );
fragment FIELDSEPARATOR : ':';
fragment WSCHAR : ( ' ' | '\t' | '\r' | '\n');

=== END OF GRAMMAR ===
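One way to see the tree shape being asked for is a small hand-written recursive-descent parser in plain Java. This is only a sketch under stated assumptions, not ANTLR-generated code and not the poster's grammar: the class name `ImplicitOpParser` is hypothetical, tokens are split on whitespace and parentheses, and operands are reduced to bare words (no FIELD, PREFIX, SENTENCE, WORDLIST or MUST). The idea it illustrates is to treat "another operand follows with no operator in between" as one more branch of the AND loop and fold it left-associatively into a DEFAULT_OP node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hand-written sketch (not ANTLR output): inserts DEFAULT_OP between adjacent operands. */
final class ImplicitOpParser {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private ImplicitOpParser(String input) {
        // Simplified lexer: parentheses are single tokens, everything else splits on whitespace.
        Matcher m = Pattern.compile("[()]|[^\\s()]+").matcher(input);
        while (m.find()) tokens.add(m.group());
    }

    /** Parses a query and returns its tree in LISP-like notation. */
    static String parse(String input) {
        ImplicitOpParser p = new ImplicitOpParser(input);
        String tree = p.andExpr();
        if (p.pos != p.tokens.size()) {
            throw new IllegalArgumentException("unexpected token: " + p.peek());
        }
        return tree;
    }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : null; }

    private static boolean isReserved(String t) {
        return t.equals(")") || t.equals("AND") || t.equals("OR");
    }

    // andExpr : orExpr ( AND orExpr | orExpr )*   (second branch = missing operator)
    private String andExpr() {
        String left = orExpr();
        while (true) {
            String t = peek();
            if ("AND".equals(t)) {
                pos++;
                left = "(AND " + left + " " + orExpr() + ")";
            } else if (t != null && !isReserved(t)) {
                // Another operand starts here without an operator: insert DEFAULT_OP.
                left = "(DEFAULT_OP " + left + " " + orExpr() + ")";
            } else {
                return left;
            }
        }
    }

    // orExpr : unary ( OR unary )*
    private String orExpr() {
        String left = unary();
        while ("OR".equals(peek())) {
            pos++;
            left = "(OR " + left + " " + unary() + ")";
        }
        return left;
    }

    // unary : NOT unary | operand
    private String unary() {
        if ("NOT".equals(peek())) {
            pos++;
            return "(NOT " + unary() + ")";
        }
        return operand();
    }

    // operand : WORD | '(' andExpr ')'
    private String operand() {
        String t = peek();
        if (t == null || isReserved(t)) {
            throw new IllegalArgumentException("operand expected, found: " + t);
        }
        pos++;
        if (t.equals("(")) {
            String inner = andExpr();
            if (!")".equals(peek())) throw new IllegalArgumentException("missing ')'");
            pos++;
            return inner;
        }
        return t;
    }

    public static void main(String[] args) {
        System.out.println(parse("software engineer OR java programmer"));
        System.out.println(parse("abc def zyx toto"));
    }
}
```

For the second use case this yields exactly the EXPECTED tree, `(DEFAULT_OP (DEFAULT_OP software (OR engineer java)) programmer)`. For `abc def zyx toto` the left fold gives `(DEFAULT_OP (DEFAULT_OP (DEFAULT_OP abc def) zyx) toto)`, which is a single tree but nested to the left rather than grouped pairwise as in the post; whether that difference matters depends on how the tree walker later replaces DEFAULT_OP.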
[il-antlr-interest: 32371] Re: [antlr-interest] Inserting missing nodes
You need to fix your lexer first:

WORDLIST: ((PREFIX | WORD | SENTENCE)(','(WS)* (PREFIX | WORD | SENTENCE))+);

is ambiguous with:

PREFIX: WORDCHAR+(STAR);
WORD: WORDCHAR+(('-'|'+')WORDCHAR*)*;

You need to construct the lists in the parser, not the lexer, and should probably left factor the common roots in the lexer anyway.

Jim
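The two suggestions in this reply can be illustrated outside ANTLR with a small plain-Java sketch. Everything here is an assumption made for illustration: the class name `WordListSketch`, the string-encoded tokens, and the reduced character set (words, `*` and `,` only; quotes, `-`, `+`, `:` and parentheses are not handled). The lexer scans the shared WORDCHAR+ root once and only then decides between WORD and PREFIX (the left-factoring), and the comma-separated list is assembled from tokens in a separate parser step instead of being a lexer token.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written sketch: left-factored WORD/PREFIX lexing, with word lists built by the parser. */
final class WordListSketch {

    /** Lexer step: scan the shared WORDCHAR+ root once, then decide WORD vs PREFIX. */
    static List<String> lex(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }      // skipped, like a hidden channel
            if (c == ',') { tokens.add("COMMA"); i++; continue; }
            int start = i;
            while (i < input.length() && isWordChar(input.charAt(i))) i++;
            if (i == start) throw new IllegalArgumentException("unexpected character: " + c);
            String text = input.substring(start, i);
            if (i < input.length() && input.charAt(i) == '*') {
                i++;                                               // trailing star makes it a PREFIX
                tokens.add("PREFIX:" + text + "*");
            } else {
                tokens.add("WORD:" + text);
            }
        }
        return tokens;
    }

    private static boolean isWordChar(char c) {
        return !Character.isWhitespace(c) && "()-:+,*\"".indexOf(c) < 0;
    }

    /** Parser step: item (COMMA item)+ becomes one WORDLIST node; a lone item stays as it is. */
    static List<String> parse(List<String> tokens) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            if (tokens.get(i).equals("COMMA")) {
                throw new IllegalArgumentException("item expected before ','");
            }
            List<String> items = new ArrayList<>();
            items.add(text(tokens.get(i++)));
            while (i < tokens.size() && tokens.get(i).equals("COMMA")) {
                i++;
                if (i >= tokens.size() || tokens.get(i).equals("COMMA")) {
                    throw new IllegalArgumentException("item expected after ','");
                }
                items.add(text(tokens.get(i++)));
            }
            out.add(items.size() == 1 ? items.get(0) : "(WORDLIST " + String.join(" ", items) + ")");
        }
        return out;
    }

    /** Strips the token-type tag, e.g. "WORD:java" becomes "java". */
    private static String text(String token) {
        return token.substring(token.indexOf(':') + 1);
    }

    public static void main(String[] args) {
        System.out.println(parse(lex("java, python, sql* engineer")));
    }
}
```

Because the lexer never has to choose between "a word" and "a list that starts with a word", each token is decided after a single scan, and the grouping decision moves to the parser step where the following tokens are available.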
[il-antlr-interest: 32372] [antlr-interest] Function Expressions
Hello all,

I have a simple C/JavaScript-style grammar for my interpreter project. Right now, functions can be called via identifier(), or identifier(param1, param2). This works fine for simple cases, but in my language functions are first-class objects. I'm in the process of redoing my identifier logic for properties, arrays, and function calls. I've gotten the first two working.

I'm trying to allow expressions to be callable as functions, so I can do stuff like createFunction()(), where createFunction would be a function that returns a function. Another example would be (1 + 1)(). Obviously that should throw an error, but it should be a permissible language construct. JavaScript allows this.

The func.g file in my gist is a pared down version of the language grammar, with only the relevant rules in it. Understandably, it fails with the following errors:

[java] error(210): The following sets of rules are mutually left-recursive [boolNegation, unary, add, mult, relation, term, expression]
[java] error(206): /home/user/955488/func.g:66:2: Alternative 1: after matching input such as IDENT '(' decision cannot predict what comes next due to recursion overflow to relation from expression
[java] error(201): /home/user/955488/func.g:66:2: The following alternatives can never be matched: 2

https://gist.github.com/955488 demonstrates the issue. I've stripped out everything except the expression rules. The gist can be cloned as a git repo and then built via Ant + Ivy.

I understand why it's failing. There's a conflict between the IDENT expression and IDENT '(' ')' for function calls. What I'm trying to figure out is how to allow both identifiers and function calls. If I figure that out, it should give me the rest of what I need.

Any help would be appreciated.
[il-antlr-interest: 32373] Re: [antlr-interest] Inserting missing nodes
OK, I've changed my lexer and parser as you suggested but it didn't help. However, I found why the interpreter is not able to generate the tree in Eclipse. I found the cause but not the explanation... It has something to do with the definition of the and_expr rule:

and_expr : (u1=or_expr (AND^ u2=or_expr)*) {System.out.println("*and_expr: " + $u1.text + ", " + $u2.text);}
         | (expr expr+) => default_op
         ;

If I remove the second alternative, the interpreter is able to create the tree for my expression, but I lose the operators that were inserted by the second alternative.

I don't understand why it is complaining about a viable alternative not being found for a simple input such as 'abc AND def'. It should match the first alternative, since both 'abc' and 'def' match the or_expr rule (through the expr rule).

[I tried changing the order of the two alternatives but it didn't help]

Any idea?
[il-antlr-interest: 32374] Re: [antlr-interest] Function Expressions
Generally when parsing you do this:

expr ...

atom : i=ID ( LPAREN e=exprList RPAREN -> ^(FUNC $i $e)
            |                          -> ^(IDENT $i)
            )
...

Jim
[il-antlr-interest: 32375] Re: [antlr-interest] Inserting missing nodes
Don't use the interpreter, use the debugger.

Jim
[il-antlr-interest: 32376] Re: [antlr-interest] Function Expressions
Right. That's basically what I'm doing right now. The problem is that I can't do things like call anonymous functions (where the function being called is produced by an expression). So I'm trying to figure out how to enable that without getting the recursion errors. JavaScript allows you to write (function() { ... })(); in order to call the function. You can even do it with (1 + 1)(), which of course returns an error. But the point is that it's possible. That's what I'm trying to enable.

On Wed, May 4, 2011 at 1:05 PM, Jim Idle <j...@temporal-wave.com> wrote:
> Generally when parsing you do this:
>
> expr ...
> atom : i=ID ( LPAREN e=exprList RPAREN -> ^(FUNC $i $e) | -> ^(IDENT $i) ) ...
>
> Jim
>
> -----Original Message-----
> From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-boun...@antlr.org] On Behalf Of Jeff Hair
> Sent: Wednesday, May 04, 2011 9:27 AM
> To: antlr-interest@antlr.org
> Subject: [antlr-interest] Function Expressions
>
> Hello all,
>
> I have a simple C/JavaScript-style grammar for my interpreter project. Right now, functions can be called via identifier(), or identifier(param1, param2). This works fine for simple cases, but in my language functions are first-class objects. I'm in the process of redoing my identifier logic for properties, arrays, and function calls. I've gotten the first two working.
>
> I'm trying to allow expressions to be callable as functions, so I can do stuff like createFunction()(), where createFunction would be a function that returns a function. Another example would be (1 + 1)(). Obviously that should throw an error, but it should be a permissible language construct. JavaScript allows this.
>
> The func.g file in my gist is a pared down version of the language grammar, with only the relevant rules in it. Understandably, it fails with the following errors:
>
> [java] error(210): The following sets of rules are mutually left-recursive [boolNegation, unary, add, mult, relation, term, expression]
> [java] error(206): /home/user/955488/func.g:66:2: Alternative 1: after matching input such as IDENT '(' decision cannot predict what comes next due to recursion overflow to relation from expression
> [java] error(201): /home/user/955488/func.g:66:2: The following alternatives can never be matched: 2
>
> https://gist.github.com/955488 demonstrates the issue. I've stripped out everything except the expression rules. The gist can be cloned as a git repo and then built via Ant + Ivy.
>
> I understand why it's failing. There's a conflict between the IDENT expression and IDENT '(' ')' for function calls. What I'm trying to figure out is how to allow both identifiers and function calls. If I figure that out, it should give me the rest of what I need.
>
> Any help would be appreciated.
[il-antlr-interest: 32377] Re: [antlr-interest] Function Expressions
Greetings!

On Wed, 2011-05-04 at 12:27 -0400, Jeff Hair wrote:
> I'm trying to allow expressions to be callable as functions, so I can do stuff like createFunction()(), where createFunction would be a function that returns a function. Another example would be (1 + 1)().
>
> I understand why it's failing. There's a conflict between the IDENT expression and IDENT '(' ')' for function calls. What I'm trying to figure out is how to allow both identifiers and function calls. If I figure that out, it should give me the rest of what I need.
>
> Any help would be appreciated.

I think that basically you want Application (I come from the lambda calculus and function invocation is called Application therein) to be a post-fix operator.

If you look at grammars for Java or C or other C-like languages you will see (I believe) that indexing into an array and/or projecting a field out of a tuple (record), and probably others like them, are post-fix operators.

So I would recommend the following (tested, but not using your incomplete gist framework), replacing your term rule with:

    // Expressions
    primary
        : IDENT
        | '('! expression ')'!
        | INTEGER
        | DOUBLE
        | BOOLEAN
        ;

    term : (primary -> primary) ( suffix[$term.tree] -> suffix )* ;

    suffix [CommonTree term]
        : ( x='(' modifiers? ')' -> ^(APPLICATION[$x,"A"] {$term} modifiers?) )
        | ( x='[' modifiers ']' -> ^(INDEXING[$x,"I"] {$term} modifiers) )
        | ( x='.' (p=IDENT|p=INTEGER) -> ^(PROJECTION[$x,"P"] {$term} $p) )
        ;

    modifiers : expression (','! expression)* ;

with an appropriate tokens{} section defining the APPLICATION, INDEXING, and PROJECTION imaginary tokens.

the above is a little complicated in order to get the imaginary token representing the suffix operator to be a tree root.

note that the stuff in the []'s after the imaginary token name is for error reporting and/or tree pretty printing. keep the $x but change the string to suit your needs.

hope this helps
-jbb
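The post-fix operator idea in the reply above can also be seen in a hand-written parser: parse a primary first, then keep wrapping the tree built so far in a new root for every suffix that follows. The Python sketch below illustrates only that loop, under stated assumptions. It is not ANTLR-generated code, the `Parser` class and `parse` function are hypothetical names, and a single `+` operator stands in for the full expression rule chain of the gist. The node names APPLICATION, INDEXING, and PROJECTION are taken from the reply.

```python
import re


class Parser:
    """Tiny recursive-descent parser with post-fix call, index, and member suffixes."""

    def __init__(self, text):
        # Identifiers, integers, or any single non-space character.
        self.tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def expect(self, tok):
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {self.peek()!r}")
        self.pos += 1

    def expression(self):
        # Stand-in for the full operator chain: only left-associative '+'.
        node = self.term()
        while self.peek() == "+":
            self.pos += 1
            node = ("+", node, self.term())
        return node

    def term(self):
        # primary followed by zero or more suffixes; each suffix becomes the new root.
        node = self.primary()
        while self.peek() in ("(", "[", "."):
            node = self.suffix(node)
        return node

    def suffix(self, target):
        tok = self.next()
        if tok == "(":
            args = [] if self.peek() == ")" else self.modifiers()
            self.expect(")")
            return ("APPLICATION", target, *args)
        if tok == "[":
            args = self.modifiers()
            self.expect("]")
            return ("INDEXING", target, *args)
        name = self.next()  # tok == "."
        if not re.fullmatch(r"\w+", name):
            raise SyntaxError(f"unexpected token {name!r}")
        return ("PROJECTION", target, name)

    def modifiers(self):
        args = [self.expression()]
        while self.peek() == ",":
            self.pos += 1
            args.append(self.expression())
        return args

    def primary(self):
        tok = self.next()
        if tok == "(":
            node = self.expression()
            self.expect(")")
            return node
        if re.fullmatch(r"\w+", tok):
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(text):
    parser = Parser(text)
    tree = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

Because the callee is whatever tree has been built so far, `parse("createFunction()()")` yields `("APPLICATION", ("APPLICATION", "createFunction"))` and `parse("(1 + 1)()")` yields `("APPLICATION", ("+", "1", "1"))`. No rule refers back to itself on the left, which is the property that avoids the left-recursion errors reported earlier in the thread.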
[il-antlr-interest: 32378] Re: [antlr-interest] Function Expressions
This does indeed work as expected. I wouldn't have been able to come up with that on my own. I started down the path of putting function calls in between term and booleanExpression, and that was *sort of* working. But I also uncovered some other issues with properties (what you are calling projection).

I sort of understand what's going on with this code by referencing the last example under "Rewrite rules" at http://www.antlr.org/wiki/display/ANTLR3/Tree+construction. But I don't understand it fully. Perhaps I just need some time to digest it.

I only discovered ANTLR about three weeks ago, so there's still a lot I have to learn. I'm surprised I've gotten as far as I have without help, actually...

On Wed, May 4, 2011 at 6:19 PM, John B. Brodie <j...@acm.org> wrote:
> I think that basically you want Application (I come from the lambda calculus and function invocation is called Application therein) to be a post-fix operator.
>
> So I would recommend the following (tested, but not using your incomplete gist framework), replacing your term rule with:
>
> term : (primary -> primary) ( suffix[$term.tree] -> suffix )* ;
>
> the above is a little complicated in order to get the imaginary token representing the suffix operator to be a tree root.
[il-antlr-interest: 32379] Re: [antlr-interest] Function Expressions
Oh, and thanks. I've put your code through a bunch of crazy scenarios after adapting it to my full needs, and it works perfectly. I can now move forward with changes to the tree parser and interpreter. Thanks again.

On Wed, May 4, 2011 at 6:19 PM, John B. Brodie <j...@acm.org> wrote:
> I think that basically you want Application (I come from the lambda calculus and function invocation is called Application therein) to be a post-fix operator.