[il-antlr-interest: 28209] Re: [antlr-interest] Using previously matched parser rule in decision making
I agree Ron.

Ron Burk wrote:
> It is an interesting idea for a top-down parser generator to just make the parsing stack of non-terminals available to user actions. Whether that's easy or hard depends on the details of how the tool generates parser code. But certainly knowing the context you expect to be in is arguably an advantage of top-down over bottom-up parsing, so there's an argument to be made for making that information available.
>
> As I struggle to think of common/practical use for it, mainly error reporting or recovery comes to mind. But, if the syntax made it easy to ask things like "is X on the stack", I suppose there are a variety of semantic checks that could be made clearer and simpler than via flags and such. E.g. checking that a 'break' keyword in C occurs within a do/for/switch/while. I usually try to do things in one pass, so it may be more interesting of an idea to me than to someone who intends to build a syntax tree first before doing any actual work.
>
> Dinking with syntax:
>
>     A: B
>     C: B
>     B: { if($Stack[A])... else if($Stack[C])... else assert(FALSE); }
>
> or maybe (also?)
>
>     { if($Stack[-1]==$NonTerm[A]) ...; else ...; }
>
> or
>
>     LoopStmt: Do | For | Switch | While ;
>     ...
>     BreakStmt: 'break'
>         { if(!$Stack[LoopStmt]) SynError("break is not inside do/for/switch/while.\n"); }

List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address

--
You received this message because you are subscribed to the Google Groups il-antlr-interest group. To post to this group, send email to il-antlr-inter...@googlegroups.com. To unsubscribe from this group, send email to il-antlr-interest+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/il-antlr-interest?hl=en.
[il-antlr-interest: 28210] Re: [antlr-interest] Using previously matched parser rule in decision making
At 15:47 9/03/2010, Ron Burk wrote:
> It is an interesting idea for a top-down parser generator to just make the parsing stack of non-terminals available to user actions. Whether that's easy or hard depends on the details of how the tool generates parser code. But certainly knowing the context you expect to be in is arguably an advantage of top-down over bottom-up parsing, so there's an argument to be made for making that information available.

You can use ANTLR's scopes to do that. There are ways to tell if a particular scope has been entered, how many times it has been entered, and to retrieve information from any of those levels.
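The checks mentioned in this thread (has a given rule been entered, and how many times) can be pictured with a hand-rolled context stack. The sketch below is plain Java and is not ANTLR's scope mechanism; the class and method names (ContextStackSketch, enter, exit, isInside, depthOf, breakAllowed) are invented for illustration, and it assumes the parser calls enter/exit around each rule of interest.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not part of ANTLR: a stack of the names of the rules
// that are currently being parsed, innermost rule on top.
final class ContextStackSketch {
    private final Deque<String> active = new ArrayDeque<>();

    // Call on entry to a rule.
    void enter(String rule) { active.push(rule); }

    // Call on exit from the rule most recently entered.
    void exit() { active.pop(); }

    // True if the named rule is active anywhere on the stack.
    boolean isInside(String rule) { return active.contains(rule); }

    // How many times the named rule is currently entered (nesting depth).
    int depthOf(String rule) {
        int n = 0;
        for (String r : active) {
            if (r.equals(rule)) n++;
        }
        return n;
    }

    // The 'break' check from the thread: legal only inside a LoopStmt.
    boolean breakAllowed() { return isInside("LoopStmt"); }

    public static void main(String[] args) {
        ContextStackSketch ctx = new ContextStackSketch();
        System.out.println("break allowed at top level: " + ctx.breakAllowed());
        ctx.enter("LoopStmt");
        ctx.enter("Block");
        ctx.enter("LoopStmt");
        System.out.println("break allowed in nested loop: " + ctx.breakAllowed()
                + ", LoopStmt depth: " + ctx.depthOf("LoopStmt"));
    }
}
```

Keeping the stack explicit costs a push and a pop per rule, but it makes the "am I inside X" question a one-liner instead of a set of flags.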
[il-antlr-interest: 28216] Re: [antlr-interest] Unexpected behavior - Error?
Hi Bart!

Thanks for the quick answer! Adding an EOF to the rule solves the issue in the toy example. Unfortunately we are using custom token label types and are now getting a ClassCastException. It seems that we now have the problem mentioned here:
http://www.antlr.org/pipermail/antlr-interest/2009-November/036712.html

Any thoughts on that?

On 09.03.2010 15:04, Bart Kiers wrote:
> Hi Chris,
>
> Since the input ' .mine' does not contain any illegal tokens, the parser just stops parsing since (statement)* will also match nothing. You'll want to tell your parser to continue parsing all the way to the end of your token stream. Do that by adding an EOF to the end of your entry-point: presumably the source parser rule:
>
>     source : (statement)* EOF ;
>
> Regards,
> Bart.

--
Dipl.-Ing. Christoph Schinko
c.schi...@cgv.tugraz.at
Institute of Computer Graphics and Knowledge Visualization
Graz University of Technology
tel: +43 (316) 873-5416
Inffeldgasse 16c, 8010 Graz, Austria
[il-antlr-interest: 28217] Re: [antlr-interest] Unexpected behavior - Error?
Hi Chris,

sorry, forgot to send to the list the first time!

On Tue, Mar 9, 2010 at 4:41 PM, Christoph Schinko c.schi...@cgv.tugraz.at wrote:
> Hi Bart!
>
> Thanks for the quick answer! Adding an EOF to the rule solves the issue in the toy example. Unfortunately we are using custom token label types and are now getting a ClassCastException. It seems that we now have the problem mentioned here:
> http://www.antlr.org/pipermail/antlr-interest/2009-November/036712.html
>
> Any thoughts on that?

Unfortunately, I don't... I presume you read that entire thread; if not, a (possible) solution is given here:
http://www.antlr.org/pipermail/antlr-interest/2009-November/036719.html

Best of luck!

Regards,
Bart.

> On 09.03.2010 15:04, Bart Kiers wrote:
>> Hi Chris,
>>
>> Since the input ' .mine' does not contain any illegal tokens, the parser just stops parsing since (statement)* will also match nothing. You'll want to tell your parser to continue parsing all the way to the end of your token stream. Do that by adding an EOF to the end of your entry-point: presumably the source parser rule:
>>
>>     source : (statement)* EOF ;
>>
>> Regards,
>> Bart.
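The effect of the trailing EOF described in this thread can be shown with a toy recognizer. This is a minimal sketch in plain Java, not ANTLR-generated code: the class EofSketch, its methods, and the simplification that a "statement" is an identifier followed by ";" are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical toy recognizer. A "statement" here is simply a lowercase
// identifier token followed by ";".
final class EofSketch {
    private final List<String> tokens;
    private int pos = 0;

    EofSketch(List<String> tokens) { this.tokens = tokens; }

    private boolean atStatementStart() {
        return pos + 1 < tokens.size()
                && tokens.get(pos).matches("[a-z]+")
                && tokens.get(pos + 1).equals(";");
    }

    // (statement)* : loops zero or more times, so it also succeeds on nothing.
    private void statements() {
        while (atStatementStart()) {
            pos += 2;
        }
    }

    // source : (statement)* ;      succeeds even when tokens are left over
    boolean sourceWithoutEof() {
        statements();
        return true;
    }

    // source : (statement)* EOF ;  succeeds only if every token was consumed
    boolean sourceWithEof() {
        statements();
        return pos == tokens.size();
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(".", "mine");
        System.out.println("without EOF: " + new EofSketch(input).sourceWithoutEof());
        System.out.println("with EOF:    " + new EofSketch(input).sourceWithEof());
    }
}
```

Without the end-of-input check the loop simply stops at the first token it cannot use and the rule still reports success, which is the behavior the original poster found surprising.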
[il-antlr-interest: 28218] Re: [antlr-interest] Using previously matched parser rule in decision making
From anywhere in the parser:

    java.util.List stack = getRuleInvocationStack(e, getParserName());

But this only works for Java and other targets that copy it (I think C# might do it). I don't do it in C because I prefer to take the view that the C stuff should be as close to the metal as it can be and the programmer will choose to add the overheads they need.

In the JavaFX front end, this stack is used to pin down errors a little more precisely - as it is open source you can download the code and look at AbstractGeneratedParserV4.java

Jim

-----Original Message-----
From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-boun...@antlr.org] On Behalf Of Kieran Simpson
Sent: Tuesday, March 09, 2010 1:58 AM
To: Ron Burk
Cc: antlr-interest@antlr.org
Subject: Re: [antlr-interest] Using previously matched parser rule in decision making

> I agree Ron.
>
> Ron Burk wrote:
>> It is an interesting idea for a top-down parser generator to just make the parsing stack of non-terminals available to user actions. Whether that's easy or hard depends on the details of how the tool generates parser code. But certainly knowing the context you expect to be in is arguably an advantage of top-down over bottom-up parsing, so there's an argument to be made for making that information available.
>>
>> As I struggle to think of common/practical use for it, mainly error reporting or recovery comes to mind. But, if the syntax made it easy to ask things like "is X on the stack", I suppose there are a variety of semantic checks that could be made clearer and simpler than via flags and such. E.g. checking that a 'break' keyword in C occurs within a do/for/switch/while. I usually try to do things in one pass, so it may be more interesting of an idea to me than to someone who intends to build a syntax tree first before doing any actual work.
>>
>> Dinking with syntax:
>>
>>     A: B
>>     C: B
>>     B: { if($Stack[A])... else if($Stack[C])... else assert(FALSE); }
>>
>> or maybe (also?)
>>
>>     { if($Stack[-1]==$NonTerm[A]) ...; else ...; }
>>
>> or
>>
>>     LoopStmt: Do | For | Switch | While ;
>>     ...
>>     BreakStmt: 'break'
>>         { if(!$Stack[LoopStmt]) SynError("break is not inside do/for/switch/while.\n"); }
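In a recursive-descent parser written in Java, every rule is a method, so the Java call stack already records which rules are active. My understanding is that this is the reason the facility is cheap on the Java target and absent on the C target, but treat that as an assumption rather than a statement about the runtime's internals. The sketch below shows the general idea only: RuleStackSketch and its rule methods are invented for illustration, and nothing here is the ANTLR API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a generated recursive-descent parser: each rule
// is a method, so the Java call stack doubles as the rule invocation stack.
final class RuleStackSketch {

    // Returns the names of this class's methods found in the stack trace of t,
    // outermost first. Frames that belong to other classes are ignored.
    static List<String> ruleStack(Throwable t) {
        List<String> rules = new ArrayList<>();
        StackTraceElement[] frames = t.getStackTrace();
        // The trace is stored innermost-first, so walk it backwards.
        for (int i = frames.length - 1; i >= 0; i--) {
            StackTraceElement frame = frames[i];
            if (frame.getClassName().equals(RuleStackSketch.class.getName())) {
                rules.add(frame.getMethodName());
            }
        }
        return rules;
    }

    // Three pretend rules that call each other like loopStmt -> block -> breakStmt.
    static List<String> loopStmt()  { return block(); }
    static List<String> block()     { return breakStmt(); }
    static List<String> breakStmt() {
        // The Throwable is created here, so its trace starts at breakStmt.
        return ruleStack(new Throwable());
    }

    public static void main(String[] args) {
        // When run from here, "main" also appears because it belongs to this class.
        System.out.println(loopStmt());
    }
}
```

A check such as "is breakStmt being parsed inside loopStmt" then becomes a lookup in the returned list, at the price of capturing a stack trace each time it is asked.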
[il-antlr-interest: 28219] [antlr-interest] MismatchedTokenException in simple grammar
Hi all,

I'm completely new to ANTLR and EBNF grammars to begin with, so this is probably a basic issue I'm simply not understanding. I have a rule such as:

    version_line : WS? 'VERS' WS? '=' WS? '1.0' WS? EOL ;
    WS  : ' '+ ;
    EOL : '\r' | '\n' | '\r\n' | '\n\r' ;

that matches a statement in my input file that looks like this (with optional whitespace):

    VERSION = 1.0

With the rule form above, I'm getting a successful match, although I get an exception with this form:

    version_line : WS? 'VERS' WS? '=' WS? '1' '.0' WS? EOL ;

or this form:

    version_line : WS? 'VERS' WS? '=' WS? DIGIT '.0' WS? EOL ;
    DIGIT : '1' ;

Why is this different? I discovered this issue when trying to decompose the rule even more, hopefully ending up with something like this:

    version_line : WS? 'VERS' WS? '=' WS? DIGIT '.' DIGIT WS? EOL ;
    DIGIT : '0'..'9' ;

Thanks in advance,
--
Joel Parker
[il-antlr-interest: 28220] Re: [antlr-interest] MismatchedTokenException in simple grammar
FYI: http://stackoverflow.com/questions/2412440/antlr-mismatchedtokenexception-on-simple-grammar
[il-antlr-interest: 28221] [antlr-interest] AntLRWorks Rule Debugging Error
I am trying to test the whenDescriptor rule in the following grammar in ANTLRWorks. I keep getting the following exception as soon as I start debugging. The input text for testing is:

    when order : OrderBll then

    [16:45:07] C:\Documents and Settings\RM\My Documents\My Tools\AntLRWorks\output\__Test__.java:14: cannot find symbol
    [16:45:07] symbol  : method whenDescriptor()
    [16:45:07] location: class RulesTranslatorParser
    [16:45:07] g.whenDescriptor();
    [16:45:07] ^
    [16:45:07] 1 error

I am able to test packageDescriptor and declareDescriptor successfully. Does anyone know how to resolve this error? I tried various combinations of input strings, but rule debugging always fails. Here is my grammar.

    grammar RulesTranslator;

    options {
        backtrack=true;
        memoize=true;
        language=CSharp2;
    }

    tokens {
        AND='and';
        OR='or';
        NOT='not';
        EXISTS='exists';
        EVAL='eval';
        FORALL='forall';
        CONTAINS='contains';
        IS='is';
        INSTANCEOF='instanceof';
        STRSIM='strsim';
        SOUNDSLIKE='soundslike';
        IN='in';
        NEW='new';
        WITH='with';
        ASSERT='assert';
        ISDEF='isdef';
    }

    packageDescriptor : 'package' qualifiedName ;

    declareDescriptorList : (declareDescriptor)* ;

    declareDescriptor : 'declare' qualifiedName (variableDef)+ 'end' ;

    whenDescriptor
        : //'when' ( typeRef | NOT ) (parExpression)+ 'then'
          'when' typeRef 'then'
        ;

    typeRef : (Identifier | variableDef) ;

    primitiveType
        : 'boolean' | 'char' | 'byte' | 'short' | 'int' | 'long' | 'float' | 'double'
        ;

    qualifiedNameList : qualifiedName (',' qualifiedName)* ;

    qualifiedName : Identifier ('.' Identifier)* ;

    literal
        : integerLiteral
        | FloatingPointLiteral
        | CharacterLiteral
        | StringLiteral
        | booleanLiteral
        | 'null'
        ;

    integerLiteral : HexLiteral | OctalLiteral | DecimalLiteral ;

    booleanLiteral : 'true' | 'false' ;

    elementValuePairs : elementValuePair (',' elementValuePair)* ;

    elementValuePair : (Identifier '=')? conditionalExpression ;

    variableDef
        : ( Identifier ':' Identifier
          | Identifier ':' qualifiedName
          )
        ;

    // STATEMENTS / BLOCKS

    chunk : (statement (';')?)* ;

    block : chunk EOF;

    statement
        : 'while' parExpression statement
        | 'do' statement 'while' parExpression ';'
        | 'switch' parExpression '{' switchBlockStatementGroups '}'
        | 'return' expression? ';'
        | 'break' Identifier? ';'
        | 'continue' Identifier? ';'
        //| 'when' parExpression 'then' (statement)? 'end'
        | statementExpression
        | Identifier ':' statement
        ;

    switchBlockStatementGroups : (switchBlockStatementGroup)* ;

    switchBlockStatementGroup : switchLabel statement* ;

    switchLabel
        : 'case' constantExpression ':'
        | 'default' ':'
        ;

    moreStatementExpressions : (',' statementExpression)* ;

    fieldseperator : (',' | ';') ;

    logicalOperator : ('&&' | '||' | '~=') ;

    parExpression : '(' expression ')' ;

    expressionList : expression (',' expression)* ;

    statementExpression : expression ;

    constantExpression : expression ;

    expression : conditionalExpression (assignmentOperator expression)? ;

    assignmentOperator
        : '='
        | '+='
        | '-='
        | '*='
        | '/='
        | '&='
        | '|='
        | '^='
        | '%='
        | '<' '<' '='
        | '>' '>' '='
        | '>' '>' '>' '='
        ;

    conditionalExpression : conditionalOrExpression ( '?' expression ':' expression )? ;

    conditionalOrExpression : conditionalAndExpression ( ( '||' | OR ) conditionalAndExpression )* ;

    conditionalAndExpression : inclusiveOrExpression ( ( '&&' | AND ) inclusiveOrExpression )* ;

    inclusiveOrExpression : exclusiveOrExpression ( '|' exclusiveOrExpression )* ;

    exclusiveOrExpression : andExpression ( '^' andExpression )* ;

    andExpression : equalityExpression ( '&' equalityExpression )* ;

    equalityExpression : relationalExpression ( ('==' | '!=') relationalExpression )* ;

    relationalExpression : shiftExpression ( relationalOp shiftExpression )* ;

    relationalOp : ('<' '=' | '>' '=' | '<' | '>') ;

    shiftExpression : additiveExpression ( shiftOp additiveExpression )* ;

    shiftOp : ('<' '<' | '>' '>' '>' | '>' '>') ;

    additiveExpression : multiplicativeExpression ( ('+' | '-') multiplicativeExpression )* ;

    multiplicativeExpression : unaryExpression ( ( '*' | '/' | '%' ) unaryExpression )* ;

    unaryExpression
        : '+' unaryExpression
        | '-' unaryExpression
        | '++' primary
        | '--' primary
[il-antlr-interest: 28223] [antlr-interest] Unicode lexing
I know this topic has come up before, and sorry to bring it up again.

Context: I'm bringing up BitC on CLI, and planning to use ANTLR to do it. BitC characters cover the full unicode (20 bit) range. The good news: characters above U+ can only appear in character and string literals.
[il-antlr-interest: 28224] [antlr-interest] Stopping parser and lexer at first error
Hi all,

I needed to catch any syntax error (letting the lexer insert/delete chars, or the parser keep parsing with only a System.err message, could be very dangerous to my application), so I took a look at the reference (which reports information that is no longer valid) and on the internet, and I found several hints and articles:

    Why the generated parser code tolerates illegal expression?
    http://www.antlr.org/wiki/pages/viewpage.action?pageId=4554943

    How can I make the lexer exit upon first lexical error?
    http://www.antlr.org/wiki/pages/viewpage.action?pageId=5341217

    http://www.antlr.org/wiki/display/ANTLR3/Custom+Syntax+Error+Recovery

    [antlr-interest] I want to throw an exception and stop parse, please!
    http://www.antlr.org/pipermail/antlr-interest/2009-May/034605.html

It looks like I found a way to do this; maybe it's worth publishing on the wiki, once validated. I just added the following overrides to my grammar (attached):

    @parser::members {
        // Unchecked exception that aborts parsing at the first mismatched token.
        public class ParserException extends RuntimeException {
            Object objCurrentInputSymbol = null;

            public ParserException(Object oCurrentInputSymbol) {
                this.objCurrentInputSymbol = oCurrentInputSymbol;
            }
        }

        // Instead of attempting recovery, log the state and throw.
        protected Object recoverFromMismatchedToken(IntStream input, int ttype, BitSet follow)
                throws RecognitionException {
            System.out.println("PARSER : this.getCurrentInputSymbol(input).toString() : "
                    + this.getCurrentInputSymbol(input).toString());
            System.out.println("PARSER : this.failed() : " + this.failed());
            System.out.println("PARSER : this.getNumberOfSyntaxErrors() : "
                    + this.getNumberOfSyntaxErrors());
            throw new ParserException(this.getCurrentInputSymbol(input));
        }
    }

    @lexer::members {
        // Unchecked exception that aborts lexing at the first lexical error.
        public class LexerException extends RuntimeException {
            RecognitionException recognitionException = null;
            String strErrorHeader = null;
            String strErrorMessage = null;

            public LexerException(RecognitionException recExc, String sHead, String sMsg) {
                this.recognitionException = recExc;
                this.strErrorHeader = sHead;
                this.strErrorMessage = sMsg;
                System.out.println("LEXER : ErrorHeader : " + sHead);
                System.out.println("LEXER : ErrorMessage : " + sMsg);
                System.out.println("LEXER : RecognitionException : "
                        + this.recognitionException.toString());
            }
        }

        // Instead of reporting the error and carrying on, throw.
        public void reportError(RecognitionException recExc) {
            throw new LexerException(recExc, this.getErrorHeader(recExc),
                    getErrorMessage(recExc, this.getTokenNames()));
        }
    }

Then I tested it with a simple class:

    public static void main(String[] args) {
        testLexerError();
        testParserError();
    }

    private static void testLexerError() {
        String strDlToParse = "{CORRADO PIPPO ;feee}";
        System.out.println("TESTING LEXER with : " + strDlToParse);
        testError(strDlToParse);
    }

    private static void testParserError() {
        String strDlToParse = "{CORRADO PIPPO feee} dhert";
        System.out.println("TESTING PARSER with : " + strDlToParse);
        testError(strDlToParse);
    }

    private static void testError(String strDlToParse) {
        CommonTree tree = null;
        String strError = null;
        ANTLRStringStream input = new org.antlr.runtime.ANTLRStringStream(strDlToParse);
        Dl2OwlJavaBLexer lexer = new Dl2OwlJavaBLexer(input);
        TokenStream tokens = new org.antlr.runtime.CommonTokenStream(lexer);
        Dl2OwlJavaBParser parser = new Dl2OwlJavaBParser(tokens);
        try {
            // this may raise an exception
            // TODO : check why NO EXCEPTION is raised with error
            //        "line 1:9 no viable alternative at character ';'"
            //        on inputs like "{CORRADO ;}"
            eu.servicemix.dl2owl.Dl2OwlJavaBParser.axiom_return ret = parser.axiom();
            // TODO : check if this will be executed if no exception is raised
            tree = (CommonTree) ret.getTree();
            printTreeHelper(tree);
        } catch (RecognitionException e) {
            System.out.println(e.toString());
            e.printStackTrace();
        } catch (RuntimeException e) {
            System.out.println(e.toString());
            e.printStackTrace();
        }
    }

The output looks ok, I wonder whether the whole 'trick' is too...

    TESTING LEXER with : {CORRADO PIPPO ;feee}
    LEXER : ErrorHeader : line 1:15
    LEXER : ErrorMessage : no viable alternative at character ';'
    LEXER : RecognitionException : NoViableAltException(';'@[1:1: Tokens : ( T__37 | T__38 | T__39 | T__40 | HAS_VALUE | ALL_VALUES | SOME_VALUES | DOT | HAS_CARD | MIN_CARD | MAX_CARD | NOT | AND | OR | URI_REF | INT_VALUE | WS | CTRL_CHAR );])
    eu.servicemix.dl2owl.Dl2OwlJavaBLexer$LexerException
    eu.servicemix.dl2owl.Dl2OwlJavaBLexer$LexerException
        at eu.servicemix.dl2owl.Dl2OwlJavaBLexer.reportError(Dl2OwlJavaBLexer.java:69)
        at
[il-antlr-interest: 28226] Re: [antlr-interest] Using previously matched parser rule in decision making
> java.util.List stack = getRuleInvocationStack(e, getParserName());
>
> But this only works for Java and other targets that copy it (I think C# might do it). I don't do it in C because I prefer to take the view that the C stuff should be as close to the metal as it can be and the programmer will choose to add the overheads they need.

That makes a lot of sense, but can you tell me how a C programmer can keep track of the rules he will pass through during lookahead? I think this question might be valid for someone using the Java target also. The action you are asking to execute can be executed during normal parsing, but if it is used in semantic predicates, then we have to be sure that it will get executed even during lookahead.

Can we create such a facility? What I mean is a set of actions that, if put inside some special construct, will get executed while evaluating syntactic predicates and semantic predicates, possibly with a rollback action.

Please let me know if I am misunderstanding the concept somewhere.

Thanks,
Gokul.