On Wed, 30 Jun 2010 10:48:39 -0700
Gordon Tyler <[email protected]> wrote:

> I'm not very familiar with ANTLR's error recovery mechanisms, but I
> suspect that the generated code for the 'expressions' rule looks for
> a character that it recognizes as the start of an 'expression' rule
> before it calls into the 'expression' rule and when it doesn't find
> one in the second case, it exits out into the root rule, which then
> checks if the next token is EOF and fails.

Please read the article on the wiki entitled "Custom error recovery" - this
will give you all the information you need.

Jim

> But this is just speculation. Hopefully one of the more experienced
> ANTLRers can give you a better answer.
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Luchesar Cekov
> Sent: June 30, 2010 1:35 PM
> Cc: [email protected]
> Subject: Re: [antlr-interest] Continue parsing after an error
>
> Hi Gordon,
>
> Thanks for the prompt response.
> Adding OTHER as an alternative was what I tried to do in the beginning.
> Unfortunately my use case is a bit more complex. I have worked out a
> better example below.
> In this example, the input string [ax][kx][ax] is wrong (k is not
> allowed) but the grammar builds the full AST, so it recovers from
> the error - it would generate three expression nodes, the second of which
> contains an ErrorCommonToken inside as per recoverFromMismatchedToken().
> The string [ax]sax][ax], on the other hand, generates only the first bit
> of the tree, up to the error - it generates only one expression node.
>
> I do not understand why I get this different behavior - the parser
> recovers if the error happens in the middle of a rule, but not if the
> error is at the beginning of a rule.
>
> Is this a problem in my grammar or is it just the way ANTLR works?
>
> Thanks,
> Luchesar
>
> ================
> grammar StartOfARuleFailTest;
>
> options { output=AST; ASTLabelType=CommonTree; }
>
> tokens { ROOT_TOKEN; ERROR_TOKEN; EXPRESSIONS; EXPRESSION; }
>
> @members {
>     @Override
>     protected Object recoverFromMismatchedToken(IntStream input, int ttype, BitSet follow)
>             throws RecognitionException {
>         MismatchedTokenException ex = new MismatchedTokenException(ttype, input);
>         input.consume();
>         return createErrorToken(ex, ttype);
>     }
>
>     public static ErrorCommonToken createErrorToken(RecognitionException ex, int ttype) {
>         ErrorCommonToken errorCommonToken = new ErrorCommonToken(ex.token);
>         errorCommonToken.setType(ttype);
>
>         return errorCommonToken;
>     }
> }
>
> root : expressions EOF -> ^(ROOT_TOKEN expressions) ;
> expressions : expression* -> ^(EXPRESSIONS expression*) ;
> expression : '[' 'a' 'x' ']' -> ^(EXPRESSION '[' 'a' 'x' ']') ;
>
> OTHER : . ;
> ================
>
>
> Gordon Tyler wrote:
>> The grammar you have defined says, roughly:
>>
>>     Parse any number of '[' or ']' until you reach EOF.
>>
>> It does not describe what to do if something other than '[' or ']'
>> is found before EOF is found.
>>
>> You have defined a token, OTHER, to match the other stuff, but your
>> parse rules do not reference OTHER. Perhaps something like this would
>> work:
>>
>>     root : (expressions | OTHER)* EOF -> ^(ROOT_TOKEN expressions) ;
>>
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Luchesar Cekov
>> Sent: June 30, 2010 10:10 AM
>> To: [email protected]
>> Cc: Valerio Malenchino
>> Subject: [antlr-interest] Continue parsing after an error
>>
>> Dear ANTLR enthusiasts,
>>
>> I am struggling with a problem. The parser jumps to the end of file from
>> the middle of the document.
>>
>> The setup is as follows:
>> * I have two alternatives followed by EOF
>> * at parse time, in the middle of the document, the next token cannot
>>   match the start of either alternative
>>
>> This leads to parsing termination because the parser jumps to the
>> EndOfFile.
>>
>> A simple grammar that illustrates the problem is
>>
>> ===============
>> tokens {ROOT_TOKEN;}
>> root
>>     : expressions EOF -> ^(ROOT_TOKEN expressions) ;
>> expressions : ('[' | ']')* ;
>> OTHER : . ;
>> ===============
>>
>> If I then try parsing "[[][]]sdsdf[]][]][" the parsing will stop at the
>> first "s" and will try to recover as if EOF were the next token.
>> Looking at the generated parser, it looks as if, when there is no viable
>> alternative in the top rule (in this case "root"), the parser behaves
>> as if it had reached EOF and skips the rest of the tokens.
>>
>> The resulting AST will contain only children up to the first illegal
>> token "s".
>>
>> I cannot see where my mistake is. It looks like the parser should not do
>> that. Can you suggest a workaround for the problem?
>>
>> Thanks in advance,
>> Luchesar
>>
>
> --
>
> Luchesar Cekov
> Software Engineer
> +44 (0) 207 239 4949
> *Ontology Systems*
> www.ontology.com <http://www.ontology.com/>

List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
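
The difference discussed in this thread (an error in the middle of
'expression' is recovered, while an error where 'expression' should start
ends the 'expression*' loop) can be sketched with a small hand-written
parser. This is only an illustration of the idea under stated assumptions:
it is not ANTLR's generated code and it calls no ANTLR API. The class
ResyncSketch, the parse method, the resync flag and the '?' error marker are
all hypothetical names. With resync set to false the loop simply exits at
the first character that cannot start an expression; with resync set to true
it skips characters until the next '[' or the end of input, which is one
possible recovery strategy, not necessarily the one the wiki article
describes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a hand-written stand-in for
//   root : expression* EOF ;   expression : '[' 'a' 'x' ']' ;
// It is NOT what ANTLR generates; it only mimics the behaviour in the thread.
final class ResyncSketch {
    private static final String EXPRESSION = "[ax]";

    /** What the parse kept, what it skipped, and whether it consumed all input. */
    static final class Result {
        final List<String> expressions = new ArrayList<>();
        final List<Character> skipped = new ArrayList<>();
        boolean reachedEof;
    }

    static Result parse(String input, boolean resync) {
        Result result = new Result();
        int pos = 0;
        while (pos < input.length()) {
            if (input.charAt(pos) != '[') {
                // Lookahead cannot start an expression.
                if (!resync) {
                    break; // exit the loop; the caller would now expect EOF and fail
                }
                // Assumed recovery strategy: skip until something that can start an expression.
                while (pos < input.length() && input.charAt(pos) != '[') {
                    result.skipped.add(input.charAt(pos));
                    pos++;
                }
                continue;
            }
            // Inside the rule: a mismatched character is consumed and replaced by
            // an error marker, similar in spirit to the recoverFromMismatchedToken
            // override quoted above.
            StringBuilder expr = new StringBuilder();
            for (char expected : EXPRESSION.toCharArray()) {
                boolean matches = pos < input.length() && input.charAt(pos) == expected;
                expr.append(matches ? expected : '?');
                if (pos < input.length()) {
                    pos++;
                }
            }
            result.expressions.add(expr.toString());
        }
        result.reachedEof = (pos == input.length());
        return result;
    }
}
```

Calling ResyncSketch.parse("[ax][kx][ax]", false) keeps three expressions
with an error marker in the second, whereas
ResyncSketch.parse("[ax]sax][ax]", false) stops after the first expression
without reaching the end of input. Passing true for resync lets the second
input continue past "sax]" and pick up the final "[ax]".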
