This was changed because the tool no longer generates those sets.

Jim

> -----Original Message-----
> From: [email protected] [mailto:antlr-interest-[email protected]] On Behalf Of Justin Murray
> Sent: Thursday, July 21, 2011 12:28 PM
> To: Vlad
> Cc: [email protected]
> Subject: Re: [antlr-interest] 'Dude' error in v3.4 and possible bugs explained [was: on "crap" grammars]
>
> I think that Vlad may be onto something here. From what I can tell from
> my generated grammar, this only affects ANTLR3_MISMATCHED_SET_EXCEPTION
> type exceptions. My grammar has several hundred parser rules, but only
> in 4 cases is an ANTLR3_MISMATCHED_SET_EXCEPTION generated. In all 4
> cases, the expectingSet is being set to NULL, and in no other cases is
> expectingSet being set to NULL. I agree that this would be improved if
> changed as Vlad described.
>
> It just so happens that the way I implemented my exception handling, I
> treat ANTLR3_MISMATCHED_SET_EXCEPTION the same as
> ANTLR3_RECOGNITION_EXCEPTION, and don't bother to display the
> expectingSet, so I never would have discovered this problem.
>
> Since I recently figured out how the C template works, I decided to
> take a peek. I found the following in
> antlr-3.4-complete-no-antlrv2.jar/org/antlr/codegen/templates/C/C.stg:
>
>     <if(PARSER)>
>     EXCEPTION->expectingSet = NULL;
>     <! use following code to make it recover inline;
>     EXCEPTION->expectingSet = &FOLLOW_set_in_<ruleName><elementIndex>;
>     !>
>     <endif>
>
> So it appears that this was done explicitly at some point. You could
> edit C.stg to uncomment the code above, and I imagine that it will
> generate the correct follow set pointer. Perhaps Jim knows why this is
> like this? This may be avoiding some other problems, so I don't know
> how safe a change this would be.
>
> - Justin
>
> On 7/21/2011 2:45 PM, Vlad wrote:
> > Previously I was on 3.2 runtime. It occurred to me to try 3.4
> > released a day ago. To this end I've switched to 3.4-beta4 runtime as
> > well.
> > Using one of the testerrors.g grammars with non-inlined int/float
> > tokens and parser generated by antlr-3.4-complete.jar I now get on
> > input string "name : bad":
> >
> >     <string>(1) : error 4 : Unexpected token, at offset 6
> >         near [Index: 4 (Start: 31458399-Stop: 31458401) ='bad', type<6> Line: 1 LinePos:6]
> >          : unexpected input...
> >       expected one of : Actually dude, we didn't seem to be expecting anything here, or at least
> >     I could not work out what I was expecting, like so many of us these days!
> >
> > (this required switching to antlr3StringStreamNew() from
> > antlr3NewAsciiStringInPlaceStream() as was posted by Jim here:
> > http://groups.google.com/group/il-antlr-interest/browse_thread/thread/981a79239e352c89
> > and as is mentioned within that thread the last argument can't be
> > NULL to avoid a segfault).
> >
> > So, this is better because at least the offending token is
> > identified correctly. The reason the expected set is still not
> > identified correctly (the 'Dude' part) is because the generated error
> > path for the 'type' non-terminal always sets the exception's
> > expectingSet to NULL:
> >
> >     {
> >         if ( ((LA(1) >= AT_FLOAT_) && (LA(1) <= AT_INT_)) )
> >         {
> >             CONSUME();
> >             PERRORRECOVERY=ANTLR3_FALSE;
> >         }
> >         else
> >         {
> >             CONSTRUCTEX();
> >             EXCEPTION->type = ANTLR3_MISMATCHED_SET_EXCEPTION;
> >             EXCEPTION->name = (void *)ANTLR3_MISMATCHED_SET_NAME;
> >             EXCEPTION->expectingSet = NULL; // <--- ????
> >
> >             goto ruletypeEx;
> >         }
> >
> >     }
> >
> > I might be called names again, but I'd say this error handling
> > does not look correct because the rule knows exactly what token set it
> > expects right here but then goes ahead and ignores that info for the
> > purposes of generating exception info (what's the point in indicating
> > ANTLR3_MISMATCHED_SET_NAME if that set is always set to NULL).
> >
> > Examining the generated parser code, I in fact see what appears to
> > be a correct set that would be FOLLOW(':'): it has bits set for
> > AT_FLOAT_ and AT_INT_ and is FOLLOWPUSH()ed before the rule is entered.
> >
> > By manually doctoring the parser code to set
> > EXCEPTION->expectingSet to point to this FOLLOW set, I get rid of the
> > 'Dude' message but hit on another bug in displayRecognitionError() that
> > prints the wrong two token names:
> >
> >     <string>(1) : error 4 : Unexpected token, at offset 6
> >         near [Index: 4 (Start: 13845599-Stop: 13845601) ='bad', type<6> Line: 1 LinePos:6]
> >          : unexpected input...
> >       expected one of : <EOR>, <DOWN>
> >
> > Looking at the stock displayRecognitionError() code, it is clear
> > that the loop over the set bits is not correct (the TODO is right).
> > Fixing it by adding errBits->isMember(errBits, bit):
> >
> >     for (bit = 1; bit < numbits && count < 8 && count < size; bit++)
> >     {
> >         // TODO: This doesn;t look right - should be asking if the bit is set!!
> >         //
> >         if (errBits->isMember(errBits, bit) && tokenNames[bit]) // <--- ???? was missing bitset member check
> >         {
> >             ANTLR3_FPRINTF(stderr, "%s%s", count > 0 ? ", " : "", tokenNames[bit]);
> >             count++;
> >         }
> >     }
> >
> > finally gets me the error message that makes sense:
> >
> >     <string>(1) : error 4 : Unexpected token, at offset 6
> >         near [Index: 4 (Start: 30442591-Stop: 30442593) ='bad', type<6> Line: 1 LinePos:6]
> >          : unexpected input...
> >       expected one of : AT_FLOAT_, AT_INT_
> >
> > "Crap" grammars, I hear somebody said? Hmm, I don't think so...
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
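
For anyone following the displayRecognitionError() part of the thread, here is a minimal, self-contained sketch of why the membership check matters. It does not use the ANTLR C runtime: TokenSet, token_set_is_member() and format_expected() are hypothetical stand-ins invented for illustration, and the token type numbers in the usage note are assumed, not taken from Vlad's generated parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a runtime bitset: a single 64-bit word in
 * which bit i is set when token type i belongs to the expected set. */
typedef struct {
    uint64_t bits;
} TokenSet;

/* Returns non-zero when token type `bit` is a member of the set. */
static int token_set_is_member(const TokenSet *set, unsigned bit)
{
    return bit < 64 && ((set->bits >> bit) & 1u) != 0;
}

/* Writes a comma-separated list of at most max_names expected token names
 * into out and returns how many names were written. Starts at bit 1, like
 * the loop quoted in the thread. */
static int format_expected(const TokenSet *set,
                           const char *const *token_names,
                           unsigned num_names, int max_names,
                           char *out, size_t out_size)
{
    int    count = 0;
    size_t used  = 0;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';

    for (unsigned bit = 1; bit < num_names && count < max_names; bit++) {
        /* The membership test is the whole point: without it the loop
         * lists the first few named tokens whether or not they are in
         * the set. */
        if (!token_set_is_member(set, bit) || token_names[bit] == NULL)
            continue;

        const char *sep  = count > 0 ? ", " : "";
        size_t      need = strlen(sep) + strlen(token_names[bit]);

        if (need >= out_size - used)
            break; /* no room for this name plus the terminator */

        used += (size_t)snprintf(out + used, out_size - used, "%s%s",
                                 sep, token_names[bit]);
        count++;
    }
    return count;
}
```

With a name table of { "<invalid>", "<EOR>", "<DOWN>", "<UP>", "AT_FLOAT_", "AT_INT_", "ID" } and a set holding bits 4 and 5 (assumed numbering), format_expected() fills the buffer with "AT_FLOAT_, AT_INT_" and returns 2. Dropping the token_set_is_member() call and capping the count at the set size would give "<EOR>, <DOWN>" instead, which matches the wrong output reported above.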
