Hi Adela, This is a great piece of work! The overall result is much better, well done!
> Le 18 déc. 2020 à 19:37, Adela Vais <[email protected]> a écrit : > > Hello! > > I made the suggested modifications. > > I have a question: given that the user will provide all the needed > information through the return value, should semanticVal(), startPos() and > endPos() still be part of the Lexer interface? No, they are not useful. They make sense only with split symbols. Besides, in that case (split symbols) I personally think the right interface is that of locations, not that of positions. IMHO it was an error in Java to put forward positions, and leave locations to the parser only. C and C++ show only location-based interfaces. It's up to the user to decide that a location is nothing but a position if she thinks two positions is too costly (but that would be mean to the users). > În vin., 20 nov. 2020 la 20:18, Akim Demaille <[email protected]> a scris: > >> You should introduce type aliases for b4_yystype and YYLocation. >> In the C++ parser, you have value_type and location_type which >> are defined to whatever they are actually. The code is nicer >> to read, with fewer occurrences of ugly YYnames. >> >> > I named them Location and Value. Excellent. > Should I also change b4_location_type to > the new Location alias throughout lalr1.d? I know that the user should not > use yy names (and before these commits, the examples used YYLocation) but I > don't know how much of the backend I should change. I changed the > b4_yystype occurrences. I would also use Location everywhere. >>> if (yychar == TokenKind.]b4_symbol(empty, id)[) >>> {]b4_parse_trace_if([[ >>> yycdebugln ("Reading a token");]])[ >>> - yychar = yylex ();]b4_locations_if([[ >>> - static if (yy_location_is_class) { >>> - yylloc = new ]b4_location_type[(yylexer.startPos, >> yylexer.endPos); >>> - } else { >>> - yylloc = ]b4_location_type[(yylexer.startPos, >> yylexer.endPos); >>> - }]]) >>> - yylval = yylexer.semanticVal;[ >>> + Symbol yysymbol = yylex(); >>> + yychar = yysymbol.token(); >> >> Maybe you don't need yychar, but only need yytoken. You probably >> can avoid dealing with the token-kinds here, and deal only with >> the symbol kinds. >> > Done. Great. This part, however, is too complex and not very clear: > - if (yychar <= TokenKind.]b4_symbol(eof, [id])[) > + if (yytoken <= ]b4_symbol(eof, [kind])[) > { > /* Return failure if at end of input. */ > - if (yychar == TokenKind.]b4_symbol(eof, [id])[) > + if (yytoken == ]b4_symbol(eof, [kind])[) > return false; > } > else > - yychar = TokenKind.]b4_symbol(empty, id)[; > + yytoken = ]b4_symbol(empty, kind)[; When accepting token kinds as return values from yylex, we tolerate any nonpositive number to denote EOF. Hence the first if. However when using symbol kinds, we should not accept any deviation from the symbols kinds. So that should be a single check "== eof", not two. Maybe have a look at what was done in C++. Again, great work, thanks! Cheers!
