On Wednesday, 6 February 2019 11:53:42 CET Derek Clegg wrote:
> In C++, we could be fancier: have two functions, one for a “pure” syntax
> error:
>
>     void my_yysyntax_error();
>
> and one for syntax errors involving unexpected tokens:
>
>     void my_yysyntax_error(std::string unexpected_token,
>                            std::vector<std::string> expected_tokens);
>
> The call would be something like this:
>
>     if (yycount == 0) {
>       my_yysyntax_error();
>     } else {
>       my_yysyntax_error(yyarg[0],
>                         std::vector<std::string>{yyarg + 1, yyarg + yycount});
>     }

That might be too limited for practical use. IMO a built-in API for
retrieving a list of expected tokens should be designed in a reentrant
way, such that you could a) pass a parser state to the function to get
a list of expected tokens precisely for that parser state, and
b) conveniently duplicate a parser state if required, in order to
c) advance the duplicated parser state without touching the "official"
parser state.

For features like human-readable syntax errors and user-friendly auto
completion, I am doing that already with additional code that accesses
Bison's internal tables directly. It does work, but it involves a lot of
extra code that is injected into an auto-generated parser, and that
solution is very dependent on Bison's internal data structure layout and
precise parser behaviour. So it would be great if there was some
built-in alternative in Bison one day to get rid of this.

When I first started to write custom syntax error code and auto
completion features years ago, I also thought it was sufficient to just
resolve all possible next tokens, but I quickly realized that this is
often not user friendly enough. For instance, in practice the parser is
often at a point in the grammar where the only possibility is a single
specific sequence of tokens coming next. In such a case a user would
want to see not just the very next token, but all subsequent tokens as
well, up to the point where the grammar really has a branch and requires
an actual user input decision.

Another example from my experience is parsers that do not use a separate
lexer at all, i.e. a parser which simply reads the next input byte and
uses grammar rules for what is typically handled by external lexer
rules. In such a scenario, if you were limited to resolving only the
very next token, you would only be able to show the user a list of
single characters expected next, like "expected either 'a', 'g', 'c'"
instead of showing the user e.g. "expected either 'all', 'get',
'clear'".

Best regards,
Christian Schoenebeck

_______________________________________________
help-bison@gnu.org https://lists.gnu.org/mailman/listinfo/help-bison