Fabien COELHO <[EMAIL PROTECTED]> writes:
> Although I agree that it is "doable", I have stronger reserve than yours.
> Also, I do not find it an appealing solution to change "gram.c" a lot.

I was not proposing hand-editing gram.c after bison generates it, if
that's what you meant ;-).  It seems however perfectly doable to
reference bison's state stack from yyerror.  We'd need some macro hacking
to make yystate and the stack available to yyerror, but nothing worse than
is already documented and recommended practice for other bison tweaks.

> The automaton stack keeps states, which are not directly linked to rules
> and terminals. The terminals are not available, they must be kept
> separately if you want them. This can be done in yylex().

We don't need them.  Any already-shifted tokens are to the left of where
the error is, no?

> The internal state, stack, token... are local variables within yyparse().
> As a result, they are not accessible from yyerror. I haven't found any
> available hook, so you have to hack "gram.c" to get this information.

No, just redefine "yyerror" as a macro that passes additional parameters.

> - move backwards before doing the above, if some reductions were
> performed because of the submitted token and finally resulted in the error,
> the state that led to the error may not be the highest available one, so
> maybe other allowed tokens may also be missed. We would need to have
> the last state before any reduction.

Yeah, I had come to the same conclusion --- state moves made without
consuming input would need to be backed out if we wanted to report the
complete follow set.  I haven't yet looked to see how to do that.

> As you noted, for things like "SHOW 123", the follow set basically
> includes all keywords although you can have SHOW ALL or SHOW name.
> So, as you suggested, you can either say "ident" as a simplification, but
> you miss ALL which is meaningful, or you say all keywords, which is
> useless.

You're ignoring the distinction between classes of keywords.  I would not
recommend treating reserved_keywords as a subclass of ident.

> (5) anything that can be done would be hardwired to one version of bison.
> There is a lot of assumptions in the code and data structures, and any
> newer/older version with some different internal representation would
> basically break any code that would rely on that. So postgres would not be
> "bison" portable:-( I don't think it is a real option that old postgresql
> source would be broken against future bison releases.

I think this argument is completely without merit.  The technology of LALR
parsers has been stable for what, thirty years now?  The parts of bison
that we'd want to look at are inherited lock stock and barrel from AT&T
yacc, and are unlikely to change in the foreseeable future; even more
unlikely to change in a way that we couldn't easily adapt to.  You might
as well argue that we shouldn't use autoconf because the autoconf authors
sometimes make not-very-compatible changes.

> (b) write a new "recursive descendant" parser, and drop "gram.y"

Been there, done that, not impressed with the idea.  There's a reason
why people invented parser generators...

> As a side effect of my inspection is that the automaton generated by bison
> could be simplified if a different tradeoff between the lexer, the parser
> and the post-processing would be chosen. Namely, all tokens that are
> just identifiers should be dropped and processed differently.

We are not going to reserve the keywords that are presently unreserved.
If you can think of a reasonable way to stop treating them as separate
tokens inside the grammar without altering the user-visible behavior,
I'm certainly interested.  I think that will be rather difficult, however,
considering for one thing that SQL specifies different case-folding
behavior for identifiers and keywords.

			regards, tom lane
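
For illustration only, the "redefine yyerror as a macro" idea mentioned
above might look roughly like the sketch below.  It is not taken from the
PostgreSQL sources.  It assumes the generated parser keeps its current
state and state stack in locals named yystate, yyss and yyssp; that should
be checked against the gram.c your bison actually emits.  The names
yyerror_with_state, ErrorContext, last_error and mock_yyparse are
hypothetical, and mock_yyparse merely stands in for the real yyparse()
with arbitrary state numbers.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record of what the error hook could see. */
typedef struct
{
    int top_state;   /* parser state when the error was raised */
    int depth;       /* number of states currently on the stack */
} ErrorContext;

ErrorContext last_error;

/* Hypothetical extended error routine: takes the parser's locals as arguments. */
void
yyerror_with_state(const char *msg, int state,
                   const short *stack_base, const short *stack_top)
{
    assert(stack_top >= stack_base);
    last_error.top_state = state;
    last_error.depth = (int) (stack_top - stack_base) + 1;
    fprintf(stderr, "%s (state %d, stack depth %d)\n",
            msg, state, last_error.depth);
}

/*
 * The macro expands at each call site inside the parser function, where
 * the locals are in scope, so nothing in the generated file is hand-edited.
 * ASSUMPTION: the locals are named yystate, yyss and yyssp.
 */
#define yyerror(msg) yyerror_with_state((msg), yystate, yyss, yyssp)

/* Stand-in for the generated yyparse(); the state numbers are arbitrary. */
int
mock_yyparse(void)
{
    short  yyssa[8];
    short *yyss = yyssa;     /* bottom of the state stack */
    short *yyssp = yyss;     /* top of the state stack */
    int    yystate = 0;

    *yyssp = (short) yystate;                    /* initial state */
    yystate = 17; *++yyssp = (short) yystate;    /* pretend shift */
    yystate = 42; *++yyssp = (short) yystate;    /* pretend shift */

    yyerror("syntax error");                     /* expands to the 4-argument call */
    return 1;
}
```

With the state and stack in hand, the error routine could then consult
the parser tables to work out which tokens would have been acceptable,
subject to the caveat above about backing out state moves made without
consuming input.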