Andrew Gierth <and...@tao11.riddles.org.uk> writes: > "Tom" == Tom Lane <t...@sss.pgh.pa.us> writes: > Tom> The FROM FIRST/LAST bit seems particularly badly thought through, > Tom> because AFAICS it is flat out ambiguous with a normal FROM clause > Tom> immediately following the window function call. The only way to > Tom> make it not so would be to make FIRST and LAST be fully reserved, > Tom> which is neither a good idea nor spec-compliant.
> In the actual spec syntax it's not ambiguous at all because NTH_VALUE is > a reserved word (as are LEAD, LAG, FIRST_VALUE and LAST_VALUE), and OVER > is a mandatory clause in its syntax, so a FROM appearing before the OVER > must be part of a FROM FIRST/LAST and not introducing a FROM-clause. Hmm ... > In our syntax, if we made NTH_VALUE etc. a col_name_keyword (and thus > not legal as a function name outside its own special syntax) it would > also become unambiguous. > i.e. given this token sequence (with . marking the current posision): > select nth_value(x) . from first ignore > if we know up front that "nth_value" is a window function and not any > other kind of function, we know that we have to shift the "from" rather > than reducing the select-list because we haven't seen an "over" yet. I don't really find that to be a desirable solution, because quite aside from the extensibility problem, it would mean that a lot of errors become "syntax error" where we formerly gave a more useful message. This does open up a thought about how to proceed, though. I'd been trying to think of a way to solve this using base_yylex's ability to do some internal lookahead and change token types based on that. If you just think of recognizing FROM FIRST/LAST, you get nowhere because that's still legal in other contexts. But if you were to look for FROM followed by FIRST/LAST followed by IGNORE/RESPECT/OVER, I think that could only validly happen in this syntax. It'd take some work to extend base_yylex to look ahead 2 tokens not one, but I'm sure that could be done. (You'd also need a lookahead rule to match "IGNORE/RESPECT NULLS OVER", but that seems just as doable.) Then the relevant productions use FROM_LA, IGNORE_LA, RESPECT_LA instead of the corresponding bare tokens, and the grammar no longer has an ambiguity problem. regards, tom lane