On 2017-04-05 09:39:37 -0400, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > On 2017-04-05 02:47:55 -0400, Noah Misch wrote: > >> [Action required within three days. This is a generic notification.] > > > I've a very preliminary patch. I'd like to only start polishing it up > > once the code freeze is over, so I can work on getting some patches in - > > note that I myself have no pending patches. Once frozen I'll polish it > > up and send that within a few days. > > FWIW, I'm willing to help out on this.
Help would be appreciated. I've pondered, and partially implemented, several approaches so far, and my conclusion is that we should just do nothing. The amount of corner cases is just too big, and the utility of the feature too small. To recap, the issue is that in previous versions it was, by accident (there's no test, code comments "supporting" the feature talk about a different corner case, and the behaviour isn't correct in some cases) allowed to do something like: SELECT * FROM CAST(generate_series(1,3) * 5 AS int); while SELECT * FROM generate_series(1,3) * 5; is not allowed. The reason that that works from the gram.y perspective is that CAST etc types of func_expr's. The reason that it worked from a code perspective is that execQual.c:ExecMakeTableFunctionResult() has the following: /* * Normally the passed expression tree will be a FuncExprState, since the * grammar only allows a function call at the top level of a table * function reference. However, if the function doesn't return set then * the planner might have replaced the function call via constant-folding * or inlining. So if we see any other kind of expression node, execute * it via the general ExecEvalExpr() code; the only difference is that we * don't get a chance to pass a special ReturnSetInfo to any functions * buried in the expression. */ if (funcexpr && IsA(funcexpr, FuncExprState) && IsA(funcexpr->expr, FuncExpr)) and back then ExecEvalExpr() was allowed to return sets. It also depends on some default initializations (e.g. rsinfo.returnMode = SFRM_ValuePerCall). This yields plenty weird behaviour in < v10. E.g. the following is disallowed: SELECT * FROM int4mul(generate_series(1,3), 1); ERROR: 0A000: set-valued function called in context that cannot accept a set as is SELECT * FROM CAST(int4mul(generate_series(1,3), 1) AS bigint); ERROR: 0A000: set-valued function called in context that cannot accept a set because the cast is implemented as int8(expr) which avoids the fallback path as it's a FuncExpr, but SELECT * FROM CAST(int4mul(generate_series(1,3), 1) AS text); works because the cast is implemented as a io coercion, which is not a funcexpr. Therefore it uses the fallback ExecEvalExpr(). The mismatch between ExecEvalExpr() and nodeFunctionscan.c SRF behaviour back then also yields odd behaviour, e.g.: SELECT * FROM generate_series(1,0); returns zero rows, but SELECT * FROM CAST(generate_series(1,0) * 5 AS INT); returns one NULL row. In v10, as it stands, these all error out, because the SRFs are now only to be evaluated via either nodeFunctionscan.c (FROM) or via nodeProjectSet.c (targetlist), not ExecEvalExpr() anymore. I've basically pondered three different methods of implementing something akin to the old behaviour: 1) Move the non-SRF part into nodeFunctionscan.c's targetlist, and let it be evaluated there. E.g. if the expression is CAST(generate_series(1,5) AS text), evaluate the generate_series(1,5) using nodeFunctionscan's FunctionScanPerFuncState machinery, but implement the CAST as CAST(Var(whatever, 1) AS Text). That doesn't work well for composite/record returning rows, because RangeTblFunction's returning composites are expanded into multiple columns. E.g. SELECT * FROM CAST(CAST(twocol_noset_outline(generate_series(1,3), 'text') AS text) AS twocol); returns all the columns of twocol, not a single wholerow datum. There's also some issues around what to do with cases involving volatile functions when the output is not referenced, or referenced multiple times e.g. SELECT f, f FROM CAST(generate_series(1,3) * nextval(...)) AS f; would evaluate nextval() twice with such an approach... 2) During planning, split of the SRF bit from the !SRF bit and have nodeFunctionscan.c evaluate both separately, like 1). That allows to avoid the volatility issue, because the targetlist is still projected separately. I've prototyped this to a reasonable point, by having create_functionscan_plan() process each RangeTblFunction and split the expression into SRF and non-SRF parts, and then have FunctionNext() evaluate the non-SRF part. At the current state of my prototype this happens to allow simple SRF in arguments cases like SELECT * FROM int4mul(generate_series(1,3), 1) which are disallowed in < v10. This is reasonably-ish in complexity for scalar values, but gets quite complicated for composite/record datums. Suddenly there's two different types of values that we need to deal with, the type of datum returned by the SRF, and the type of datum returned by the final expression - they might be the same, they might not. Especially for record-returning functions, where ROWS FROM( AS...) determines the column types, and ExecMakeTableFunctionResult() does a tupledesc_match() to verify the SRF returned something reasonable, it gets wild: What would we even be matching the types against? 3) Implement nested FROM SRFs as pipelined executor nodes, similar to the way targetlist SRFs are now handled. E.g. something like SELECT * FROM CAST(generate_series(1,10) * 5 AS int); would be implemented as one nodeFunctionscan.c for the generate_series(), and then something like nodeProjectSet.c (or a nodeFunctionscan.c branch that'd make a sub-node available) would evaluate the CAST(Var() * 5 AS int). This approach has the advantage that it'd allow us, potentially in the future, to get rid of the restriction that normal ROWS FROM doesn't allow SRFs in arguments. I think this again runs into trouble with row-expansion in the return value from SRFs, which isn't actually desired till the outermost "level" of the expression. I guess we could add a mode to nodeFunctionscan.c that forces composite/record types to be returned as wholerow vars instead of being split up. My conclusion here is that it's way too complicated to properly implement a feature that only seems to exist by accident and has plenty of weird behaviour. Currently we continue to accept all the !SRF expressions that were previously accepted, but I'd even consider insisting that the expression needs to be a proper FuncExpr at parse-analysis time (before inlining/const evaluation did its thing). Unless somebody has a radically better idea how to implement this? - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers