cloud-fan commented on code in PR #41817:
URL: https://github.com/apache/spark/pull/41817#discussion_r1253718750
########## sql/core/src/test/resources/sql-tests/analyzer-results/column-resolution-aggregate.sql.out: ##########
@@ -94,27 +94,23 @@ org.apache.spark.sql.AnalysisException
 -- !query
 SELECT k AS lca, lca + 1 AS col FROM v1 GROUP BY k, col
 -- !query analysis
-Project [lca#x, (lca#x + 1) AS col#x]
-+- Project [k#x, k#x AS lca#x]
-   +- Aggregate [k#x, (k#x + 1)], [k#x]
-      +- SubqueryAlias v1
-         +- View (`v1`, [a#x,b#x,k#x])
-            +- Project [cast(a#x as int) AS a#x, cast(b#x as int) AS b#x, cast(k#x as int) AS k#x]
-               +- SubqueryAlias t
-                  +- LocalRelation [a#x, b#x, k#x]
+Aggregate [k#x, (k#x + 1)], [k#x AS lca#x, (k#x + 1) AS col#x]

Review Comment:
   The key point of LCA is that the LCA expression should be evaluated only once; that's why I said we need to add a Project below the Aggregate, to evaluate the LCA expression before grouping. I don't think this is an easy job, considering all kinds of cases: nested LCA, LCA with an aggregate function, LCA with a grouping expression, etc. I suggest reverting the commit if we can't come up with a clear solution.
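
   For illustration, here is a minimal SQL sketch of the plan shape described above: the aliased expression is computed once in an inner query (the Project below the Aggregate), and the grouping then runs over the pre-computed column. The subquery alias `pre` is made up for this example; this is a sketch of the intent, not the analyzer's actual rewrite.

   ```sql
   -- Sketch only: evaluate the lateral alias once, before grouping.
   -- `pre` is a hypothetical subquery alias, not part of the test case.
   SELECT lca, lca + 1 AS col
   FROM (SELECT k AS lca FROM v1) AS pre   -- Project below the Aggregate
   GROUP BY lca, lca + 1;                  -- same grouping keys as k, k + 1
   ```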
