cloud-fan commented on code in PR #41817:
URL: https://github.com/apache/spark/pull/41817#discussion_r1253718750
########## sql/core/src/test/resources/sql-tests/analyzer-results/column-resolution-aggregate.sql.out: ##########
@@ -94,27 +94,23 @@ org.apache.spark.sql.AnalysisException
 -- !query
 SELECT k AS lca, lca + 1 AS col FROM v1 GROUP BY k, col
 -- !query analysis
-Project [lca#x, (lca#x + 1) AS col#x]
-+- Project [k#x, k#x AS lca#x]
-   +- Aggregate [k#x, (k#x + 1)], [k#x]
-      +- SubqueryAlias v1
-         +- View (`v1`, [a#x,b#x,k#x])
-            +- Project [cast(a#x as int) AS a#x, cast(b#x as int) AS b#x, cast(k#x as int) AS k#x]
-               +- SubqueryAlias t
-                  +- LocalRelation [a#x, b#x, k#x]
+Aggregate [k#x, (k#x + 1)], [k#x AS lca#x, (k#x + 1) AS col#x]

Review Comment:
   The key point of LCA is that the LCA expression should be evaluated only once; that's why I said we need to add a Project below the Aggregate, to evaluate the LCA expression before grouping. I don't think this is an easy job, considering all kinds of cases: nested LCA, LCA with an aggregate function, LCA with a grouping expression, etc. I suggest reverting the commit if we can't come up with a clear solution.
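
   For illustration, here is a minimal SQL sketch of the plan shape described above: the aliased expression is computed once in an inner query (the Project below the Aggregate), and the grouping then runs over the pre-computed column. The subquery alias `pre` is made up for this example; this is a sketch of the intent, not the analyzer's actual rewrite.

   ```sql
   -- Sketch only: evaluate the lateral alias once, before grouping.
   -- `pre` is a hypothetical subquery alias, not part of the test case.
   SELECT lca, lca + 1 AS col
   FROM (SELECT k AS lca FROM v1) AS pre   -- Project below the Aggregate
   GROUP BY lca, lca + 1;                  -- same grouping keys as k, k + 1
   ```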
