ulysses-you commented on code in PR #37525:
URL: https://github.com/apache/spark/pull/37525#discussion_r1070753260


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/AliasAwareOutputExpression.scala:
##########
@@ -70,53 +66,16 @@ trait AliasAwareOutputExpression extends SQLConfHelper {
   protected def normalizeExpression(
       expr: Expression,
       pruneFunc: (Expression, AttributeSet) => Option[Expression]): Seq[Expression] = {
-    val normalizedCandidates = new mutable.HashSet[Expression]()
-    normalizedCandidates.add(expr)
     val outputSet = AttributeSet(outputExpressions.map(_.toAttribute))
-
-    def pruneCandidate(candidate: Expression): Option[Expression] = {
+    expr.multiTransform {

Review Comment:
   Based on my usage, `multiTransform` should support at least these pruning cases:
   1. a maximum limit on the number of returned results
   2. an eager pruning function that can:
       - prune a result whose references are not a subset of the output
       - prune an intermediate result if the alias map does not contain any other sub-expression
       - prune a sub-expression, e.g. `PartitionCollection(a, b)` -> `PartitionCollection(a)` if `b` is not a subset of the output
   
   If all of these requirements can be met, I think it's good to switch to `multiTransform`.
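
To make the pruning cases above concrete, here is a minimal, self-contained sketch on a toy expression type. It does not use Catalyst or the `multiTransform` API from this PR; `Expr`, `Attr`, `Add`, `Coll` (a stand-in for `PartitionCollection`), `alternatives`, and `normalize` are all hypothetical names, and the alias map is assumed to map an attribute name to the names it is aliased as.

```scala
object PruningSketch {
  // Toy expression tree; NOT Catalyst's Expression.
  sealed trait Expr { def refs: Set[String] }
  final case class Attr(name: String) extends Expr {
    def refs: Set[String] = Set(name)
  }
  final case class Add(left: Expr, right: Expr) extends Expr {
    def refs: Set[String] = left.refs ++ right.refs
  }
  // Hypothetical stand-in for PartitionCollection.
  final case class Coll(children: Seq[Expr]) extends Expr {
    def refs: Set[String] = children.flatMap(_.refs).toSet
  }

  // Lazily enumerates rewritten forms of `e` that only reference `output`.
  def alternatives(
      e: Expr,
      aliases: Map[String, Seq[String]],
      output: Set[String]): LazyList[Expr] = e match {
    case Coll(children) =>
      // Sub-expression pruning: drop children that have no valid rewrite,
      // then combine the alternatives of the surviving children.
      val perChild = children.map(alternatives(_, aliases, output)).filter(_.nonEmpty)
      if (perChild.isEmpty) LazyList.empty
      else {
        val combos = perChild.foldLeft(LazyList(Vector.empty[Expr])) { (acc, alts) =>
          for (prefix <- acc; a <- alts) yield prefix :+ a
        }
        combos.map(kept => Coll(kept))
      }
    case _ if !e.refs.exists(aliases.contains) =>
      // Intermediate pruning: nothing below can be rewritten any further,
      // so keep the expression only if it is already valid against the output.
      if (e.refs.subsetOf(output)) LazyList(e) else LazyList.empty
    case Attr(name) =>
      // Result pruning: keep only names that the output actually exposes.
      (name +: aliases(name)).distinct.to(LazyList).filter(output.contains).map(n => Attr(n))
    case Add(left, right) =>
      for {
        l <- alternatives(left, aliases, output)
        r <- alternatives(right, aliases, output)
      } yield Add(l, r)
  }

  // Size limit: the enumeration is lazy, so `take` stops the search early.
  def normalize(
      e: Expr,
      aliases: Map[String, Seq[String]],
      output: Set[String],
      limit: Int): List[Expr] =
    alternatives(e, aliases, output).take(limit).toList

  def main(args: Array[String]): Unit = {
    val aliases = Map("a" -> Seq("x"), "b" -> Seq("y", "z"))
    val output = Set("x", "y", "z")
    println(normalize(Add(Attr("a"), Attr("b")), aliases, output, limit = 1))
    println(normalize(Coll(Seq(Attr("a"), Attr("d"))), aliases, output, limit = 10))
  }
}
```

Because the candidates are produced lazily, the limit bounds the work done rather than trimming an already materialized result, which is the property the first requirement is asking for.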



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

