ulysses-you commented on code in PR #37525:
URL: https://github.com/apache/spark/pull/37525#discussion_r1070753260
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/AliasAwareOutputExpression.scala:
##########
@@ -70,53 +66,16 @@ trait AliasAwareOutputExpression extends SQLConfHelper {
protected def normalizeExpression(
expr: Expression,
pruneFunc: (Expression, AttributeSet) => Option[Expression]):
Seq[Expression] = {
- val normalizedCandidates = new mutable.HashSet[Expression]()
- normalizedCandidates.add(expr)
val outputSet = AttributeSet(outputExpressions.map(_.toAttribute))
-
- def pruneCandidate(candidate: Expression): Option[Expression] = {
+ expr.multiTransform {
Review Comment:
Based on my usage, `multiTransform` should support at least the following
kinds of pruning:
1. a maximum size limit on the returned result
2. an eager pruning function that can:
   - prune a result whose references are not a subset of the output
   - prune an intermediate result if the alias map does not contain any other
     sub-expression
   - prune a sub-expression, e.g. `PartitionCollection(a, b)` ->
     `PartitionCollection(a)` if `b` is not a subset of the output

If all of these requirements can be met, I think it's good to switch to
`multiTransform`.
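
To make the requirements above concrete, here is a minimal, self-contained sketch of the pruning behaviour I have in mind. It does not use Catalyst or the real `multiTransform` API: `Expr`, `Attr`, `Add`, `Coll`, `PruningSketch`, `rewrites` and `normalize` are all hypothetical names invented for illustration, `Coll` only stands in for a collection-like node such as `PartitionCollection`, and `LazyList` assumes Scala 2.13 or later.

```scala
// Toy expression tree. These types are illustrative stand-ins, not Catalyst classes.
sealed trait Expr { def refs: Set[String] }
case class Attr(name: String) extends Expr {
  def refs: Set[String] = Set(name)
}
case class Add(left: Expr, right: Expr) extends Expr {
  def refs: Set[String] = left.refs ++ right.refs
}
// Stand-in for a collection-like node such as PartitionCollection.
case class Coll(children: Seq[Expr]) extends Expr {
  def refs: Set[String] = children.flatMap(_.refs).toSet
}

object PruningSketch {
  // Lazily enumerates rewrites of `e` whose references are all in `output`.
  // `aliases` maps a child-side expression to the output expressions aliasing it.
  def rewrites(
      e: Expr,
      aliases: Map[Expr, Seq[Expr]],
      output: Set[String]): LazyList[Expr] = {
    val viaAlias: LazyList[Expr] = LazyList.from(aliases.getOrElse(e, Seq.empty[Expr]))
    val viaChildren: LazyList[Expr] = e match {
      // An attribute that is not part of the output yields no candidate.
      case a: Attr =>
        if (output.contains(a.name)) LazyList(a) else LazyList.empty
      // If `l` has no rewrite, the rewrites of `r` are never explored.
      case Add(l, r) =>
        for {
          nl <- rewrites(l, aliases, output)
          nr <- rewrites(r, aliases, output)
        } yield Add(nl, nr)
      // Drop children that have no rewrite and keep the remaining ones.
      case Coll(cs) =>
        val kept = cs.map(rewrites(_, aliases, output)).filter(_.nonEmpty)
        if (kept.isEmpty) {
          LazyList.empty
        } else {
          // Lazy cartesian product over the alternatives of the kept children.
          val combos = kept.foldRight(LazyList(List.empty[Expr])) { (alts, acc) =>
            for (a <- alts; rest <- acc) yield a :: rest
          }
          combos.map(children => Coll(children))
        }
    }
    viaAlias ++ viaChildren
  }

  // Caps the number of returned candidates; only `limit` results are materialized.
  def normalize(
      e: Expr,
      aliases: Map[Expr, Seq[Expr]],
      output: Set[String],
      limit: Int): Seq[Expr] =
    rewrites(e, aliases, output).take(limit).toList

  def main(args: Array[String]): Unit = {
    // Models a projection of `x AS a, y` over a child that produces x, y and z.
    val aliases: Map[Expr, Seq[Expr]] = Map(Attr("x") -> Seq(Attr("a")))
    val output = Set("a", "y")
    println(normalize(Add(Attr("x"), Attr("y")), aliases, output, 10))
    println(normalize(Add(Attr("z"), Attr("x")), aliases, output, 10))
    println(normalize(Coll(Seq(Attr("x"), Attr("z"))), aliases, output, 10))
  }
}
```

Because the candidates are produced lazily, the size limit and the eager pruning compose: a branch with no valid rewrite short-circuits its siblings, and `take(limit)` stops the enumeration as soon as enough candidates exist. Whether `multiTransform` can offer the same guarantees is exactly what I'd like to confirm.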
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]