heyihong commented on code in PR #53637:
URL: https://github.com/apache/spark/pull/53637#discussion_r2661454041
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala:
##########
@@ -630,6 +630,20 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]]
foreach(actualFunc)
}
+ /**
+ * A variant of [[foreachWithSubqueries]] with pruning support.
+ * Only traverses nodes that match the given condition.
+ */
+ def foreachWithSubqueriesAndPruning(
+ cond: TreePatternBits => Boolean)(f: PlanType => Unit): Unit = {
+ if (!cond.apply(this)) {
+ return
+ }
+ f(this)
+ subqueries.foreach(_.foreachWithSubqueriesAndPruning(cond)(f))
Review Comment:
The traversal order of `foreachWithSubqueries` should also be subqueries
first, then children, but its traversal implementation is a bit hard to read
and understand, in my opinion.
After taking a closer look at the code, I think it may be a style choice for
`TreeNode` methods to handle traversal logic (including withPruning), while
QueryPlan simply leverages that to perform additional tasks, such as handling
subqueries.
So the correctness may not be affected, but style-wise, it makes more sense
to move the pruning logic to TreeNode.
I created a separate pr to simplify its traversal implementation:
https://github.com/apache/spark/pull/53681
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]