heyihong commented on code in PR #53637:
URL: https://github.com/apache/spark/pull/53637#discussion_r2661454041
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala:
##########
@@ -630,6 +630,20 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]]
foreach(actualFunc)
}
+ /**
+ * A variant of [[foreachWithSubqueries]] with pruning support.
+ * Only traverses nodes that match the given condition.
+ */
+ def foreachWithSubqueriesAndPruning(
+ cond: TreePatternBits => Boolean)(f: PlanType => Unit): Unit = {
+ if (!cond.apply(this)) {
+ return
+ }
+ f(this)
+ subqueries.foreach(_.foreachWithSubqueriesAndPruning(cond)(f))
Review Comment:
The traversal order of `foreachWithSubqueries` should also be subqueries
first, then children, but its traversal implementation is a bit hard to read
and understand, in my opinion.
After taking a closer look at the code, I think it may be a style choice for
`TreeNode` methods to handle plan traversal logic (including withPruning),
while QueryPlan simply leverages that to perform additional tasks, such as
handling subqueries.
So the correctness may not be affected, but style-wise, it makes more sense
to move the pruning logic to TreeNode.
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala:
##########
@@ -630,6 +630,20 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]]
foreach(actualFunc)
}
+ /**
+ * A variant of [[foreachWithSubqueries]] with pruning support.
+ * Only traverses nodes that match the given condition.
+ */
+ def foreachWithSubqueriesAndPruning(
+ cond: TreePatternBits => Boolean)(f: PlanType => Unit): Unit = {
+ if (!cond.apply(this)) {
+ return
+ }
+ f(this)
+ subqueries.foreach(_.foreachWithSubqueriesAndPruning(cond)(f))
Review Comment:
Discussed with @cloud-fan offline, while the current convention is to let
TreeNode methods to handle plan traversal logic (including withPruning) for
`foreachWithSubqueriesAndPruning`, a better implementation should avoid doing
so.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]