sigmod commented on a change in pull request #32060:
URL: https://github.com/apache/spark/pull/32060#discussion_r608490232
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala
##########
@@ -80,28 +105,57 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]]
* transformExpressionsDown or transformExpressionsUp should be used.
*
* @param rule the rule to be applied to every expression in this operator.
+ * @param cond a lambda expression to prune tree traversals. If `cond.apply` returns false
+ *             on an expression T, skips processing T and its subtree; otherwise, processes
+ *             T and its subtree recursively.
+ * @param ruleId is a unique Id for `rule` to prune unnecessary tree traversals. When it is
+ *               RuleId.UnknownId, no pruning happens. Otherwise, if `rule` (with id `ruleId`)
+ *               has been marked as ineffective on an expression T, skips processing T and
+ *               its subtree. Do not pass it if the rule is not purely functional and reads
+ *               a varying initial state for different invocations.
   */
-  def transformExpressions(rule: PartialFunction[Expression, Expression]): this.type = {
-    transformExpressionsDown(rule)
+  def transformExpressions(rule: PartialFunction[Expression, Expression],
+    cond: TreePatternBits => Boolean = AlwaysProcess.fn,
Review comment:
>> one should be contains(..)
I think there are three variations:
- containsPattern(P)
- containsAllPatterns(P1, P2, P3)
- containsAnyPatterns(P1, P2, P3)
If we have a helper, we still need to pass it (1) the pattern enums
and (2) an "all" vs. "any" enum.
We can discuss offline later.
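
To make the three variations concrete, here is a minimal, self-contained sketch of what such condition helpers could look like. This is not the actual Spark `TreePatternBits` implementation; the `TreePattern` enum values and the `Node` class are hypothetical, and the helper names simply mirror the ones listed above.

```scala
import java.util.BitSet

// Hypothetical pattern enum; real Catalyst patterns would live elsewhere.
object TreePattern extends Enumeration {
  type TreePattern = Value
  val LITERAL, ATTRIBUTE_REFERENCE, CAST = Value
}
import TreePattern._

// Sketch of a bit-set-based pruning trait with the three variants
// discussed above: single pattern, "all" semantics, and "any" semantics.
trait TreePatternBits {
  def treePatternBits: BitSet

  // True if this (sub)tree contains the given pattern.
  def containsPattern(t: TreePattern): Boolean =
    treePatternBits.get(t.id)

  // True only if every listed pattern is present ("all" semantics).
  def containsAllPatterns(patterns: TreePattern*): Boolean =
    patterns.forall(containsPattern)

  // True if at least one listed pattern is present ("any" semantics).
  def containsAnyPatterns(patterns: TreePattern*): Boolean =
    patterns.exists(containsPattern)
}

// Toy node that records which patterns appear in its subtree.
case class Node(patterns: TreePattern*) extends TreePatternBits {
  val treePatternBits: BitSet = {
    val bits = new BitSet
    patterns.foreach(p => bits.set(p.id))
    bits
  }
}
```

Under this sketch, a rule that only rewrites literals could pass `_.containsPattern(LITERAL)` as the `cond` argument of `transformExpressions`, skipping any subtree whose bit set lacks that pattern.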
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.