cloud-fan commented on a change in pull request #29101:
URL: https://github.com/apache/spark/pull/29101#discussion_r455692938



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
##########
@@ -201,126 +201,50 @@ trait PredicateHelper extends Logging {
     case e => e.children.forall(canEvaluateWithinJoin)
   }
 
-  /**
-   * Convert an expression into conjunctive normal form.
-   * Definition and algorithm: https://en.wikipedia.org/wiki/Conjunctive_normal_form
-   * CNF can explode exponentially in the size of the input expression when converting [[Or]]
-   * clauses. Use a configuration [[SQLConf.MAX_CNF_NODE_COUNT]] to prevent such cases.
-   *
-   * @param condition to be converted into CNF.
-   * @return the CNF result as sequence of disjunctive expressions. If the number of expressions
-   *         exceeds threshold on converting `Or`, `Seq.empty` is returned.
+  /*
+   * Returns a filter that it's output is a subset of `outputSet` and it contains all possible
+   * constraints from `condition`. This is used for predicate pushdown.
+   * When there is no such convertible filter, `None` is returned.
    */
-  protected def conjunctiveNormalForm(
-      condition: Expression,
-      groupExpsFunc: Seq[Expression] => Seq[Expression]): Seq[Expression] = {
-    val postOrderNodes = postOrderTraversal(condition)
-    val resultStack = new mutable.Stack[Seq[Expression]]
-    val maxCnfNodeCount = SQLConf.get.maxCnfNodeCount
-    // Bottom up approach to get CNF of sub-expressions
-    while (postOrderNodes.nonEmpty) {
-      val cnf = postOrderNodes.pop() match {
-        case _: And =>
-          val right = resultStack.pop()
-          val left = resultStack.pop()
-          left ++ right
-        case _: Or =>
-          // For each side, there is no need to expand predicates of the same references.
-          // So here we can aggregate predicates of the same qualifier as one single predicate,
-          // for reducing the size of pushed down predicates and corresponding codegen.
-          val right = groupExpsFunc(resultStack.pop())
-          val left = groupExpsFunc(resultStack.pop())
-          // Stop the loop whenever the result exceeds the `maxCnfNodeCount`
-          if (left.size * right.size > maxCnfNodeCount) {
-            logInfo(s"As the result size exceeds the threshold $maxCnfNodeCount. " +
-              "The CNF conversion is skipped and returning Seq.empty now. To avoid this, you can " +
-              s"raise the limit ${SQLConf.MAX_CNF_NODE_COUNT.key}.")
-            return Seq.empty
-          } else {
-            for { x <- left; y <- right } yield Or(x, y)
-          }
-        case other => other :: Nil
+  protected def convertibleFilter(

Review comment:
       The name is a bit weird. How about `extractPredicatesWithinOutputSet`?
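
       For context, here is a minimal sketch of the behavior the new doc comment describes: keep only the parts of `condition` whose references fall within `outputSet`, and return `None` when nothing can be kept. It is not the code in this PR. `Expr`, `Pred`, and `PushdownSketch` are hypothetical toy stand-ins for Catalyst's expression classes, and column names are modeled as plain strings.

```scala
object PushdownSketch {
  // Toy expression tree (an assumption for illustration, NOT Spark's Expression).
  sealed trait Expr { def references: Set[String] }
  case class Pred(name: String, references: Set[String]) extends Expr
  case class And(left: Expr, right: Expr) extends Expr {
    def references: Set[String] = left.references ++ right.references
  }
  case class Or(left: Expr, right: Expr) extends Expr {
    def references: Set[String] = left.references ++ right.references
  }

  // Returns the strongest filter implied by `condition` that only uses
  // columns in `outputSet`, or None if no such filter exists.
  def extractPredicatesWithinOutputSet(
      condition: Expr,
      outputSet: Set[String]): Option[Expr] = condition match {
    case And(l, r) =>
      // For a conjunction, either side alone is still implied by the whole.
      val lhs = extractPredicatesWithinOutputSet(l, outputSet)
      val rhs = extractPredicatesWithinOutputSet(r, outputSet)
      (lhs, rhs) match {
        case (Some(a), Some(b)) => Some(And(a, b))
        case (a, b) => a.orElse(b)
      }
    case Or(l, r) =>
      // For a disjunction, both sides must be convertible, otherwise
      // dropping one side would filter out valid rows.
      for {
        a <- extractPredicatesWithinOutputSet(l, outputSet)
        b <- extractPredicatesWithinOutputSet(r, outputSet)
      } yield Or(a, b)
    case other =>
      if (other.references.subsetOf(outputSet)) Some(other) else None
  }
}
```

       With the suggested name, a call such as `extractPredicatesWithinOutputSet(And(p, q), Set("a"))` reads as "the predicates of this condition that lie within this output set", which is the point of the rename.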



