Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10163#discussion_r47185568
  
    --- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
 ---
    @@ -252,6 +253,27 @@ case class And(left: Expression, right: Expression) 
extends BinaryOperator with
         }
       }
     
    +  override def semanticEquals(other: Expression): Boolean = this.getClass 
== other.getClass && {
    +    // Non-deterministic expressions cannot be semantic equal
    +    if (!deterministic || !other.deterministic) return false
    +
    +    // We already know both expressions are And, so we can tolerate 
ordering different
    +    // Recursively call semanticEquals on subexpressions to check the 
equivalency of two seqs.
    +    var elements1 = splitConjunctivePredicates(this)
    +    val elements2 = splitConjunctivePredicates(other)
    +    // We can recursively call semanticEquals to check the equivalency for 
subexpressions, but
    +    // there is no simple solution to compare the equivalency of sequence 
of expressions.
    +    // Expression class doesn't have order, so we couldn't sort them. We 
can neither use
    +    // set comparison as Set doesn't support custom compare function, 
which is semanticEquals.
    +    // To check the equivalency of elements1 and elements2, we first 
compare their size. Then
    +    // for each element in elements2, we remove its first semantically 
equivalent expression from
    +    // elements1. If they are semantically equivalent, elements1 should be 
empty at the end.
    +    elements1.size == elements2.size && {
    +      for (e <- elements2) elements1 = 
removeFirstSemanticEquivalent(elements1, e)
    +      elements1.isEmpty
    --- End diff --
    
    ah I see, this makes sense.
    But I think a better way is to add an optimization rule to turn all 
predicates into CNF, before we begin to check the semantic, or it will be hard 
to cover all cases like `a || (b && c) == (a || b) && (a || c)`
    
    cc @liancheng 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to