GitHub user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22702#discussion_r224658881
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala ---
    @@ -276,15 +276,15 @@ object BooleanSimplification extends Rule[LogicalPlan] with PredicateHelper {
           case a And b if a.semanticEquals(b) => a
           case a Or b if a.semanticEquals(b) => a
     
    -      case a And (b Or c) if Not(a).semanticEquals(b) => And(a, c)
    -      case a And (b Or c) if Not(a).semanticEquals(c) => And(a, b)
    -      case (a Or b) And c if a.semanticEquals(Not(c)) => And(b, c)
    -      case (a Or b) And c if b.semanticEquals(Not(c)) => And(a, c)
    -
    -      case a Or (b And c) if Not(a).semanticEquals(b) => Or(a, c)
    -      case a Or (b And c) if Not(a).semanticEquals(c) => Or(a, b)
    -      case (a And b) Or c if a.semanticEquals(Not(c)) => Or(b, c)
    -      case (a And b) Or c if b.semanticEquals(Not(c)) => Or(a, c)
    +      case a And (b Or c) if !a.nullable && Not(a).semanticEquals(b) => And(a, c)
    --- End diff --
    
    After more thought, leaving `a And (b Or c)` as is when `a` is nullable
    should be better than rewriting it to `If(IsNull(a), null, And(a, c))`,
    as the plain `And`/`Or` form is more likely to get pushed down to the
    data source. So the changes here LGTM.
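
For readers wondering why the `!a.nullable` guard is needed at all, here is a minimal sketch of SQL three-valued logic that models NULL as `None`. It does not use Catalyst; `ThreeValued`, `TV`, and the helper names are hypothetical and exist only for illustration. It enumerates the inputs for which `a And (Not(a) Or c)` and `And(a, c)` disagree, and checks that the `If(IsNull(a), null, And(a, c))` form mentioned above agrees with the original everywhere.

```scala
// Illustrative model only: Option[Boolean] stands in for a nullable SQL boolean.
object ThreeValued {
  type TV = Option[Boolean] // None plays the role of SQL NULL

  def not(a: TV): TV = a.map(!_)

  // Kleene AND: false dominates, otherwise NULL is contagious.
  def and(a: TV, b: TV): TV = (a, b) match {
    case (Some(false), _) | (_, Some(false)) => Some(false)
    case (Some(true), Some(true))            => Some(true)
    case _                                   => None
  }

  // Kleene OR: true dominates, otherwise NULL is contagious.
  def or(a: TV, b: TV): TV = (a, b) match {
    case (Some(true), _) | (_, Some(true)) => Some(true)
    case (Some(false), Some(false))        => Some(false)
    case _                                 => None
  }

  val values: Seq[TV] = Seq(Some(true), Some(false), None)

  // Original expression: a And (Not(a) Or c).
  def original(a: TV, c: TV): TV = and(a, or(not(a), c))

  // Unguarded simplification: And(a, c).
  def simplified(a: TV, c: TV): TV = and(a, c)

  // Null-safe alternative: If(IsNull(a), null, And(a, c)).
  def nullSafe(a: TV, c: TV): TV = if (a.isEmpty) None else and(a, c)

  // All (a, c) pairs where the unguarded simplification changes the result.
  def counterexamples: Seq[(TV, TV)] =
    for {
      a <- values
      c <- values
      if original(a, c) != simplified(a, c)
    } yield (a, c)

  // True if the null-safe form matches the original for every input.
  def nullSafeIsEquivalent: Boolean =
    values.forall(a => values.forall(c => original(a, c) == nullSafe(a, c)))
}
```

In this model the only disagreement is `a = NULL, c = false`: the original evaluates to NULL while `And(a, c)` evaluates to false. Both the guard and the `If(IsNull(a), ...)` form avoid that case; the guard simply keeps the expression in a shape a data source is more likely to accept.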

