sunchao commented on a change in pull request #33930:
URL: https://github.com/apache/spark/pull/33930#discussion_r721710731



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -441,6 +457,27 @@ object BooleanSimplification extends Rule[LogicalPlan] with PredicateHelper {
 
       case Not(IsNull(e)) => IsNotNull(e)
       case Not(IsNotNull(e)) => IsNull(e)
+
+      // Move `Not` from one side of `EqualTo`/`EqualNullSafe` to the other side if it's beneficial.
+      // E.g. `EqualTo(Not(a), b)` where `b = Not(c)`, it will become
+      // `EqualTo(a, Not(b))` => `EqualTo(a, Not(Not(c)))` => `EqualTo(a, c)`
+      // In addition, `if canSimplifyNot(b)` checks if the optimization can converge
+      // that avoids the situation two conditions are returning to each other.
+      case EqualTo(Not(a), b) if !canSimplifyNot(a) && canSimplifyNot(b) => EqualTo(a, Not(b))

Review comment:
       Thanks for the update! Shall we put `NotPropagation` before `BooleanSimplification` so that we don't need the extra pass?
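
       To illustrate why the order matters, here is a minimal sketch on a toy expression tree. It is not Catalyst code: `Col`, the toy `Not`/`EqualTo` nodes, `notPropagation`, `booleanSimplification` and this simplified `canSimplifyNot` are all hypothetical stand-ins, and the real optimizer runs rules in batches rather than as plain function calls.

```scala
// Toy model only; none of these definitions are Spark's actual classes or rules.
object RuleOrderSketch {
  sealed trait Expr
  final case class Col(name: String) extends Expr
  final case class Not(child: Expr) extends Expr
  final case class EqualTo(left: Expr, right: Expr) extends Expr

  // Assumed stand-in for the PR's `canSimplifyNot`: in this toy, wrapping an
  // expression in Not is only considered simplifiable if it is already a Not.
  def canSimplifyNot(e: Expr): Boolean = e match {
    case Not(_) => true
    case _      => false
  }

  // Stand-in for the Not-moving rewrite quoted in the diff (top level only).
  def notPropagation(e: Expr): Expr = e match {
    case EqualTo(Not(a), b) if !canSimplifyNot(a) && canSimplifyNot(b) =>
      EqualTo(a, Not(b))
    case other => other
  }

  // Stand-in for the double-negation part of boolean simplification.
  def booleanSimplification(e: Expr): Expr = e match {
    case Not(Not(inner)) => booleanSimplification(inner)
    case Not(child)      => Not(booleanSimplification(child))
    case EqualTo(l, r)   => EqualTo(booleanSimplification(l), booleanSimplification(r))
    case col: Col        => col
  }

  def main(args: Array[String]): Unit = {
    val input = EqualTo(Not(Col("a")), Not(Col("c")))
    // Propagate first: the double negation it creates is removed right away.
    println(booleanSimplification(notPropagation(input)))
    // Simplify first: the double negation is created too late for this pass.
    println(notPropagation(booleanSimplification(input)))
  }
}
```

       In this toy, propagating first reaches `EqualTo(a, c)` in one sweep, while simplifying first stops at `EqualTo(a, Not(Not(c)))` and needs another simplification pass.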

##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -813,6 +860,37 @@ object NullPropagation extends Rule[LogicalPlan] {
 }
 
 
+/**
+ * Unwrap the input of IsNull/IsNotNull if the input is NullIntolerant
+ * E.g. IsNull(Not(null)) == IsNull(null)
+ */
+object NullDownPropagation extends Rule[LogicalPlan] {
+  // Not all NullIntolerant can be propagated
+  // Return false if the expression may return null without non-null inputs.
+  // E.g. Cast is NullIntolerant; however, cast('Infinity' as integer) returns true
+  // Cannot apply to `ExtractValue` as the query planner uses the trait to resolve the columns.
+  // E.g. the planner may resolve column `a` to `a#123`, then IsNull(a#123) cannot be optimized
+  // Cannot apply to `EqualTo` as applying this optimization is too disruptive for some tests.
+  // E.g. [SPARK-32290]
+  private def supportedNullIntolerant(e: NullIntolerant): Boolean = (e match {
+    case _: Not => true
+    case _: GreaterThan | _: GreaterThanOrEqual | _: LessThan | _: LessThanOrEqual => true

Review comment:
       Suppose we're converting `IsNull(a > b)` to `Or(IsNull(a), IsNull(b))`. What if `b` is non-deterministic? Its evaluation could be skipped when `IsNull(a)` is already true, because `Or` short-circuits.
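
       A minimal sketch of the concern, using a toy evaluator rather than Catalyst. Every name below (`Lit`, `Counting`, `isNullOfGreaterThan`, `orOfIsNulls`) is hypothetical, and the toy assumes the unrewritten comparison evaluates both operands; that is an assumption of this model, not a statement about how Catalyst evaluates `a > b`.

```scala
// Toy model only; this is not Spark's expression API.
object NonDeterministicSkipSketch {
  // An expression evaluates to Some(value), or None to represent SQL NULL.
  sealed trait Expr { def eval(): Option[Int] }

  final case class Lit(value: Option[Int]) extends Expr {
    def eval(): Option[Int] = value
  }

  // Stand-in for a non-deterministic expression: counts its evaluations.
  final class Counting(result: Option[Int]) extends Expr {
    var evaluations: Int = 0
    def eval(): Option[Int] = { evaluations += 1; result }
  }

  // Models IsNull(a > b) under the ASSUMPTION that both operands are evaluated.
  def isNullOfGreaterThan(a: Expr, b: Expr): Boolean = {
    val left = a.eval()
    val right = b.eval()
    left.isEmpty || right.isEmpty
  }

  // Models Or(IsNull(a), IsNull(b)): `||` short-circuits, so b is skipped
  // whenever a is already null.
  def orOfIsNulls(a: Expr, b: Expr): Boolean =
    a.eval().isEmpty || b.eval().isEmpty

  def main(args: Array[String]): Unit = {
    val b1 = new Counting(Some(1))
    isNullOfGreaterThan(Lit(None), b1)
    val b2 = new Counting(Some(1))
    orOfIsNulls(Lit(None), b2)
    println(s"original form evaluated b ${b1.evaluations} time(s)")
    println(s"rewritten form evaluated b ${b2.evaluations} time(s)")
  }
}
```

       Both forms return the same boolean here, but the number of times `b` is evaluated differs, which is observable if `b` has side effects or draws random values. One possible guard, offered only as a suggestion, is to apply the rewrite to deterministic children only.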




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


