davidvrba commented on a change in pull request #27231: [SPARK-28478][SQL]
Remove redundant null checks
URL: https://github.com/apache/spark/pull/27231#discussion_r371629505
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -434,6 +434,27 @@ object SimplifyConditionals extends Rule[LogicalPlan]
with PredicateHelper {
case _ => false
}
+ /**
+ * Condition for redundant null check based on intolerant expressions.
+ * @param ifNullExpr expression that takes place if checkedExpr is null
+ * @param ifNotNullExpr expression that takes place if checkedExpr is not null
+ * @param checkedExpr expression that is checked for null value
+ */
+ private def isRedundantNullCheck(
+ ifNullExpr: Expression,
+ ifNotNullExpr: Expression,
+ checkedExpr: Expression): Boolean = {
+ val isNullIntolerant = ifNotNullExpr.find { x =>
+ !x.isInstanceOf[NullIntolerant] && x.find(e => e.semanticEquals(checkedExpr)).nonEmpty
Review comment:
Actually, I think we need slightly different logic. Consider these two
examples, where `x` is the null-checked column:
1. `substring(x, coalesce(a, b), c)`
2. `substring(coalesce(x, d), a, c)`
For 1. we need to treat the expression as null-intolerant (even though
`coalesce` is null-tolerant): if `x` is null, we replace the `substring` with
a null value no matter what the other children are. For 2. we need to treat it
as null-tolerant and must not replace the `substring` with a null value,
because `coalesce` can return `d` when `x` is null. So we need to check the
expression with respect to the position of `x` (the column that is being
null-checked). Does that make sense?
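
To make the position-dependent check concrete, here is a minimal sketch on a toy expression tree. It does not use Catalyst classes: `Expr`, `Col`, `Substring`, `Coalesce`, `isNullIntolerant` and `nullPropagates` are hypothetical names made up for illustration, and plain `==` stands in for `semanticEquals`. The idea is that a null `x` forces the whole expression to null only if every node on some path from the root down to `x` is null-intolerant.

```scala
object NullCheckSketch {
  // Toy expression tree (an assumption for illustration, not Spark's Expression).
  sealed trait Expr { def children: Seq[Expr] }
  case class Col(name: String) extends Expr { val children: Seq[Expr] = Nil }
  case class Substring(str: Expr, pos: Expr, len: Expr) extends Expr {
    val children: Seq[Expr] = Seq(str, pos, len)
  }
  case class Coalesce(args: Expr*) extends Expr { val children: Seq[Expr] = args }

  // Stand-in for the NullIntolerant marker trait: a null-intolerant node
  // evaluates to null whenever any of its children is null.
  def isNullIntolerant(e: Expr): Boolean = e match {
    case _: Substring => true
    case _            => false // Coalesce and plain columns
  }

  // True if `checked` being null forces `e` to be null, i.e. `e` is `checked`
  // itself, or `e` is null-intolerant and some child propagates the null.
  def nullPropagates(e: Expr, checked: Expr): Boolean =
    e == checked ||
      (isNullIntolerant(e) && e.children.exists(nullPropagates(_, checked)))

  val x: Expr = Col("x")
  // 1. substring(x, coalesce(a, b), c): x sits directly under substring.
  val example1: Expr = Substring(x, Coalesce(Col("a"), Col("b")), Col("c"))
  // 2. substring(coalesce(x, d), a, c): coalesce shields x.
  val example2: Expr = Substring(Coalesce(x, Col("d")), Col("a"), Col("c"))
}
```

With this sketch, `nullPropagates(example1, x)` holds while `nullPropagates(example2, x)` does not, which matches the two cases above: the `coalesce` sibling in 1. is irrelevant, whereas the `coalesce` wrapping `x` in 2. blocks the rewrite.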