dbtsai commented on a change in pull request #29567:
URL: https://github.com/apache/spark/pull/29567#discussion_r479863466
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -463,6 +463,8 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper {
     case If(Literal(null, _), _, falseValue) => falseValue
     case If(cond, trueValue, falseValue)
       if cond.deterministic && trueValue.semanticEquals(falseValue) => trueValue
+    case If(cond, l @ Literal(null, _), FalseLiteral) if !cond.nullable =>
+      And(cond, l)
+    case If(cond, l @ Literal(null, _), TrueLiteral) if !cond.nullable =>
+      Or(Not(cond), l)
Review comment:
@sunchao I think the concern @dongjoon-hyun has is that `cond` may be
non-deterministic or have a side effect (in Spark, all expressions that have
side effects are considered non-deterministic). If we naively simplify them,
we might change the final result.
For example, `If(cond, TrueLiteral, TrueLiteral)` can always be simplified
to `TrueLiteral`; however, this skips evaluating `cond`, which might have a
side effect such as counting the number of times `cond` is called.
In the first case, the `cond` in `And(cond, l)` will always be evaluated, and
in the second case, `Or(Not(cond), l)`, since `l` can never be `true`,
`Not(cond)` will always be evaluated.
As a result, maybe we don't need such a strong requirement that `cond` be
deterministic.
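As a sanity check on the semantics (not Spark code; the object name and helper functions below are hypothetical), here is a minimal sketch that models SQL three-valued logic with `Option[Boolean]` (`None` standing in for SQL `NULL`) and verifies that both rewrites agree with the original `If` whenever `cond` evaluates to a non-null value:

```scala
// Hypothetical sketch: check the two SimplifyConditionals rewrites under
// SQL three-valued logic, assuming cond is non-nullable (never NULL).
object ThreeValuedCheck {
  type SqlBool = Option[Boolean] // None models SQL NULL

  // SQL AND: false dominates, otherwise NULL propagates.
  def and(a: SqlBool, b: SqlBool): SqlBool = (a, b) match {
    case (Some(false), _) | (_, Some(false)) => Some(false)
    case (Some(true), Some(true))            => Some(true)
    case _                                   => None
  }

  // SQL OR: true dominates, otherwise NULL propagates.
  def or(a: SqlBool, b: SqlBool): SqlBool = (a, b) match {
    case (Some(true), _) | (_, Some(true)) => Some(true)
    case (Some(false), Some(false))        => Some(false)
    case _                                 => None
  }

  // SQL NOT: NULL stays NULL.
  def not(a: SqlBool): SqlBool = a.map(!_)

  def main(args: Array[String]): Unit = {
    for (cond <- Seq(true, false)) {
      val c: SqlBool = Some(cond) // cond is non-nullable by assumption
      // If(cond, null, false)  vs  And(cond, null)
      val ifFalseBranch: SqlBool = if (cond) None else Some(false)
      assert(ifFalseBranch == and(c, None))
      // If(cond, null, true)   vs  Or(Not(cond), null)
      val ifTrueBranch: SqlBool = if (cond) None else Some(true)
      assert(ifTrueBranch == or(not(c), None))
    }
    println("both rewrites agree with If for non-null cond")
  }
}
```

Note that both equivalences break if `cond` itself can be `NULL` (the original `If` would take the false branch, while `And`/`Or` would propagate `NULL`), which is exactly why the patch guards on `!cond.nullable`.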
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]