viirya commented on a change in pull request #29567:
URL: https://github.com/apache/spark/pull/29567#discussion_r479687170
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
##########
@@ -463,6 +463,8 @@ object SimplifyConditionals extends Rule[LogicalPlan] with
PredicateHelper {
case If(Literal(null, _), _, falseValue) => falseValue
case If(cond, trueValue, falseValue)
if cond.deterministic && trueValue.semanticEquals(falseValue) =>
trueValue
+ case If(p, l @ Literal(null, _), FalseLiteral) if !p.nullable => And(p,
l)
+ case If(p, l @ Literal(null, _), TrueLiteral) if !p.nullable =>
Or(Not(p), l)
Review comment:
Pushdown predicate is somehow different to general predicate. For
example, IIUC we won't push down `Or(Not(p), null)` because of the null
literal. Predicate pushdown only rewrites predicates applied to a field column,
e.g. col > 1.
Just for predicate pushdown, maybe we can transfer `Or(Not(p), null)` to
just `Not(p)`? Because if `p` is true, the predicate evaluates to null, then we
filter out the row. If `p` is false, the predicate value is true. So it
semantically equals to `Not(p)` for predicate used in pushdown.
I'm not sure if I miss anything or edge case, and this also doesn't work for
nested predicate. Besides, this is unlikely case too I think, we can consider
if it's worth the changing.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]