Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/19522#discussion_r145279067
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
---
@@ -176,6 +176,8 @@ object ReorderAssociativeOperator extends Rule[LogicalPlan] {
 object OptimizeIn extends Rule[LogicalPlan] {
   def apply(plan: LogicalPlan): LogicalPlan = plan transform {
     case q: LogicalPlan => q transformExpressionsDown {
+      case expr @ In(v, _) if expr.isListEmpty =>
+        If(IsNull(v), Literal.create(null, BooleanType), FalseLiteral)
--- End diff --
Based on the SQL standard, the original fix is wrong. More importantly, the
fix does not bring any noticeable performance improvement, because
`buildFilter` is only used for partition pruning. In the future, we might
enhance it for more advanced statistics-based filter inference. For example,
foldable expressions could be evaluated earlier, and this code change could
cause a regression.

Yes. Please open a new JIRA for the optimizer enhancement.
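
For reference, here is a minimal standalone sketch of what the added `In(v, _)` case in the diff evaluates to when the list is empty. It does not use Spark; `EmptyInSketch`, `SqlBool`, and `emptyInRewrite` are hypothetical names, and a SQL NULL is modeled as `None` purely for illustration.

```scala
// Hypothetical standalone model; none of these names are Spark APIs.
object EmptyInSketch {
  // A SQL boolean, with NULL modeled as None (an assumption of this sketch).
  type SqlBool = Option[Boolean]

  // Mirrors If(IsNull(v), Literal.create(null, BooleanType), FalseLiteral):
  // the result is NULL when v is NULL, and FALSE otherwise.
  def emptyInRewrite[A](v: Option[A]): SqlBool =
    if (v.isEmpty) None else Some(false)

  def main(args: Array[String]): Unit = {
    println(emptyInRewrite(Some(1)))           // prints Some(false)
    println(emptyInRewrite(Option.empty[Int])) // prints None
  }
}
```

The sketch only shows that the rewritten expression keeps a NULL input as NULL rather than folding the whole predicate to `false`.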
---