Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/21416#discussion_r190407851
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
---
@@ -219,7 +219,11 @@ object ReorderAssociativeOperator extends Rule[LogicalPlan] {
object OptimizeIn extends Rule[LogicalPlan] {
def apply(plan: LogicalPlan): LogicalPlan = plan transform {
case q: LogicalPlan => q transformExpressionsDown {
- case In(v, list) if list.isEmpty && !v.nullable => FalseLiteral
+ case In(v, list) if list.isEmpty =>
+ // When v is not nullable, the following expression will be optimized
+ // to FalseLiteral which is tested in OptimizeInSuite.scala
+ If(IsNotNull(v), FalseLiteral, Literal(null, BooleanType))
+ case In(v, list) if list.length == 1 => EqualTo(v, list.head)
--- End diff --
@dbtsai, rewriting a single-element `In` to `EqualTo` will have a side effect on
type casting, because the two expressions can evaluate differently. Please see
the following example. Could you add test cases of this kind?
```scala
scala> sql("select '1.1' in (1), '1.1' = 1").collect()
res0: Array[org.apache.spark.sql.Row] = Array([false,true])
```
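To make the mismatch concrete, here is a dependency-free toy model of one plausible reading of that output. It is not Spark code. The assumptions, which are not confirmed in this thread, are that `In` compares a string and an integer as strings, while `=` casts the string to the integer type and truncates it. `CoercionSketch`, `inAsString`, and `equalAsInt` are hypothetical names used only for illustration.

```scala
import scala.util.Try

// Toy model only: mimics the two ASSUMED coercion behaviours, not Spark's real rules.
object CoercionSketch {
  // Assumed `IN` behaviour: the integer list items are compared as strings.
  def inAsString(value: String, list: Seq[Int]): Boolean =
    list.exists(_.toString == value)

  // Assumed `=` behaviour: the string is cast to an integer (fraction truncated),
  // and an unparsable string never matches.
  def equalAsInt(value: String, other: Int): Boolean =
    Try(BigDecimal(value).toInt).toOption.contains(other)
}
```

Under these assumptions, `CoercionSketch.inAsString("1.1", Seq(1))` and `CoercionSketch.equalAsInt("1.1", 1)` disagree in the same way as the two columns above, which is why a blind `In` to `EqualTo` rewrite would need test cases with mixed operand types.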