[GitHub] spark pull request #21416: [SPARK-24371] [SQL] Added isinSet in DataFrame AP...

dongjoon-hyun Wed, 23 May 2018 14:41:12 -0700

GitHub user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21416#discussion_r190407851
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala ---
    @@ -219,7 +219,11 @@ object ReorderAssociativeOperator extends Rule[LogicalPlan] {
     object OptimizeIn extends Rule[LogicalPlan] {
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
         case q: LogicalPlan => q transformExpressionsDown {
    -      case In(v, list) if list.isEmpty && !v.nullable => FalseLiteral
    +      case In(v, list) if list.isEmpty =>
    +        // When v is not nullable, the following expression will be optimized
    +        // to FalseLiteral which is tested in OptimizeInSuite.scala
    +        If(IsNotNull(v), FalseLiteral, Literal(null, BooleanType))
    +      case In(v, list) if list.length == 1 => EqualTo(v, list.head)
    --- End diff --
    
    @dbtsai, rewriting a single-element `In` to `EqualTo` has side effects on
    type casting: the two expressions can return different results for the same
    operands, as in the following example. Could you add test cases of this kind?
    ```scala
    scala> sql("select '1.1' in (1), '1.1' = 1").collect()
    res0: Array[org.apache.spark.sql.Row] = Array([false,true])
    ```
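
    The difference can be reproduced outside Spark with a toy model. The
    sketch below is plain Scala, not Catalyst code, and its two coercion rules
    are assumptions chosen to match the output above: `In` is assumed to
    promote the integer to a string, while `=` is assumed to cast the string to
    an integer, truncating the fraction. The names `InVsEqualToSketch`,
    `inAsStrings`, and `equalToAsInts` are hypothetical.

    ```scala
    // Toy model only: not Spark code. Both coercion rules are assumptions.
    object InVsEqualToSketch {
      // Assumed rule for IN: promote each integer to a string, compare as strings.
      def inAsStrings(value: String, list: Seq[Int]): Boolean =
        list.exists(item => item.toString == value)

      // Assumed rule for '=': cast the string to an integer (truncating), compare as ints.
      def equalToAsInts(value: String, item: Int): Boolean =
        value.toDouble.toInt == item

      def main(args: Array[String]): Unit = {
        val viaIn = inAsStrings("1.1", Seq(1)) // "1" == "1.1" is false
        val viaEq = equalToAsInts("1.1", 1)    // 1 == 1 is true
        println(s"[$viaIn,$viaEq]")            // prints [false,true]
      }
    }
    ```

    Under these assumptions, replacing `In(v, list)` with
    `EqualTo(v, list.head)` is only safe when both sides already have the same
    type, which is the kind of case the requested tests would need to cover.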


