[GitHub] spark pull request #21416: [SPARK-24371] [SQL] Added isinSet in DataFrame AP...

dbtsai Wed, 23 May 2018 22:53:23 -0700

Github user dbtsai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21416#discussion_r190472138
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala ---
    @@ -219,7 +219,11 @@ object ReorderAssociativeOperator extends Rule[LogicalPlan] {
     object OptimizeIn extends Rule[LogicalPlan] {
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
         case q: LogicalPlan => q transformExpressionsDown {
    -      case In(v, list) if list.isEmpty && !v.nullable => FalseLiteral
    +      case In(v, list) if list.isEmpty =>
    +        // When v is not nullable, the following expression will be optimized
    +        // to FalseLiteral which is tested in OptimizeInSuite.scala
    +        If(IsNotNull(v), FalseLiteral, Literal(null, BooleanType))
    +      case In(v, list) if list.length == 1 => EqualTo(v, list.head)
    --- End diff --
    
    Why would this have any implication for type casting? With this PR, I seem to 
get the correct result:
    
    ```scala
    == Analyzed Logical Plan ==
    (CAST(1.1 AS STRING) IN (CAST(1 AS STRING))): boolean, (CAST(1.1 AS INT) = 1): boolean
    Project [cast(1.1 as string) IN (cast(1 as string)) AS (CAST(1.1 AS STRING) IN (CAST(1 AS STRING)))#484, (cast(1.1 as int) = 1) AS (CAST(1.1 AS INT) = 1)#485]
    +- OneRowRelation
    
    == Optimized Logical Plan ==
    Project [false AS (CAST(1.1 AS STRING) IN (CAST(1 AS STRING)))#484, true AS (CAST(1.1 AS INT) = 1)#485]
    +- OneRowRelation
    
    == Physical Plan ==
    *(1) Project [false AS (CAST(1.1 AS STRING) IN (CAST(1 AS STRING)))#484, true AS (CAST(1.1 AS INT) = 1)#485]
    +- Scan OneRowRelation[]
    ```
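
    For reference, a minimal sketch of the NULL semantics that the two rewrites 
in the diff rely on. This is a toy model, not Catalyst code: `None` stands for 
SQL NULL, and `InRewriteSketch`, `inList`, `inEmpty`, `inSingle` and `equalTo` 
are hypothetical names used only for illustration. It assumes `IN` yields NULL 
whenever the left-hand value is NULL, which is what the 
`If(IsNotNull(v), FalseLiteral, Literal(null, BooleanType))` branch encodes.

    ```scala
    // Toy model of SQL three-valued logic; None plays the role of NULL.
    // All names here are illustrative assumptions, not Spark APIs.
    object InRewriteSketch {
      type SqlBool = Option[Boolean]

      // SQL equality: NULL if either operand is NULL.
      def equalTo(v: Option[Int], x: Option[Int]): SqlBool =
        for (a <- v; b <- x) yield a == b

      // Assumed reference semantics of `v IN (list)`.
      def inList(v: Option[Int], list: Seq[Option[Int]]): SqlBool = v match {
        case None => None                          // NULL IN (...) is NULL
        case Some(a) =>
          if (list.contains(Some(a))) Some(true)   // a match was found
          else if (list.contains(None)) None       // no match, but a NULL element
          else Some(false)                         // no match, no NULLs
      }

      // Mirrors: In(v, list) if list.isEmpty => If(IsNotNull(v), false, null)
      def inEmpty(v: Option[Int]): SqlBool =
        if (v.isDefined) Some(false) else None

      // Mirrors: In(v, list) if list.length == 1 => EqualTo(v, list.head)
      def inSingle(v: Option[Int], x: Option[Int]): SqlBool = equalTo(v, x)
    }
    ```

    Under this model both rewrites agree with `inList` for every combination 
of NULL and non-NULL inputs, and none of them touches the operand types, so 
any casts are already fixed by the analyzer before the rule runs.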


