[GitHub] spark pull request #21442: [SPARK-24402] [SQL] Optimize `In` expression when...

dbtsai Tue, 29 May 2018 00:04:23 -0700

Github user dbtsai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21442#discussion_r191320828
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala ---
    @@ -219,7 +219,14 @@ object ReorderAssociativeOperator extends Rule[LogicalPlan] {
     object OptimizeIn extends Rule[LogicalPlan] {
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
         case q: LogicalPlan => q transformExpressionsDown {
    -      case In(v, list) if list.isEmpty && !v.nullable => FalseLiteral
    +      case In(v, list) if list.isEmpty =>
    +        // When v is not nullable, the following expression will be optimized
    +        // to FalseLiteral which is tested in OptimizeInSuite.scala
    +        If(IsNotNull(v), FalseLiteral, Literal(null, BooleanType))
    +      case In(v, Seq(elem @ Literal(_, _))) =>
    --- End diff --
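
To illustrate why the empty-list case in the diff above needs the `IsNotNull` check, here is a minimal stand-alone sketch of the three-valued result of `v IN ()`. It does not use Catalyst classes: `EmptyInSketch`, `SqlBool`, and `emptyIn` are hypothetical names, and `Option` is assumed here as a stand-in for a nullable SQL value.

```scala
// Hypothetical stand-alone model; these are not Spark's Catalyst classes.
object EmptyInSketch {
  // A SQL boolean: Some(true), Some(false), or None for NULL.
  type SqlBool = Option[Boolean]

  // Mirrors If(IsNotNull(v), FalseLiteral, Literal(null, BooleanType)):
  // a non-null v yields false, a null v yields NULL.
  def emptyIn[A](v: Option[A]): SqlBool =
    if (v.isDefined) Some(false) else None

  def main(args: Array[String]): Unit = {
    println(emptyIn(Some(1))) // prints Some(false)
    println(emptyIn(None))    // prints None
  }
}
```

When `v` can never be null, only the first branch is reachable, which is why the rewritten expression can be simplified further to a plain false literal.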
    
    As you suggested, I'll move this into `case expr @ In(v, list) if expr.inSetConvertible`.
    
    Yes, I saw the test failure. The same code passed the build in another PR of mine:
    https://github.com/apache/spark/pull/21416/commits/1332406d7f4ca7a9a4a85338f758430ecc334ff8
    I will debug it tomorrow.
    
    Thanks,


