[GitHub] spark pull request #21442: [SPARK-24402] [SQL] Optimize `In` expression when...

viirya Tue, 29 May 2018 01:21:45 -0700

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21442#discussion_r191338888
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala ---
    @@ -219,10 +219,15 @@ object ReorderAssociativeOperator extends Rule[LogicalPlan] {
     object OptimizeIn extends Rule[LogicalPlan] {
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
         case q: LogicalPlan => q transformExpressionsDown {
    -      case In(v, list) if list.isEmpty && !v.nullable => FalseLiteral
    +      case In(v, list) if list.isEmpty =>
    +        // When v is not nullable, the following expression will be optimized
    +        // to FalseLiteral which is tested in OptimizeInSuite.scala
    +        If(IsNotNull(v), FalseLiteral, Literal(null, BooleanType))
           case expr @ In(v, list) if expr.inSetConvertible =>
             val newList = ExpressionSet(list).toSeq
    -        if (newList.size > SQLConf.get.optimizerInSetConversionThreshold) {
    +        if (newList.length == 1) {
    --- End diff --
    
    When `list.length == 1`, we don't need to create the `ExpressionSet`, so the single-element check could happen before `newList` is built.
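
A minimal sketch of the ordering suggested above, written against toy stand-in types rather than Catalyst's real classes. `Expr`, `Lit`, `Attr`, `EqualTo`, and `OptimizeInSketch` are hypothetical names introduced only for illustration, and `distinct` stands in for `ExpressionSet(list).toSeq`. The diff excerpt ends before the body of the `newList.length == 1` branch, so the rewrite to an equality shown here is an assumption, not something taken from the diff.

```scala
// Toy stand-ins for Catalyst expressions; these are NOT Spark's classes.
sealed trait Expr
case class Lit(value: Int) extends Expr
case class Attr(name: String) extends Expr
case class In(value: Expr, list: Seq[Expr]) extends Expr
case class EqualTo(left: Expr, right: Expr) extends Expr

object OptimizeInSketch {
  def optimize(e: Expr): Expr = e match {
    // Exactly one element: rewrite directly, without building a deduplicated set.
    case In(v, Seq(single)) => EqualTo(v, single)

    // General case: deduplicate first (stand-in for ExpressionSet(list).toSeq).
    // Duplicates may collapse to a single element, so the length is checked again.
    // An empty list is left unchanged here; the diff handles it in a separate case.
    case In(v, list) =>
      val deduped = list.distinct
      if (deduped.length == 1) EqualTo(v, deduped.head)
      else In(v, deduped)

    case other => other
  }
}
```

With these toy types, `OptimizeInSketch.optimize(In(Attr("a"), Seq(Lit(1))))` takes the first branch and never deduplicates, while `In(Attr("a"), Seq(Lit(1), Lit(1)))` still reaches the equality rewrite through the second branch. Whether the real rule should check the original list, the deduplicated list, or both is left to the pull request discussion.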


