[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

mgaido91 Fri, 18 May 2018 15:38:24 -0700

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r189407673
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,14 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    // we can do this only if all the elements in the list are literals with the same datatype
    +    case i @ In(value, list)
    +        if i.inSetConvertible && list.map(_.dataType.asNullable).distinct.size == 1 =>
    --- End diff --
    
    Thanks for your comment @dongjoon-hyun, but I am not sure I agree with you.
    What if we have something like `in (array(null, 1), array(1, 2, 3), array(3, 2, 1))`?
    The first literal would be an array whose elements can be null, while the
    others would be arrays whose elements cannot, so in this case we would have
    2 distinct datatypes (because of nullability).
    Am I missing something? Thanks.
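
A minimal sketch of the nullability point above, using simplified stand-ins rather than Spark's real classes. `DataType`, `IntegerType`, `ArrayType` and `NullabilityDemo` here are hypothetical models written for illustration; the only assumption taken from the discussion is that an array literal containing `null` gets `containsNull = true` while the others get `containsNull = false`, and that `asNullable` widens every type to its nullable form.

```scala
// Simplified stand-ins for the type hierarchy (hypothetical, NOT Spark's classes).
sealed trait DataType {
  // Returns the same type with every nullability flag widened to "nullable".
  def asNullable: DataType
}

case object IntegerType extends DataType {
  def asNullable: DataType = this
}

final case class ArrayType(elementType: DataType, containsNull: Boolean) extends DataType {
  def asNullable: DataType = ArrayType(elementType.asNullable, containsNull = true)
}

object NullabilityDemo {
  // Assumed types of the literals in:
  //   in (array(null, 1), array(1, 2, 3), array(3, 2, 1))
  val literalTypes: Seq[DataType] = Seq(
    ArrayType(IntegerType, containsNull = true),  // array(null, 1)
    ArrayType(IntegerType, containsNull = false), // array(1, 2, 3)
    ArrayType(IntegerType, containsNull = false)  // array(3, 2, 1)
  )

  // Comparing the types as-is: nullability makes them differ.
  val distinctRaw: Int = literalTypes.distinct.size

  // Comparing after asNullable, as the guard in the diff does.
  val distinctNullable: Int = literalTypes.map(_.asNullable).distinct.size

  def main(args: Array[String]): Unit = {
    println(s"distinct types as-is: $distinctRaw")
    println(s"distinct types after asNullable: $distinctNullable")
  }
}
```

In this model the raw comparison sees two types and the `asNullable` comparison sees one, which is why a guard of the form `distinct.size == 1` would only accept such a list when nullability is normalized first.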


