Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r189413969
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,14 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    // we can do this only if all the elements in the list are literals with the same datatype
    +    case i @ In(value, list)
    +        if i.inSetConvertible && list.map(_.dataType.asNullable).distinct.size == 1 =>
    +      val literals = list.map(_.asInstanceOf[Literal])
    +      val ordering = TypeUtils.getInterpretedOrdering(literals.head.dataType)
    +      In(value, literals.sortBy(_.value)(ordering))
    --- End diff --
    
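    This doesn't work for complex-type literals such as `array`. The two queries
    below differ only in the order of the `IN` list, yet their canonicalized
    plans produce different semantic hashes. Please add a test case for complex
    types and handle them.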
    ```
    scala> sql("select * from t where array(1,2) in (array(1,2),array(2,1))").queryExecution.logical.canonicalized.semanticHash()
    res4: Int = -1398094385
    
    scala> sql("select * from t where array(1,2) in (array(2,1),array(1,2))").queryExecution.logical.canonicalized.semanticHash()
    res5: Int = -1233982198
    ```
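
    To make the expected property concrete, here is a minimal, Spark-free
    sketch of an order-insensitive `IN` list over array-like elements. It is
    only a model of the behavior the requested test would check, not the
    Catalyst implementation: `InListCanonicalization`, `lexicographic`, and
    `canonicalize` are hypothetical names, and plain `Seq[Int]` values stand in
    for array literals.

    ```scala
    // Standalone model, not Spark code: Seq[Int] stands in for an array literal.
    object InListCanonicalization {

      // Lexicographic ordering on integer sequences: compare element by element,
      // and if one sequence is a prefix of the other, the shorter one sorts first.
      val lexicographic: Ordering[Seq[Int]] = new Ordering[Seq[Int]] {
        def compare(a: Seq[Int], b: Seq[Int]): Int = {
          val firstDiff = a.zip(b).collectFirst {
            case (x, y) if x != y => Integer.compare(x, y)
          }
          firstDiff.getOrElse(Integer.compare(a.length, b.length))
        }
      }

      // Hypothetical helper: sort the IN list so that the order in which the
      // elements were written no longer affects equality or hashing.
      def canonicalize(list: Seq[Seq[Int]]): Seq[Seq[Int]] =
        list.sorted(lexicographic)

      def main(args: Array[String]): Unit = {
        val first = Seq(Seq(1, 2), Seq(2, 1))
        val second = Seq(Seq(2, 1), Seq(1, 2))

        // After canonicalization both lists are equal, so their hashes agree too.
        println(canonicalize(first) == canonicalize(second))
        println(canonicalize(first).hashCode == canonicalize(second).hashCode)
      }
    }
    ```

    A test for complex types could assert the same thing about the two queries
    above: after canonicalization, their `semanticHash()` values should match.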

