Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r190038976
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,10 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    case In(value, list) =>
    +      In(value, list.sortBy(_.semanticHash()))
    --- End diff --
    
    The only difference is that the elements in the list are canonicalized
    before they are hashed. I can't think of a meaningful example: the only one
    I can come up with is something like `mybool in (5 < 2)` versus
    `mybool in (not 5 >= 2)`. But in the future we may add more rules here that
    make using `semanticHash` meaningful, and it is logically what we want here
    anyway, IMHO.
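
    To make the `5 < 2` versus `not 5 >= 2` case concrete, here is a minimal,
    self-contained sketch. It does not use Spark's classes: `ToyCanonicalize`,
    `Expr`, `Lit` and `Attr` are hypothetical stand-ins, and only the
    `Not(GreaterThanOrEqual)` rule and the `In` ordering from the diff above are
    modelled.

```scala
// Toy model only: these are NOT the Catalyst classes, just stand-ins
// that mimic two rules from the diff (Not(>=) => < and sorting In lists).
object ToyCanonicalize {
  sealed trait Expr {
    // Canonicalize children first, then apply the local rewrite rules.
    def canonicalized: Expr = this match {
      case Not(GreaterThanOrEqual(l, r)) => LessThan(l.canonicalized, r.canonicalized)
      case Not(c)                        => Not(c.canonicalized)
      case LessThan(l, r)                => LessThan(l.canonicalized, r.canonicalized)
      case GreaterThanOrEqual(l, r)      => GreaterThanOrEqual(l.canonicalized, r.canonicalized)
      // Order the list so that element order does not affect equality.
      case In(v, list) =>
        In(v.canonicalized, list.map(_.canonicalized).sortBy(_.semanticHash()))
      case leaf => leaf
    }

    // Hash of the canonical form, so equivalent spellings hash alike.
    def semanticHash(): Int = canonicalized.hashCode()

    def semanticEquals(other: Expr): Boolean = canonicalized == other.canonicalized
  }

  final case class Lit(value: Int) extends Expr
  final case class Attr(name: String) extends Expr
  final case class Not(child: Expr) extends Expr
  final case class LessThan(left: Expr, right: Expr) extends Expr
  final case class GreaterThanOrEqual(left: Expr, right: Expr) extends Expr
  final case class In(value: Expr, list: Seq[Expr]) extends Expr
}
```

    With these definitions,
    `In(Attr("mybool"), Seq(LessThan(Lit(5), Lit(2))))` and
    `In(Attr("mybool"), Seq(Not(GreaterThanOrEqual(Lit(5), Lit(2)))))` are
    `semanticEquals`, because both list elements reduce to the same canonical
    form before they are hashed and sorted.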

