Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22038#discussion_r208833322
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
    @@ -486,9 +486,17 @@ object TypeCoercion {
             }
     
           case i @ In(a, b) if b.exists(_.dataType != a.dataType) =>
    -        findWiderCommonType(i.children.map(_.dataType)) match {
    -          case Some(finalDataType) => i.withNewChildren(i.children.map(Cast(_, finalDataType)))
    -          case None => i
    +        if (b.map(_.dataType).distinct.size == 1) {
    --- End diff --
    
    Hmm, this means that `1 in ('1', 1)` behaves differently from
    `'1' in (1, 1)`, which is not great IMHO. Can you please also check what
    the behavior is with nested subqueries? I think that having the same
    behavior between IN with literals and IN with subqueries is even more
    important than having the same behavior as binary comparisons.
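
    To make the asymmetry concrete, here is a minimal sketch that evaluates
    only the new guard from the diff, `b.map(_.dataType).distinct.size == 1`,
    for the two expressions above. The types below (`DataType`, `Literal`,
    `InGuardSketch`, `listHasSingleType`) are toy stand-ins invented for
    illustration, not Spark's real classes, and what each branch then does is
    not visible in the quoted hunk, so the sketch stops at the condition.

```scala
object InGuardSketch {
  // Toy stand-ins (assumption): just enough structure to evaluate the guard.
  sealed trait DataType
  case object IntegerType extends DataType
  case object StringType extends DataType

  final case class Literal(value: Any, dataType: DataType)

  // Same shape as the guard in the diff: it inspects only the list elements
  // and never looks at the type of the value on the left of IN.
  def listHasSingleType(list: Seq[Literal]): Boolean =
    list.map(_.dataType).distinct.size == 1

  val one: Literal = Literal(1, IntegerType)
  val oneStr: Literal = Literal("1", StringType)

  // 1 in ('1', 1): list types are {StringType, IntegerType}
  val mixedList: Boolean = listHasSingleType(Seq(oneStr, one))   // false

  // '1' in (1, 1): list types are {IntegerType}
  val uniformList: Boolean = listHasSingleType(Seq(one, one))    // true
}
```

    Both expressions compare a string with an integer, yet they fall on
    opposite sides of the guard, which is the inconsistency I am worried
    about.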

