Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/22038#discussion_r208833322
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
---
@@ -486,9 +486,17 @@ object TypeCoercion {
}
case i @ In(a, b) if b.exists(_.dataType != a.dataType) =>
- findWiderCommonType(i.children.map(_.dataType)) match {
- case Some(finalDataType) => i.withNewChildren(i.children.map(Cast(_, finalDataType)))
- case None => i
+ if (b.map(_.dataType).distinct.size == 1) {
--- End diff --
Hmm, this means that `1 in ('1', 1)` behaves differently from `'1' in (1, 1)`,
which is not great IMHO. Can you please also check what the behavior is with
nested subqueries? I think that having the same behavior between IN with
literals and IN with subqueries is even more important than having the same
behavior as binary comparisons.
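
To make the asymmetry concrete, here is a minimal sketch that models only the branch selection visible in the diff: the `b.exists(_.dataType != a.dataType)` guard and the new `b.map(_.dataType).distinct.size == 1` check. It is a toy model, not Spark code. `selectBranch`, `Branch` and its three cases are hypothetical names, and the two-member `DataType` only stands in for Catalyst's types. What each branch actually does to the expression is not shown in the quoted diff, so it is deliberately left out.

```scala
// Toy stand-ins for Catalyst data types (assumption: not Spark's real classes).
sealed trait DataType
case object IntegerType extends DataType
case object StringType extends DataType

// Hypothetical labels for the code paths implied by the diff.
sealed trait Branch
case object NoCoercionNeeded extends Branch   // guard on the `case` is false
case object UniformListBranch extends Branch  // new `if (... distinct.size == 1)` path
case object MixedListBranch extends Branch    // remaining path, body not shown in the diff

object InBranchDemo {
  // Mirrors only the two conditions quoted in the diff.
  def selectBranch(value: DataType, list: Seq[DataType]): Branch =
    if (!list.exists(_ != value)) NoCoercionNeeded
    else if (list.distinct.size == 1) UniformListBranch
    else MixedListBranch

  def main(args: Array[String]): Unit = {
    // 1 in ('1', 1): the list holds two distinct types
    println(selectBranch(IntegerType, Seq(StringType, IntegerType)))
    // '1' in (1, 1): the list holds a single type that differs from the value
    println(selectBranch(StringType, Seq(IntegerType, IntegerType)))
  }
}
```

Under these assumptions the two expressions land on different paths, even though both compare one string with integers. Any difference in how the two paths coerce types therefore shows up as a difference between the two queries.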
---