viirya commented on a change in pull request #34038:
URL: https://github.com/apache/spark/pull/34038#discussion_r714490029
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
##########
@@ -401,15 +401,30 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog {
               |the ${ordinalNumber(ti + 1)} table has ${child.output.length} columns
             """.stripMargin.replace("\n", " ").trim())
         }
+        val isUnion = operator.isInstanceOf[Union]
+        val dataTypesAreCompatibleFn = if (isUnion) {
+          // `TypeCoercion` takes care of type coercion already. If any columns or nested
+          // columns are not compatible, we detect it here and throw analysis exception.
+          val typeChecker = (dt1: DataType, dt2: DataType) => {
+            !TypeCoercion.findWiderTypeForTwo(dt1.asNullable, dt2.asNullable).isEmpty
Review comment:
Oh, I spent a little time recalling why I kept the original check logic.
It is because if `TypeCoercion` fails to find compatible types for any column, it won't add casts for any of them; the logic there is all-or-nothing. So if we only checked `dt1 == dt2` here, we would be comparing the original data types even when some of them are compatible.
`AnalysisErrorSuite` has one example: one relation has `short, string, double, decimal`, the other has `string, string, string, map`. The first three column pairs are compatible; only the fourth isn't, so `TypeCoercion` fails to add casts for all of them.
If we compared `dt1 == dt2`, the error would read like "short is not compatible with string", but currently we get "decimal is not compatible with map".
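If anyone wants to double-check this behavior, here is a minimal, self-contained sketch of the per-column check (the decimal precision/scale is chosen arbitrarily; the column types are the ones from the `AnalysisErrorSuite` example):

```scala
import org.apache.spark.sql.catalyst.analysis.TypeCoercion
import org.apache.spark.sql.types._

// Column types from the `AnalysisErrorSuite` example above
// (precision/scale of the decimal is arbitrary, for illustration only).
val left  = Seq(ShortType, StringType, DoubleType, DecimalType(10, 0))
val right = Seq(StringType, StringType, StringType, MapType(StringType, StringType))

left.zip(right).zipWithIndex.foreach { case ((dt1, dt2), i) =>
  // A pair is compatible iff `TypeCoercion` can find a wider common type for it.
  val compatible = TypeCoercion.findWiderTypeForTwo(dt1.asNullable, dt2.asNullable).isDefined
  println(s"column ${i + 1}: ${dt1.simpleString} vs ${dt2.simpleString} -> $compatible")
}
// Columns 1-3 are compatible (each pair widens to string); only column 4
// (decimal vs. map) is not, so the error should point at the fourth pair.
```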
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]