srowen commented on a change in pull request #23275: [SPARK-26323][SQL] Scala
UDF should still check input types even if some inputs are of type Any
URL: https://github.com/apache/spark/pull/23275#discussion_r245652117
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
##########
@@ -882,7 +882,18 @@ object TypeCoercion {
     case udf: ScalaUDF if udf.inputTypes.nonEmpty =>
       val children = udf.children.zip(udf.inputTypes).map { case (in, expected) =>
-        implicitCast(in, udfInputToCastType(in.dataType, expected)).getOrElse(in)
+        // Currently Scala UDF will only expect `AnyDataType` at top level, so this trick works.
+        // In the future we should create types like `AbstractArrayType`, so that Scala UDF can
+        // accept inputs of array type of arbitrary element type.
Review comment:
Yeah, I think we should make this change. For better or worse, we have several
UDFs like this in MLlib that try to handle things like `Double` and instance
types at the same time, and user code may well do the same. If we can keep some
functionality working in these cases, I think it's worth it.
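To illustrate the coercion logic the diff above touches, here is a minimal, self-contained sketch (not Spark's actual code; `DataType`, `implicitCast`, and `udfInputToCastType` are simplified stand-ins) of how a UDF with one `AnyDataType` slot can still have its other inputs cast to their expected types:

```scala
// Simplified model of the idea in SPARK-26323: when a Scala UDF declares
// some inputs as AnyDataType (e.g. a type parameter erased to Any), we
// still coerce the remaining inputs instead of skipping checks entirely.
object UdfCoercionSketch {
  sealed trait DataType
  case object IntegerType extends DataType
  case object DoubleType extends DataType
  case object AnyDataType extends DataType // "accepts anything" placeholder

  // For AnyDataType, keep the input's actual type (no cast needed);
  // otherwise cast toward the declared expected type.
  def udfInputToCastType(input: DataType, expected: DataType): DataType =
    if (expected == AnyDataType) input else expected

  // Toy implicitCast: only Int -> Double widening is allowed here.
  def implicitCast(in: DataType, expected: DataType): Option[DataType] =
    (in, expected) match {
      case (t, e) if t == e          => Some(t)
      case (IntegerType, DoubleType) => Some(DoubleType)
      case _                         => None
    }

  def coerce(inputs: Seq[DataType], expected: Seq[DataType]): Seq[DataType] =
    inputs.zip(expected).map { case (in, exp) =>
      implicitCast(in, udfInputToCastType(in, exp)).getOrElse(in)
    }

  def main(args: Array[String]): Unit = {
    val result = coerce(
      Seq(IntegerType, IntegerType), // actual input types
      Seq(AnyDataType, DoubleType))  // declared UDF input types
    // The Any slot is left as-is; the Double slot still widens its Int input.
    println(result) // List(IntegerType, DoubleType)
  }
}
```

The point of the trick: because `AnyDataType` only ever appears at the top level of a declared input type, mapping it back to the input's own type makes `implicitCast` a no-op for that slot while coercion proceeds normally for the rest.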
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]