sunchao commented on a change in pull request #32764:
URL: https://github.com/apache/spark/pull/32764#discussion_r645218632



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
##########
@@ -2169,12 +2169,29 @@ class Analyzer(override val catalogManager: CatalogManager)
                         unbound, arguments, unsupported)
                   }
 
+                  if (bound.inputTypes().length != arguments.length) {
+                    throw QueryCompilationErrors.v2FunctionInvalidInputTypeLengthError(
+                      bound, arguments)
+                  }
+
+                  val castedArguments = arguments.zip(bound.inputTypes()).map { case (arg, ty) =>
+                    if (arg.dataType != ty) {
+                      if (Cast.canCast(arg.dataType, ty)) {
+                        Cast(arg, ty)
+                      } else {
+                        throw QueryCompilationErrors.v2FunctionCastError(bound, arg, ty)
+                      }
+                    } else {
+                      arg
+                    }
+                  }

Review comment:
   > Since we have two places of type coercion, is there any possibility of bug where we do differently in some cases at both places?
   
   Hmm @dongjoon-hyun, what are the two places of type coercion? Do you mean the potential inconsistency between `bind` and `inputTypes`?
   
   > Thanks for explaining it. So that is said, bind can return a implementation with magic method, which takes decimal input, when Spark binds it with IntegerType input?
   
   Yes, correct. Spark will insert a cast from `int` to `decimal`.
   
   Also, from the Javadoc of `inputTypes`: "If the types returned differ from the types passed to `bind(StructType)`, Spark will cast input values to the required data types. This allows implementations to delegate input value casting to Spark."
   
   > Using above example, when Spark binds it with IntegerType, the UDF must know Spark can cast int to decimal, so it can return an implementation with magic method taking decimal input?
   
   Yes, the implementor of the UDF should know whether such a cast is valid. Spark also checks this during analysis and throws an error if it is not.
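   
   For reference, that analysis-time check is just `Cast.canCast` on the argument's type versus the declared input type, as in the diff above. A rough illustration (the concrete types here are only examples):
   
   ```scala
   import org.apache.spark.sql.catalyst.expressions.Cast
   import org.apache.spark.sql.types._
   
   // int -> decimal is castable, so the Analyzer wraps the argument in a Cast.
   assert(Cast.canCast(IntegerType, DecimalType(38, 18)))
   
   // array<int> -> decimal is not castable, so v2FunctionCastError is thrown instead.
   assert(!Cast.canCast(ArrayType(IntegerType), DecimalType(38, 18)))
   ```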
   



