This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push:
new da6f398 [SPARK-31010][SQL][DOC][FOLLOW-UP] Improve deprecated warning
message for untyped scala udf
da6f398 is described below
commit da6f398db2838aca6e0dc18866715a14b9b2aded
Author: yi.wu <[email protected]>
AuthorDate: Fri Apr 24 19:10:18 2020 +0900
[SPARK-31010][SQL][DOC][FOLLOW-UP] Improve deprecated warning message for
untyped scala udf
### What changes were proposed in this pull request?
Give users a friendlier warning message and migration guide for the
deprecated untyped Scala UDF.
### Why are the changes needed?
Users cannot easily distinguish the function signatures of typed and untyped
Scala UDFs. Instead, we should tell users directly what to do.
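For illustration, the migration the new message recommends could look like the
sketch below. It assumes a local Spark 3.0 session; the object name
`UntypedUdfMigration`, the helper `run`, and the column names are hypothetical
and not part of this patch.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

// Hypothetical example object; not part of the Spark code base.
object UntypedUdfMigration {
  def run(): Seq[Int] = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("untyped-udf-migration")
      .getOrCreate()
    import spark.implicits._

    try {
      // Before (untyped; not allowed by default in Spark 3.0 unless
      // spark.sql.legacy.allowUntypedScalaUDF is set to true):
      //   val plusOne = udf((x: Int) => x + 1, IntegerType)

      // After (typed): drop the return type parameter so that Spark
      // derives the input and return types from the closure itself.
      val plusOne = udf((x: Int) => x + 1)

      Seq(1, 2, 3).toDF("n")
        .select(plusOne($"n").as("n_plus_one"))
        .as[Int]
        .collect()
        .toSeq
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = println(run())
}
```

The only change at the call site is removing the `DataType` argument, which is
why the message now says so directly instead of quoting a typed signature.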
### Does this PR introduce any user-facing change?
No, it's newly added in Spark 3.0.
### How was this patch tested?
Pass Jenkins.
Closes #28311 from Ngone51/update_udf_doc.
Authored-by: yi.wu <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit 463c54419bf663615eb72e24b82e940feb85c68c)
Signed-off-by: HyukjinKwon <[email protected]>
---
docs/sql-migration-guide.md | 2 +-
sql/core/src/main/scala/org/apache/spark/sql/functions.scala | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 854c9ea..39619f6 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -75,7 +75,7 @@ license: |
- In Spark version 2.4 and below, you can create a map with duplicated keys
via built-in functions like `CreateMap`, `StringToMap`, etc. The behavior of
map with duplicated keys is undefined, for example, map look up respects the
duplicated key appears first, `Dataset.collect` only keeps the duplicated key
appears last, `MapKeys` returns duplicated keys, etc. In Spark 3.0, Spark
throws `RuntimeException` when duplicated keys are found. You can set
`spark.sql.mapKeyDedupPolicy` to `LAST [...]
- - In Spark 3.0, using `org.apache.spark.sql.functions.udf(AnyRef, DataType)`
is not allowed by default. Set `spark.sql.legacy.allowUntypedScalaUDF` to true
to keep using it. In Spark version 2.4 and below, if
`org.apache.spark.sql.functions.udf(AnyRef, DataType)` gets a Scala closure
with primitive-type argument, the returned UDF returns null if the input values
is null. However, in Spark 3.0, the UDF returns the default value of the Java
type if the input value is null. For example, ` [...]
+ - In Spark 3.0, using `org.apache.spark.sql.functions.udf(AnyRef, DataType)`
is not allowed by default. Remove the return type parameter to automatically
switch to typed Scala udf is recommended, or set
`spark.sql.legacy.allowUntypedScalaUDF` to true to keep using it. In Spark
version 2.4 and below, if `org.apache.spark.sql.functions.udf(AnyRef,
DataType)` gets a Scala closure with primitive-type argument, the returned UDF
returns null if the input values is null. However, in Spark 3.0 [...]
- In Spark 3.0, a higher-order function `exists` follows the three-valued
boolean logic, that is, if the `predicate` returns any `null`s and no `true` is
obtained, then `exists` returns `null` instead of `false`. For example,
`exists(array(1, null, 3), x -> x % 2 == 0)` is `null`. The previous
behaviorcan be restored by setting
`spark.sql.legacy.followThreeValuedLogicInArrayExists` to `false`.
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
index 782be98..9fd4718 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
@@ -4833,8 +4833,8 @@ object functions {
* @group udf_funcs
* @since 2.0.0
*/
- @deprecated("Untyped Scala UDF API is deprecated, please use typed Scala UDF
API such as " +
- "'def udf[RT: TypeTag](f: Function0[RT]): UserDefinedFunction' instead.",
"3.0.0")
+ @deprecated("Scala `udf` method with return type parameter is deprecated. " +
+ "Please use Scala `udf` method without return type parameter.", "3.0.0")
def udf(f: AnyRef, dataType: DataType): UserDefinedFunction = {
if (!SQLConf.get.getConf(SQLConf.LEGACY_ALLOW_UNTYPED_SCALA_UDF)) {
val errorMsg = "You're using untyped Scala UDF, which does not have the
input type " +
@@ -4842,7 +4842,7 @@ object functions {
"argument, and the closure will see the default value of the Java type
for the null " +
"argument, e.g. `udf((x: Int) => x, IntegerType)`, the result is 0 for
null input. " +
"To get rid of this error, you could:\n" +
- "1. use typed Scala UDF APIs, e.g. `udf((x: Int) => x)`\n" +
+ "1. use typed Scala UDF APIs(without return type parameter), e.g.
`udf((x: Int) => x)`\n" +
"2. use Java UDF APIs, e.g. `udf(new UDF1[String, Integer] { " +
"override def call(s: String): Integer = s.length() }, IntegerType)`,
" +
"if input types are all non primitive\n" +