Huaxin Gao created SPARK-28441:
----------------------------------

Summary: udf(max(udf(column))) throws java.lang.UnsupportedOperationException: Cannot evaluate expression: udf(null)
Key: SPARK-28441
URL: https://issues.apache.org/jira/browse/SPARK-28441
Project: Spark
Issue Type: Bug
Components: PySpark, SQL
Affects Versions: 3.0.0
Reporter: Huaxin Gao

I found this while working on https://issues.apache.org/jira/browse/SPARK-28277

{code:java}
>>> from pyspark.sql.functions import pandas_udf, PandasUDFType
>>> @pandas_udf("string", PandasUDFType.SCALAR)
... def noop(x):
...     return x.apply(str)
...
>>> spark.udf.register("udf", noop)
<function noop at 0x111b5f9d8>
>>> spark.sql("CREATE OR REPLACE TEMPORARY VIEW t1 as select * from values (\"one\", 1), (\"two\", 2),(\"three\", 3),(\"one\", NULL) as t1(k, v)")
DataFrame[]
>>> spark.sql("CREATE OR REPLACE TEMPORARY VIEW t2 as select * from values (\"one\", 1), (\"two\", 22),(\"one\", 5),(\"one\", NULL), (NULL, 5) as t2(k, v)")
DataFrame[]
>>> spark.sql("SELECT t1.k FROM t1 WHERE t1.v <= (SELECT udf(max(udf(t2.v))) FROM t2 WHERE udf(t2.k) = udf(t1.k))").show()
py4j.protocol.Py4JJavaError: An error occurred while calling o65.showString.
: java.lang.UnsupportedOperationException: Cannot evaluate expression: udf(null)
    at org.apache.spark.sql.catalyst.expressions.Unevaluable.eval(Expression.scala:296)
    at org.apache.spark.sql.catalyst.expressions.Unevaluable.eval$(Expression.scala:295)
    at org.apache.spark.sql.catalyst.expressions.PythonUDF.eval(PythonUDF.scala:52)
{code}
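
For readers unfamiliar with the trace: the top frames show eval being called directly on a PythonUDF, which inherits eval from Unevaluable, and that method raises instead of computing a value. The following is a toy model in plain Python, not Spark code. The class names are borrowed from the stack trace, but every method body, the try_eval helper, and the "null" rendering are assumptions made for illustration. It does not show which part of the planner triggers the call.

```python
# Toy model only: plain Python stand-ins, not Spark internals.

class UnsupportedOperationException(Exception):
    """Stand-in for java.lang.UnsupportedOperationException."""


class Expression:
    def eval(self, row=None):
        raise NotImplementedError


class Literal(Expression):
    """An expression that can be evaluated directly."""

    def __init__(self, value):
        self.value = value

    def eval(self, row=None):
        return self.value

    def __str__(self):
        # Render None as "null" to match the text in the error message.
        return "null" if self.value is None else str(self.value)


class Unevaluable(Expression):
    """An expression whose eval() always refuses, as in the trace."""

    def eval(self, row=None):
        raise UnsupportedOperationException(
            f"Cannot evaluate expression: {self}"
        )


class PythonUDF(Unevaluable):
    """Inherits the refusing eval(); it is meant to run elsewhere."""

    def __init__(self, name, child):
        self.name = name
        self.child = child

    def __str__(self):
        return f"{self.name}({self.child})"


def try_eval(expr):
    """Hypothetical helper: return the value, or the error text."""
    try:
        return expr.eval()
    except UnsupportedOperationException as exc:
        return str(exc)


message = try_eval(PythonUDF("udf", Literal(None)))
print(message)
```

In this model any code path that calls eval on a PythonUDF node fails the same way, whatever its child is, which matches the shape of the message in the report.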