GitHub user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22206#discussion_r212676265
--- Diff: python/pyspark/sql/tests.py ---
@@ -6394,6 +6394,17 @@ def test_invalid_args(self):
df.withColumn('mean_v', mean_udf(df['v']).over(ow))
+class DataSourceV2Tests(ReusedSQLTestCase):
+    def test_pyspark_udf_SPARK_25213(self):
+        from pyspark.sql.functions import udf
+
+        df = self.spark.read.format("org.apache.spark.sql.sources.v2.SimpleDataSourceV2").load()
+        result = df.withColumn('x', udf(lambda x: x, 'int')(df['i']))
--- End diff --
Agreed. I was just verifying that the fix worked before spending more time on it.