Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/19630#discussion_r151377556
--- Diff: python/pyspark/sql/functions.py ---
@@ -2271,15 +2169,42 @@ def pandas_udf(f=None, returnType=StringType()):
| 2| 1.1094003924504583|
+---+-------------------+
- .. note:: This type of udf cannot be used with functions such as `withColumn` or `select`
-     because it defines a `DataFrame` transformation rather than a `Column`
-     transformation.
-
.. seealso:: :meth:`pyspark.sql.GroupedData.apply`
.. note:: The user-defined function must be deterministic.
"""
- return _create_udf(f, returnType=returnType, pythonUdfType=PythonUdfType.PANDAS_UDF)
+ # decorator @pandas_udf(dataType(), functionType)
+ if f is None or isinstance(f, (str, DataType)):
--- End diff --
Just out of curiosity, when will `f` be None?
---
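A minimal sketch of the decorator-factory usage that presumably leaves `f` as None, based on the `pandas_udf(f=None, returnType=StringType())` signature in the hunk header above; the `plus_one` function and `DoubleType` return type below are illustrative, not taken from the PR:

```python
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import DoubleType

# Calling pandas_udf with keyword arguments only supplies no function,
# so inside pandas_udf, f is None and a decorator is returned.
@pandas_udf(returnType=DoubleType())
def plus_one(v):
    # v is a pandas.Series; the result is another Series of the same length
    return v + 1

# By contrast, @pandas_udf(DoubleType()) or @pandas_udf("double") passes the
# return type positionally, which is the isinstance(f, (str, DataType)) branch.
```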