Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/19630#discussion_r151677913
--- Diff: python/pyspark/sql/functions.py ---
@@ -2049,132 +2050,12 @@ def map_values(col):
# ---------------------------- User Defined Function ----------------------------------
-def _wrap_function(sc, func, returnType):
-    command = (func, returnType)
-    pickled_command, broadcast_vars, env, includes = _prepare_for_python_RDD(sc, command)
-    return sc._jvm.PythonFunction(bytearray(pickled_command), env, includes, sc.pythonExec,
-                                  sc.pythonVer, broadcast_vars, sc._javaAccumulator)
-
-
-class PythonUdfType(object):
-    # row-at-a-time UDFs
-    NORMAL_UDF = 0
-    # scalar vectorized UDFs
-    PANDAS_UDF = 1
-    # grouped vectorized UDFs
-    PANDAS_GROUPED_UDF = 2
-
-
-class UserDefinedFunction(object):
--- End diff --
Yup, I noticed that too when I first reviewed it, but then saw that he imported
this intentionally:
https://github.com/icexelloss/spark/blob/cf1d1caa4f41c6bcf565cfc5b9e9901d94f56af3/python/pyspark/sql/functions.py#L35
So I guess it could be fine. I just double-checked manually:
```python
>>> from pyspark.sql import functions
>>> functions.UserDefinedFunction
<class 'pyspark.sql.udf.UserDefinedFunction'>
>>> from pyspark import sql
>>> sql.functions.UserDefinedFunction
<class 'pyspark.sql.udf.UserDefinedFunction'>
>>> from pyspark.sql.functions import UserDefinedFunction
>>> from pyspark.sql.udf import UserDefinedFunction
```
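For anyone who wants to see why the old import path keeps working, here is a minimal sketch that needs no Spark install. The module names `demo_pkg.udf` and `demo_pkg.functions` are hypothetical stand-ins for `pyspark.sql.udf` and `pyspark.sql.functions`. It only illustrates the general Python re-export behavior, not the actual code in the PR:
```python
import types

# Hypothetical stand-in for pyspark.sql.udf: the module that now defines the class.
udf_mod = types.ModuleType("demo_pkg.udf")
exec("class UserDefinedFunction(object):\n    pass\n", udf_mod.__dict__)

# Hypothetical stand-in for pyspark.sql.functions: it only re-imports the name.
functions_mod = types.ModuleType("demo_pkg.functions")
functions_mod.UserDefinedFunction = udf_mod.UserDefinedFunction

# Both access paths resolve to the very same class object.
same_object = functions_mod.UserDefinedFunction is udf_mod.UserDefinedFunction

# The class still reports the module where it was defined, which matches
# the repr in the session above pointing at the udf module.
defining_module = functions_mod.UserDefinedFunction.__module__

print(same_object, defining_module)
```
So existing code that imports the class from the old location should get the same object, as long as the import stays in place.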