Github user icexelloss commented on a diff in the pull request:
https://github.com/apache/spark/pull/22620#discussion_r222456993
--- Diff: python/pyspark/sql/udf.py ---
@@ -310,9 +319,11 @@ def register(self, name, f, returnType=None):
"Invalid returnType: data type can not be specified when f is"
"a user-defined function, but got %s." % returnType)
if f.evalType not in [PythonEvalType.SQL_BATCHED_UDF,
- PythonEvalType.SQL_SCALAR_PANDAS_UDF]:
+ PythonEvalType.SQL_SCALAR_PANDAS_UDF,
+ PythonEvalType.SQL_GROUPED_AGG_PANDAS_UDF]:
--- End diff --
We don't need it here: users only ever specify GROUPED_AGG, and GROUPED_AGG
is converted to the WINDOW_AGG eval type in WindowInPandasExec.
Admittedly, this is a bit confusing and could be improved. Before WINDOW_AGG,
we simply never had a user-specified UDF type that maps to more than one
evalType.
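
To make the mapping concrete, here is a rough sketch in plain Python, not the actual Spark code. `EvalType`, `check_registerable`, and `eval_type_for_window` are hypothetical names used only for illustration; the enum members stand in for the `PythonEvalType` constants mentioned above, and the rewrite itself is assumed to happen on the execution side (WindowInPandasExec), not in `register`.

```python
from enum import Enum, auto


class EvalType(Enum):
    # Hypothetical stand-ins for the PythonEvalType constants named in the diff.
    SQL_BATCHED_UDF = auto()
    SQL_SCALAR_PANDAS_UDF = auto()
    SQL_GROUPED_AGG_PANDAS_UDF = auto()
    SQL_WINDOW_AGG_PANDAS_UDF = auto()


# Eval types a user-defined function may carry when it is registered,
# mirroring the list in the diff. WINDOW_AGG is deliberately absent:
# users never specify it themselves.
REGISTERABLE = {
    EvalType.SQL_BATCHED_UDF,
    EvalType.SQL_SCALAR_PANDAS_UDF,
    EvalType.SQL_GROUPED_AGG_PANDAS_UDF,
}


def check_registerable(eval_type):
    """Reject eval types that a user cannot register directly."""
    if eval_type not in REGISTERABLE:
        raise ValueError("Invalid eval type for register: %s" % eval_type.name)
    return eval_type


def eval_type_for_window(user_eval_type):
    """Illustrative model of the rewrite applied when a UDF runs over a window.

    A GROUPED_AGG UDF is evaluated as WINDOW_AGG; anything else is unchanged.
    """
    if user_eval_type is EvalType.SQL_GROUPED_AGG_PANDAS_UDF:
        return EvalType.SQL_WINDOW_AGG_PANDAS_UDF
    return user_eval_type
```

In this model, one user-facing type (GROUPED_AGG) corresponds to two eval types depending on where the UDF is used, which is why the registration check never needs to list WINDOW_AGG.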