HyukjinKwon commented on a change in pull request #28052:
[SPARK-31287][PYTHON][SQL] Ignore type hints in groupby.(cogroup.)applyInPandas
and mapInPandas
URL: https://github.com/apache/spark/pull/28052#discussion_r399745691
##########
File path: python/pyspark/sql/pandas/functions.py
##########
@@ -384,6 +384,14 @@ def _create_pandas_udf(f, returnType, evalType):
"In Python 3.6+ and Spark 3.0+, it is preferred to specify type hints for "
"pandas UDF instead of specifying pandas UDF type which will be deprecated "
"in the future releases. See SPARK-28264 for more details.",
UserWarning)
+ elif evalType in [PythonEvalType.SQL_GROUPED_MAP_PANDAS_UDF,
+ PythonEvalType.SQL_MAP_PANDAS_ITER_UDF,
+ PythonEvalType.SQL_COGROUPED_MAP_PANDAS_UDF]:
+ # In case of 'SQL_GROUPED_MAP_PANDAS_UDF', deprecation warning is being triggered
+ # at `apply` instead.
+ # In case of 'SQL_MAP_PANDAS_ITER_UDF' and 'SQL_COGROUPED_MAP_PANDAS_UDF', the
+ # evaluation type will always be set.
+ pass
elif len(argspec.annotations) > 0:
Review comment:
This raises a good point: there is a mismatch in how pandas UDFs use
Python type hints. We force users to set the type hints, even though type
hints are supposed to be completely optional ...
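
To make the mismatch concrete, here is a minimal sketch (not the Spark
implementation) of what a check like `len(argspec.annotations) > 0` observes.
It assumes the `argspec` comes from the standard library's
`inspect.getfullargspec`; the helper `has_type_hints` and the two sample
functions are hypothetical names used only for illustration.

```python
import inspect


def with_hints(pdf: "pandas.DataFrame") -> "pandas.DataFrame":
    # Annotated function. The string annotations are not evaluated,
    # so pandas does not need to be installed for this sketch.
    return pdf


def without_hints(pdf):
    # Same behavior, no annotations. Plain Python treats both alike.
    return pdf


def has_type_hints(f):
    """Hypothetical helper: True if any parameter or the return is annotated."""
    argspec = inspect.getfullargspec(f)
    return len(argspec.annotations) > 0


print(has_type_hints(with_hints))
print(has_type_hints(without_hints))
```

In plain Python the two functions are interchangeable, so any code path
that changes behavior based on this check treats an optional language
feature as if it were required.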