BryanCutler commented on a change in pull request #27165:
[SPARK-28264][PYTHON][SQL] Support type hints in pandas UDF and rename/move inconsistent pandas UDF types
URL: https://github.com/apache/spark/pull/27165#discussion_r368764880
##########
File path: python/pyspark/sql/pandas/group_ops.py
##########
@@ -114,12 +157,12 @@ def __init__(self, gd1, gd2):
self.sql_ctx = gd1.sql_ctx
@since(3.0)
- def apply(self, udf):
+ def applyInPandas(self, func, schema):
"""
- Applies a function to each cogroup using a pandas udf and returns the result
+ Applies a function to each cogroup using pandas and returns the result
as a `DataFrame`.
- The user-defined function should take two `pandas.DataFrame` and return another
+ The function should take two `pandas.DataFrame` and return another
Review comment:
Minor, but this should be plural: `pandas.DataFrame`s. I'm not sure of the correct
format, though. I see both `pandas.DataFrame`\\s and ``DataFrame``s below.
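
On the format question, here is a minimal sketch, assuming standard reStructuredText inline-markup rules: the closing backtick must be followed by whitespace or punctuation, so a plural "s" is normally attached with a backslash escape. In a regular (non-raw) Python docstring that backslash has to be escaped itself, which would explain the `\\s` spelling. The two function names below are hypothetical and exist only to compare the spellings; they are not part of this PR.

```python
def takes_two_plain(left, right):
    """Takes two `pandas.DataFrame`\\s and returns another."""
    # Regular docstring: "\\s" is stored as a single backslash followed by "s".


def takes_two_raw(left, right):
    r"""Takes two `pandas.DataFrame`\s and returns another."""
    # Raw docstring: "\s" is stored exactly as typed.


# Both docstrings carry the same reST source text.
print(takes_two_plain.__doc__ == takes_two_raw.__doc__)
print(takes_two_raw.__doc__)
```

If that assumption holds, the unescaped ``DataFrame``s form is the one likely to need changing, since the closing backticks are followed directly by a letter.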
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services