HyukjinKwon commented on a change in pull request #28160: [SPARK-30722][DOCS][FOLLOW-UP] Explicitly mention the same entire input/output length restriction of Series Iterator UDF
URL: https://github.com/apache/spark/pull/28160#discussion_r405973319
 
 

 ##########
 File path: docs/sql-pyspark-pandas-with-arrow.md
 ##########
 @@ -198,12 +201,14 @@ For detailed usage, please see [`pyspark.sql.functions.pandas_udf`](api/python/p
 
 ## Pandas Function APIs
 
-Pandas function APIs can directly apply a Python native function against the whole DataFrame by
-using Pandas instances. Internally it works similarly with Pandas UDFs by Spark using Arrow to transfer
-data and Pandas to work with the data, which allows vectorized operations. A Pandas function API behaves
-as a regular API under PySpark `DataFrame` in general.
+Pandas Function APIs can directly apply a Python native function against the whole `DataFrame` by
+using Pandas instances. Internally, it works similarly to Pandas UDFs by using Arrow to transfer
+data and Pandas to work with the data, which allows vectorized operations. However, a Pandas Function
+API behaves as a regular API under PySpark `DataFrame` instead of `Column`, and Python type hints in
+Pandas Function APIs are optional and do not affect how it works internally at this moment, although
+they might be required in the future.
 
 Review comment:
   I piggy-backed some doc changes here while I am here.
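
  For reference (not part of this diff), the paragraph above describes functions that take an iterator of pandas `DataFrame`s and yield an iterator of pandas `DataFrame`s, as used with APIs such as `DataFrame.mapInPandas`. A minimal pandas-only sketch of that shape, runnable without Spark (the function and data names here are illustrative, not from the docs):

  ```python
  from typing import Iterator
  import pandas as pd

  def filter_batches(batches: Iterator[pd.DataFrame]) -> Iterator[pd.DataFrame]:
      # Unlike a Series-to-Series Pandas UDF, a mapInPandas-style function is
      # not required to yield the same total number of rows it receives.
      for pdf in batches:
          yield pdf[pdf["id"] > 1]

  # Local check: feed an iterator of pandas DataFrames, as Spark would
  # per partition, and concatenate the yielded batches.
  chunks = [pd.DataFrame({"id": [0, 1]}), pd.DataFrame({"id": [2, 3]})]
  result = pd.concat(filter_batches(iter(chunks)), ignore_index=True)
  print(result["id"].tolist())  # [2, 3]
  ```

  With Spark this function would be passed to `df.mapInPandas(filter_batches, schema)`; the sketch only shows why the input/output length restriction discussed in this PR does not apply here.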

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
