Reynold Xin created SPARK-28264:
-----------------------------------
Summary: Revisiting Python / pandas UDF
Key: SPARK-28264
URL: https://issues.apache.org/jira/browse/SPARK-28264
Project: Spark
Issue Type: Improvement
Components: PySpark, SQL
Affects Versions: 3.0.0
Reporter: Reynold Xin
Assignee: Reynold Xin
Over the past two years, pandas UDFs have been perhaps the most important
change to Spark for Python data science. However, this functionality has
evolved organically, leading to some inconsistencies and confusion among
users. This document revisits UDF definition and naming, as a result of
discussions among Xiangrui, Li Jin, Hyukjin, and Reynold.
See document here:
https://docs.google.com/document/d/10Pkl-rqygGao2xQf6sddt0b-4FYK4g8qr_bXLKTL65A/edit
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)