[ 
https://issues.apache.org/jira/browse/SPARK-41661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xinrong Meng updated SPARK-41661:
---------------------------------
    Description: 
User-defined Functions in Python consist of (pickled) Python UDFs and 
(Arrow-optimized) Pandas UDFs. They enable users to run arbitrary Python code 
on top of the Apache Spark™ engine. Users only have to state "what to do"; 
PySpark, as a sandbox, encapsulates "how to do it".
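
As a rough illustration of the two UDF flavors named above, the following pure-Python sketch contrasts row-at-a-time evaluation (the style of pickled Python UDFs) with batch evaluation (the style of Pandas UDFs). It does not use PySpark: the helpers `apply_row_udf` and `apply_batch_udf` and the `batch_size` parameter are hypothetical stand-ins for the engine side, and real PySpark additionally handles serialization, worker processes, and Arrow data transfer.

```python
from typing import Callable, List


# User code: states "what to do" for a single value (row-at-a-time style).
def plus_one(x: int) -> int:
    return x + 1


# User code: states "what to do" for a whole batch of values (vectorized style).
def plus_one_batch(values: List[int]) -> List[int]:
    return [v + 1 for v in values]


# Hypothetical engine-side helper (not a PySpark API): decides "how to do it"
# by calling the user function once per row.
def apply_row_udf(func: Callable[[int], int], column: List[int]) -> List[int]:
    return [func(value) for value in column]


# Hypothetical engine-side helper (not a PySpark API): decides "how to do it"
# by slicing the column into batches and calling the user function per batch.
def apply_batch_udf(
    func: Callable[[List[int]], List[int]],
    column: List[int],
    batch_size: int = 2,
) -> List[int]:
    result: List[int] = []
    for start in range(0, len(column), batch_size):
        result.extend(func(column[start:start + batch_size]))
    return result


column = [1, 2, 3]
print(apply_row_udf(plus_one, column))          # one call per row
print(apply_batch_udf(plus_one_batch, column))  # one call per batch
```

Both calls return the same values; only the calling convention differs, which is the part the engine hides from the user.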

Spark Connect Python Client (SCPC), as a client and server interface for 
PySpark, will eventually replace the legacy API of PySpark. Supporting PySpark 
UDFs is essential for Spark Connect to reach parity with the PySpark legacy API.

See design doc 
[here|https://docs.google.com/document/d/e/2PACX-1vRXF8nTdjwH0LbYyp3b6Zt6STEKWsvfKSO7_s4foOB-3zJ2h4_06JF147hUPlADJxZ_X22RFxgZ-fRS/pub].

  was:
See design doc 
[here|https://docs.google.com/document/d/e/2PACX-1vRXF8nTdjwH0LbYyp3b6Zt6STEKWsvfKSO7_s4foOB-3zJ2h4_06JF147hUPlADJxZ_X22RFxgZ-fRS/pub].

User-defined Functions in Python consist of (pickled) Python UDFs and 
(Arrow-optimized) Pandas UDFs. They enable users to run arbitrary Python code 
on top of the Apache Spark™ engine. Users only have to state "what to do"; 
PySpark, as a sandbox, encapsulates "how to do it".

Spark Connect Python Client (SCPC), as a client and server interface for 
PySpark, will eventually replace the legacy API of PySpark. Supporting PySpark 
UDFs is essential for Spark Connect to reach parity with the PySpark legacy API.


> Support for User-defined Functions in Python
> --------------------------------------------
>
>                 Key: SPARK-41661
>                 URL: https://issues.apache.org/jira/browse/SPARK-41661
>             Project: Spark
>          Issue Type: Umbrella
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Martin Grund
>            Assignee: Xinrong Meng
>            Priority: Major
>
> User-defined Functions in Python consist of (pickled) Python UDFs and 
> (Arrow-optimized) Pandas UDFs. They enable users to run arbitrary Python code 
> on top of the Apache Spark™ engine. Users only have to state "what to do"; 
> PySpark, as a sandbox, encapsulates "how to do it".
>
> Spark Connect Python Client (SCPC), as a client and server interface for 
> PySpark, will eventually replace the legacy API of PySpark. Supporting PySpark 
> UDFs is essential for Spark Connect to reach parity with the PySpark legacy 
> API.
>
> See design doc 
> [here|https://docs.google.com/document/d/e/2PACX-1vRXF8nTdjwH0LbYyp3b6Zt6STEKWsvfKSO7_s4foOB-3zJ2h4_06JF147hUPlADJxZ_X22RFxgZ-fRS/pub].



