Haiyang Sun created SPARK-55278:
-----------------------------------

             Summary: Language-agnostic UDF Protocol for Spark
                 Key: SPARK-55278
                 URL: https://issues.apache.org/jira/browse/SPARK-55278
             Project: Spark
          Issue Type: Improvement
          Components: Connect, PySpark
    Affects Versions: 4.2
            Reporter: Haiyang Sun


Run user-provided code in Spark {*}consistently across many programming 
languages{*}.

Today, Spark Connect allows users to write queries from multiple languages, but 
support for user-defined functions is incomplete. In practice, only Python has 
a mature solution, and it relies on language-specific mechanisms that do not 
generalize to other languages such as 
[Go|https://github.com/apache/spark-connect-go] / 
[Rust|https://github.com/apache/spark-connect-rust] / 
[Swift|https://github.com/apache/spark-connect-swift] / 
[.NET|https://github.com/GoEddie/spark-connect-dotnet] (where UDFs are not 
supported).

Our objective is to define a *unified API and execution protocol* for 
user-defined functions that run outside the Spark engine process via 
inter-process communication (IPC). This allows Spark to interact with external 
workers in a consistent way, regardless of the language used to implement the 
function.
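
To make the idea concrete, here is a minimal sketch of what a language-neutral exchange between the engine and an external worker could look like. It is not the proposed protocol: the framing (a 4-byte big-endian length followed by a JSON payload), the message fields ("function", "rows", "results"), and the helper names (write_frame, read_frame, serve, demo) are all assumptions made up for illustration. In-memory byte streams stand in for the pipe or socket a real worker would use, and a real design would likely use a columnar format rather than JSON.

```python
import io
import json
import struct


def write_frame(stream, message):
    """Write one message as a 4-byte big-endian length plus a JSON payload.

    Hypothetical wire format, chosen only because any language can parse it.
    """
    payload = json.dumps(message).encode("utf-8")
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)


def read_frame(stream):
    """Read one frame; return None when the stream is exhausted."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack(">I", header)
    return json.loads(stream.read(length).decode("utf-8"))


def serve(inp, out, functions):
    """Worker loop: apply the named function to each row of every request.

    `functions` maps a function name to a callable; the request and response
    shapes are assumptions, not part of any Spark API.
    """
    while True:
        request = read_frame(inp)
        if request is None:
            break
        func = functions[request["function"]]
        results = [func(*row) for row in request["rows"]]
        write_frame(out, {"results": results})


def demo():
    """Simulate one engine-to-worker round trip over in-memory streams."""
    engine_to_worker = io.BytesIO()
    write_frame(engine_to_worker, {"function": "add_one", "rows": [[1], [2], [3]]})
    engine_to_worker.seek(0)

    worker_to_engine = io.BytesIO()
    serve(engine_to_worker, worker_to_engine, {"add_one": lambda x: x + 1})
    worker_to_engine.seek(0)
    return read_frame(worker_to_engine)


if __name__ == "__main__":
    print(demo())
```

Because the engine side only reads and writes frames, the same exchange would work unchanged with a worker written in Go, Rust, Swift, or .NET, which is the property the proposal is after.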



