+1 (non-binding)

Thank you for your persistence in working on this proposal and figuring out the details.

On 2026/03/18 09:32:40 Haiyang Sun via dev wrote:
> Hi Spark devs,
>
> I would like to call for *a new vote following the previous attempt* for the
> *SPIP: Language-Agnostic UDF Execution Protocol for Spark* after addressing
> comments and providing a supplementary design document for worker
> specification.
>
> The SPIP proposes a structured, language-agnostic framework for running
> user-defined functions (UDFs) in Spark across multiple programming languages
>
> Today, Spark Connect allows users to write queries from multiple languages,
> but support for user-defined functions remains incomplete. In practice,
> only Scala, Java, Python have working support, and this relies on
> language-specific mechanisms that do not generalize well to other languages
> such as Go <https://github.com/apache/spark-connect-go> / Rust
> <https://github.com/apache/spark-connect-rust> / Swift
> <https://github.com/apache/spark-connect-swift> / TypeScript
> <https://github.com/BaldrVivaldelli/ts-spark-connector> where UDF support
> is currently unavailable. In addition, there are legacy limitations in the
> existing PySpark worker implementation that make it difficult to evolve the
> system or extend it to new languages.
>
> The proposal introduces two related components:
>
> 1. *A unified UDF execution protocol*
>
>    The proposal defines a structured API and execution protocol for running
>    UDFs outside the Spark executor process and communicating with Spark via
>    inter-process communication (IPC). This protocol enables Spark to interact
>    with external UDF workers in a consistent and extensible way, regardless of
>    the implementation language.
>
> 2. *A worker specification for provisioning and lifecycle management.*
>
>    To support multi-language execution environments, the proposal also
>    introduces a worker specification describing how UDF workers can be
>    installed, started, connected to, and terminated. This document complements
>    the SPIP by outlining how workers can be provisioned and managed in a
>    consistent way.
>
> Note that this SPIP can help enable UDF support for languages that
> currently do not support UDFs. For languages that already have UDF
> implementations (especially Python), the goal is not to replace existing
> implementations immediately, but to provide a framework that may allow them
> to gradually evolve toward more language-agnostic abstractions over time.
>
> More details can be found in the SPIP document and the supplementary design
> for worker specification:
>
> SPIP:
> https://docs.google.com/document/d/19Whzq127QxVt2Luk0EClgaDtcpBsFUp67NcVdKKyPF8
>
> Worker specification design document:
> https://docs.google.com/document/d/1Dx9NqHRNuUpatH9DYoFF9cmvUl2fqHT4Rjbyw4EGLHs
>
> Discussion Thread:
> https://lists.apache.org/thread/9t4svsnd71j7sb4r4scf2xhh8dvp3b43
>
> Previous vote and discussion thread:
> https://lists.apache.org/thread/81xghrfwvopp274rgyxfthsstb2xmkz1
>
> *Please vote on adopting this proposal.*
>
> [ ] +1: Accept the proposal as an official SPIP
>
> [ ] +0: No opinion
>
> [ ] -1: Disapprove (please explain why)
>
> The vote will remain open for *at least 72 hours.*
>
> Thanks to everyone who participated in the discussion and provided valuable
> feedback!
>
> Best regards,
>
> Haiyang
