[VOTE] SPIP: Improving Spark SQL UDFs with Transpilation

Holden Karau Mon, 12 Jan 2026 12:24:08 -0800

Hi Folks,

Discussion on the SPIP Spark SQL UDF transpilation
<https://lists.apache.org/thread/xj8qfqvo5f9o188984mwh2kcg0fnqs9c> seems to
have settled, so I'm now bringing it for a vote. The normal requirement is
that an SPIP vote is open for at least 72 hours, but given that some are
just returning from the winter holidays, I plan to leave this vote open
until Sunday Jan 18th.


From the original discussion:

It's been a few years since we last looked at transpilation, and with the
growth of Pandas on Spark I think it's time we revisit it. I've got a JIRA
filed <https://issues.apache.org/jira/browse/SPARK-54783>, some rough proof
of concept code <https://github.com/apache/spark/pull/53547> (I think doing
the transpilation Python side instead of Scala side makes more sense, but
it was interesting to play with), and, of course, everyone's favourite: a
design doc
<https://docs.google.com/document/d/1cHc6tiR4yO3hppTzrK1F1w9RwyEPMvaeEuL2ub2LURg/edit?usp=sharing>.
(I also have a collection of YouTube streams playing with the idea
<https://www.youtube.com/@HoldenKarau/streams> if anyone wants to follow
along on that journey.)

Cheers,

Holden :)

-- 
Twitter: https://twitter.com/holdenkarau
Fight Health Insurance: https://www.fighthealthinsurance.com/
<https://www.fighthealthinsurance.com/?q=hk_email>
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
Pronouns: she/her
