[DISCUSS] SPIP: Improving Spark SQL UDFs with Transpilation

Hi Folks,

It's been a few years since we last looked at transpilation, and with the
growth of Pandas on Spark I think it's time we revisit it. I've got a JIRA
filed <https://issues.apache.org/jira/browse/SPARK-54783>, some rough proof
of concept code <https://github.com/apache/spark/pull/53547> (I think doing
the transpilation on the Python side instead of the Scala side makes more
sense, but it was interesting to play with), and of course everyone's
favourite, a design doc
<https://docs.google.com/document/d/1cHc6tiR4yO3hppTzrK1F1w9RwyEPMvaeEuL2ub2LURg/edit?usp=sharing>.
(I also have a collection of YouTube streams playing with the idea
<https://www.youtube.com/@HoldenKarau/streams> if anyone wants to follow
along on that journey.)


Wishing everyone happy holidays :)

Cheers,

Holden :)

-- 
Twitter: https://twitter.com/holdenkarau
Fight Health Insurance: https://www.fighthealthinsurance.com/
<https://www.fighthealthinsurance.com/?q=hk_email>
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
Pronouns: she/her
