Hi everyone,

I would like to start a discussion on the “Python Data Source API” SPIP.

This proposal introduces a simple Python API for data sources. It would
let Python developers create data sources without having to learn Scala
or deal with the complexities of the current data source APIs, making
Spark more accessible to the wider Python developer community. The
proposed approach builds on the recently introduced Python user-defined
table functions, with extensions to support data sources.
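
To give a rough sense of the shape such an API could take, here is a
minimal pure-Python sketch. The class and method names
(SimpleDataSource, RangeDataSource, schema, read) are hypothetical
illustrations and are not taken from the SPIP doc; the sketch does not
import Spark at all. It only shows the general idea of a source that
declares a schema and yields rows, much like a Python UDTF yields its
output rows.

```python
from typing import Iterator, Tuple


class SimpleDataSource:
    """Hypothetical base class for illustration; not the SPIP's API."""

    def schema(self) -> str:
        # A DDL-style schema string describing the rows this source yields.
        raise NotImplementedError

    def read(self) -> Iterator[Tuple]:
        # Yield one tuple per output row.
        raise NotImplementedError


class RangeDataSource(SimpleDataSource):
    """Toy source producing (id, squared) rows for ids in [start, end)."""

    def __init__(self, start: int, end: int) -> None:
        self.start = start
        self.end = end

    def schema(self) -> str:
        return "id INT, squared INT"

    def read(self) -> Iterator[Tuple[int, int]]:
        for i in range(self.start, self.end):
            yield (i, i * i)


# Collect the rows eagerly just to inspect them.
rows = list(RangeDataSource(0, 3).read())
```

The actual interfaces, including how a source would be registered and
used from Spark, are described in the SPIP doc linked below.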

*SPIP Doc*:
https://docs.google.com/document/d/1oYrCKEKHzznljYfJO4kx5K_Npcgt1Slyfph3NEk7JRU/edit?usp=sharing

*SPIP JIRA*: https://issues.apache.org/jira/browse/SPARK-44076

Looking forward to your feedback.

Thanks,
Allison
