Personally I'd love this, but I agree with some of the earlier comments that
this should not be Python specific (meaning I should be able to implement a
data source in Python and then make it usable across all languages Spark
supports). I think we should find a way to make this reusable beyond P
Thanks for your feedback, Martin.
However, if the primary intended purpose of this API is to provide an
interface for endpoint querying, then I find this proposal even less
convincing.
Neither the Spark execution model nor the data source API (full or
restricted as proposed here) are a good f
Hey,
I would like to express my strong support for Python Data Sources even
though they might not be immediately as powerful as Scala-based data
sources. One element that is easily lost in this discussion is how much
faster the iteration speed is with Python compared to Scala. Due to the
dynamic n
With such limited scope (both language availability and features) do we
have any representative examples of sources that could significantly
benefit from providing this API, compared to other available options, such
as batch imports, direct queries from vectorized UDFs or even
interfacing source
In an ideal world, every data source you want to connect to would already
have a Spark data source implementation (either v1 or v2), and this Python
API would be useless. But I feel it's common that people want to do quick data
exploration, and the target data system is not popular enough to have an
existing S
Similarly to Jacek, I feel it fails to document an actual community need
for such a feature.
Currently, any data source implementation has the potential to benefit
Spark users across all supported and third-party clients. For generally
available sources, this is advantageous for the whole Spar
This API looks like it starts from scratch and has no relationship with the
existing Java/Scala DataSourceV2 API. In particular, how can they support SQL?
We have been back and forth on the DataSource V2 design since 2.3, and I believe
there are some things to learn when introducing the Python DataSource A
Actually I support this idea in a way that Python developers don't have to
learn Scala to write their own source (and separate packaging).
This is especially crucial when you want to write a simple data source
that interacts with the Python ecosystem.
On Tue, 20 Jun 2023 at 03:08, Denny Lee
Slightly biased, but per my conversations - this would be awesome to have!
On Mon, Jun 19, 2023 at 09:43 Abdeali Kothari wrote:
> I would definitely use it - if it's available :)
>
> On Mon, 19 Jun 2023, 21:56 Jacek Laskowski, wrote:
>
>> Hi Allison and devs,
>>
>> Although I was against this i
I would definitely use it - if it's available :)
On Mon, 19 Jun 2023, 21:56 Jacek Laskowski, wrote:
> Hi Allison and devs,
>
> Although I was against this idea at first sight (probably because I'm a
> Scala dev), I think it could work as long as there are people who'd be
> interested in such an
Hi Allison and devs,
Although I was against this idea at first sight (probably because I'm a
Scala dev), I think it could work as long as there are people who'd be
interested in such an API. Were there any? I'm just curious. I've seen no
emails requesting it.
I also doubt that Python devs would l
Hi everyone,
I would like to start a discussion on “Python Data Source API”.
This proposal aims to introduce a simple API in Python for Data Sources.
The idea is to enable Python developers to create data sources without
having to learn Scala or deal with the complexities of the current data
sour