Interesting. Given that one of the features of Spark Connect is supposed to be
simpler migrations, we should (in my mind) only declare it stable once we’ve
gone through two releases where the previous client and its code can talk to
the new server.

Twitter: https://twitter.com/holdenkarau
Fight Health Insurance: https://www.fighthealthinsurance.com/
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
Pronouns: she/her


On Tue, Jan 21, 2025 at 12:31 PM Dongjoon Hyun <dongj...@apache.org> wrote:

> It seems that there is misinformation about the stability of Spark Connect
> in Spark 4. I would like to reduce the gap in our dev mailing list.
>
> Frequently, some people claim `Spark Connect` is stable because it uses
> Protobuf. Yes, we standardize the interface layer. However, may I ask if it
> implies its implementation's stability?
>
> Since Apache Spark is an open source community, you can see the stability
> of implementation in our public CI. In our CI, the PySpark Connect client
> has been technically broken most of the time.
>
> 1.
> https://github.com/apache/spark/actions/workflows/build_python_connect.yml
> (Spark Connect Python-only in master)
>
> In addition, the Spark 3.5 client seems to face another difficulty talking
> with Spark 4 server.
>
> 2.
> https://github.com/apache/spark/actions/workflows/build_python_connect35.yml
> (Spark Connect Python-only:master-server, 35-client)
>
> 3. What about the stability and feature parity across different
> languages? Do they work well with Apache Spark 4? I'm wondering if there is
> any way for the Apache Spark community to assess this?
>
> Given (1), (2), and (3), how can we make sure that `Spark Connect` is
> stable or ready in Spark 4? From my perspective, this is still actively
> under development with an open end.
>
> The bottom line is `Spark Connect` needs more community love in order to
> be claimed as Stable in Apache Spark 4. I'm looking forward to seeing the
> healthy Spark Connect CI in Spark 4. Until then, let's clarify what is
> stable in `Spark Connect` and what is not yet.
>
> Best Regards,
> Dongjoon.
>
> PS.
> This is a separate thread from the previous flakiness issues.
> https://lists.apache.org/thread/r5dzdr3w4ly0dr99k24mqvld06r4mzmq
> ([FYI] Known `Spark Connect` Test Suite Flakiness)
>
