To be clear, (1) is `PySpark 4.0 Client` + `Spark 4.0 Server`, which is more 
severe.

And your point matches (2) exactly. Thank you for your reply, Holden.

Dongjoon.

On 2025/01/21 22:38:20 Holden Karau wrote:
> Interesting. Given that one of the features of Spark Connect should be
> simpler migrations, we should (in my mind) only declare it stable once we’ve
> gone through two releases where the previous client and its code can talk to
> the new server.
> 
> 
> On Tue, Jan 21, 2025 at 12:31 PM Dongjoon Hyun <dongj...@apache.org> wrote:
> 
> > It seems that there is misinformation about the stability of Spark Connect
> > in Spark 4. I would like to reduce the gap in our dev mailing list.
> >
> > Frequently, some people claim `Spark Connect` is stable because it uses
> > Protobuf. Yes, we standardize the interface layer. However, may I ask if it
> > implies its implementation's stability?
> >
> > Since Apache Spark is an open source community, you can see the stability
> > of implementation in our public CI. In our CI, the PySpark Connect client
> > has been technically broken most of the time.
> >
> > 1.
> > https://github.com/apache/spark/actions/workflows/build_python_connect.yml
> > (Spark Connect Python-only in master)
> >
> > In addition, the Spark 3.5 client seems to face another difficulty talking
> > with Spark 4 server.
> >
> > 2.
> > https://github.com/apache/spark/actions/workflows/build_python_connect35.yml
> > (Spark Connect Python-only:master-server, 35-client)
> >
> > 3. What about stability and feature parity in different languages? Do
> > they work well with Apache Spark 4? Is there any evidence the Apache
> > Spark community can use to assess this?
> >
> > Given (1), (2), and (3), how can we make sure that `Spark Connect` is
> > stable or ready in Spark 4? From my perspective, this is still actively
> > under development with an open end.
> >
> > The bottom line is `Spark Connect` needs more community love in order to
> > be claimed as Stable in Apache Spark 4. I'm looking forward to seeing the
> > healthy Spark Connect CI in Spark 4. Until then, let's clarify what is
> > stable in `Spark Connect` and what is not yet.
> >
> > Best Regards,
> > Dongjoon.
> >
> > PS.
> > This is a separate thread from the previous flakiness issues.
> > https://lists.apache.org/thread/r5dzdr3w4ly0dr99k24mqvld06r4mzmq
> > ([FYI] Known `Spark Connect` Test Suite Flakiness)
> >
> 
