Thanks for the update and for looking into it. Excuse the thumb typos.
On Tue, 21 Jan 2025 at 4:09 PM, Hyukjin Kwon <gurwls...@apache.org> wrote:

> Just a quick note on that: the major reasons are 1. an OOM that we should
> figure out and fix in the CI environment, and 2. a structured streaming
> test failure in an area that is still in development.
> I made an umbrella JIRA (https://issues.apache.org/jira/browse/SPARK-50907),
> and I will work there. It should be easier to see what the actual issue
> was there.
>
> On Wed, 22 Jan 2025 at 09:04, Hyukjin Kwon <gurwls...@apache.org> wrote:
>
>> Let me take a look. It shouldn't be a major issue.
>>
>> On Wed, 22 Jan 2025 at 08:31, Mich Talebzadeh <mich.talebza...@gmail.com>
>> wrote:
>>
>>> As discussed on a thread over the weekend, we agreed among ourselves,
>>> including Matei, on a shift towards more stable and version-independent
>>> APIs. Spark Connect IMO is a key enabler of this shift, allowing users
>>> and developers to build applications and libraries that are more
>>> resilient to changes in Spark's internals, as opposed to RDDs. Moreover,
>>> maintaining backward compatibility for the existing RDD-based
>>> applications and libraries is crucial during this transition window, so
>>> the timeframe is another factor for consideration.
>>>
>>> HTH
>>>
>>> Mich Talebzadeh,
>>> Architect | Data Science | Financial Crime | Forensic Analysis | GDPR
>>>
>>> view my Linkedin profile
>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>
>>>
>>> On Tue, 21 Jan 2025 at 22:40, Holden Karau <holden.ka...@gmail.com>
>>> wrote:
>>>
>>>> Interesting. So given that one of the features of Spark Connect should
>>>> be simpler migrations, we should (in my mind) only declare it stable
>>>> once we’ve gone through two releases where the previous client + its
>>>> code can talk to the new server.
>>>>
>>>> Twitter: https://twitter.com/holdenkarau
>>>> Fight Health Insurance: https://www.fighthealthinsurance.com/
>>>> <https://www.fighthealthinsurance.com/?q=hk_email>
>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>> https://amzn.to/2MaRAG9 <https://amzn.to/2MaRAG9>
>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>> Pronouns: she/her
>>>>
>>>>
>>>> On Tue, Jan 21, 2025 at 12:31 PM Dongjoon Hyun <dongj...@apache.org>
>>>> wrote:
>>>>
>>>>> It seems that there is misinformation about the stability of Spark
>>>>> Connect in Spark 4. I would like to reduce the gap in our dev mailing
>>>>> list.
>>>>>
>>>>> Frequently, some people claim `Spark Connect` is stable because it
>>>>> uses Protobuf. Yes, we standardize the interface layer. However, may
>>>>> I ask if it implies its implementation's stability?
>>>>>
>>>>> Since Apache Spark is an open source community, you can see the
>>>>> stability of implementation in our public CI. In our CI, the PySpark
>>>>> Connect client has been technically broken most of the time.
>>>>>
>>>>> 1.
>>>>> https://github.com/apache/spark/actions/workflows/build_python_connect.yml
>>>>> (Spark Connect Python-only in master)
>>>>>
>>>>> In addition, the Spark 3.5 client seems to face another difficulty
>>>>> talking with Spark 4 server.
>>>>>
>>>>> 2.
>>>>> https://github.com/apache/spark/actions/workflows/build_python_connect35.yml
>>>>> (Spark Connect Python-only:master-server, 35-client)
>>>>>
>>>>> 3. What about the stability and the feature parities in different
>>>>> languages? Do they work well with Apache Spark 4? I'm wondering if
>>>>> there is any clue for the Apache Spark community to do assessment?
>>>>>
>>>>> Given (1), (2), and (3), how can we make sure that `Spark Connect` is
>>>>> stable or ready in Spark 4? From my perspective, this is still
>>>>> actively under development with an open end.
>>>>>
>>>>> The bottom line is `Spark Connect` needs more community love in order
>>>>> to be claimed as Stable in Apache Spark 4. I'm looking forward to seeing
>>>>> the healthy Spark Connect CI in Spark 4. Until then, let's clarify what
>>>>> is stable in `Spark Connect` and what is not yet.
>>>>>
>>>>> Best Regards,
>>>>> Dongjoon.
>>>>>
>>>>> PS.
>>>>> This is a separate thread from the previous flakiness issues.
>>>>> https://lists.apache.org/thread/r5dzdr3w4ly0dr99k24mqvld06r4mzmq
>>>>> ([FYI] Known `Spark Connect` Test Suite Flakiness)
>>>>>
>>>>