Thanks,

I read the PySpark pull request. I suggest the following wording for the
section "Why are the changes needed?":

As Spark Connect is becoming the default *API* in Spark 4.0, we need to add
Connect support for TWS in Python.
Why:

Saying "As Spark Connect is becoming *the default API* in Spark 4.0"
reflects more accurately that Spark Connect is an interface for interacting
with Spark, not a replacement for the entire system.

HTH


Dr Mich Talebzadeh,
Architect | Data Science | Financial Crime | Forensic Analysis | GDPR

   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>





On Tue, 4 Mar 2025 at 20:35, Jungtaek Lim <kabhwan.opensou...@gmail.com>
wrote:

> Hi,
>
> Here are the PRs we are seeking consensus on to get into 4.0.
>
> PySpark: https://github.com/apache/spark/pull/49560
> Scala: https://github.com/apache/spark/pull/49488
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
>
> On Tue, Mar 4, 2025 at 11:06 PM Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Thanks.
>>
>> Can you point to a link or any further documentation please?
>>
>> Dr Mich Talebzadeh,
>> Architect | Data Science | Financial Crime | Forensic Analysis | GDPR
>>
>>    view my Linkedin profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>>
>>
>>
>>
>> On Tue, 4 Mar 2025 at 13:22, Herman van Hovell
>> <her...@databricks.com.invalid> wrote:
>>
>>> +1
>>>
>>> On Tue, Mar 4, 2025 at 2:07 AM Anish Shrigondekar
>>> <anish.shrigonde...@databricks.com.invalid> wrote:
>>>
>>>> +1 - Would be great to get this into the Spark 4.0 release.
>>>>
>>>> Thanks,
>>>> Anish
>>>>
>>>> On Mon, Mar 3, 2025 at 9:35 PM Jungtaek Lim <
>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>
>>>>> Hi dev,
>>>>>
>>>>> We are going to introduce a new API named `transformWithState` for
>>>>> streaming queries, which allows users to perform more complex stateful
>>>>> operations in a user function, with much simpler code compared to
>>>>> `flatMapGroupsWithState` (and `applyInPandasWithState`).
>>>>>
>>>>> The target version has been Spark 4.0.0 and we track this project as a
>>>>> major one for Spark 4. We have pushed most planned features into
>>>>> Spark 4.0.0, except Spark Connect support.
>>>>>
>>>>> The PRs for Spark Connect support are merged into the Spark 4.1 branch,
>>>>> but I'd like to hear opinions on whether we can bring Spark Connect
>>>>> support into Spark 4.0.0.
>>>>>
>>>>> I understand this arrives a bit late, but since the API is backed by
>>>>> a huge effort and I foresee this new API replacing the usage of
>>>>> flatMapGroupsWithState and applyInPandasWithState soon, I'd like to make
>>>>> sure we don't make users wait another 6+ months to use this in
>>>>> Spark Connect.
>>>>>
>>>>> Would love to hear your thoughts.
>>>>>
>>>>> Thanks,
>>>>> Jungtaek Lim (HeartSaVioR)
>>>>>
>>>>
