Re: 4.0.0 RC1 is coming

Szehon Ho Fri, 21 Feb 2025 00:05:37 -0800

Hi

Sorry for the late reply. We identified another serious issue with the newly
added Call Procedure. Can we add it to the list?


SPARK-51273: Spark Connect Call Procedure runs the procedure twice
<https://issues.apache.org/jira/browse/SPARK-51273>.  I have a PR
<https://github.com/apache/spark/pull/50031> to fix it.

I know it's new functionality that Iceberg (and other V2 data sources) are
waiting for in Spark 4.0 to implement their Spark procedures, and it would
be great to fix it before the release.  Running a procedure twice can lead to
correctness issues.

Thanks
Szehon



On Sun, Feb 16, 2025 at 10:36 PM Jungtaek Lim <kabhwan.opensou...@gmail.com>
wrote:

> I'm working on SPARK-51187
> <https://issues.apache.org/jira/browse/SPARK-51187>, to gracefully rename
> the improper config we introduced in SPARK-49699
> <https://issues.apache.org/jira/browse/SPARK-49699>. Unfortunately, the
> config was released in Apache Spark 3.5.4, so we need a graceful migration
> path rather than a blind rename.
>
> Also, on my review radar, it'd be ideal to include SPARK-50655
> <https://issues.apache.org/jira/browse/SPARK-50655> in Apache Spark
> 4.0.0; otherwise we will need to deal with additional work for a storage
> format change.
>
> Thanks for driving the huge release!
>
>
>
>
> On Mon, Feb 17, 2025 at 2:05 PM Wenchen Fan <cloud0...@gmail.com> wrote:
>
>> Hi all,
>>
>> RC1 was scheduled for Feb 15, but I'll cut it on Feb 18 to have 3 working
>> days during the vote period, due to Feb 15 and 16 being the weekend, and
>> Feb 17 being a holiday in the US.
>>
>> The RC1 vote likely won't pass because of some ongoing work but I think
>> it's better to kick off the release process as scheduled.
>>
>> The ongoing work that I'm aware of:
>>
>>    - SPARK-38388, SPARK-51016: correctness issue caused by
>>    indeterministic query
>>    - SPARK-50992: OOM issue caused by AQE UI
>>    - SPARK-46057: SQL UDF. SQL table function is still WIP but we can
>>    probably re-target it for 4.1. cc @Allison Wang
>>    <allison.w...@databricks.com>
>>    - SPARK-48918: Unified Scala interface for Classic and Connect. A few
>>    sub-tasks are still open, do we need to complete them in 4.0? @Herman
>>    van Hövell tot Westerflier <herman.vanhov...@databricks.com> @Paddy Xu
>>    - SPARK-46815: Arbitrary state API v2. A few sub-tasks are still
>>    open, do we need to complete them in 4.0? @Anish Shrigondekar
>>    <anish.shrigonde...@databricks.com>
>>    - SPARK-24497: Recursive CTE. The performance issue is hard to fix,
>>    we will likely retarget it for 4.1.
>>    - from_json performance regression: we should either support CSE for
>>    Filter in whole stage codegen (PR
>>    <https://github.com/apache/spark/pull/49573>) or revert the codegen
>>    support of from_json.
>>
>> Please reply to this email if you have other ongoing work to add to this
>> list.
>>
>> Thanks,
>> Wenchen
>>
>>
