Hi All,

While testing the new syntax CREATE FUNCTION ... RETURNS TABLE, which was
introduced recently, I found that Spark fails with an internal error in
4.0.0-rc1, see https://issues.apache.org/jira/browse/SPARK-51289. I believe
that even if we don't fully support the new feature yet, Spark shouldn't
crash with an internal error when it is used, but should output a proper
error message instead.
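For reference, a statement of the following shape is enough to exercise the
new syntax (this is only an illustrative sketch based on the SQL UDF work in
SPARK-46057; the function name, column list, and body are made up here and
are not the exact reproduction from the JIRA):

    -- Hypothetical example only: a SQL table function returning the
    -- squares of the first n non-negative integers.
    CREATE FUNCTION squares(n INT)
    RETURNS TABLE (x INT, x_squared INT)
    RETURN SELECT id, id * id FROM range(n);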
-1 for RC1

Yours faithfully,
Maksim Gekk

On Fri, Feb 21, 2025 at 9:06 AM Szehon Ho <szehon.apa...@gmail.com> wrote:

> Hi
>
> Sorry for the late reply, we identified another serious issue with the
> newly added Call Procedure, can we add it to the list?
>
> SPARK-51273: Spark Connect Call Procedure runs the procedure twice
> <https://issues.apache.org/jira/browse/SPARK-51273>. I have a PR
> <https://github.com/apache/spark/pull/50031> to fix it.
>
> I know it's new functionality that Iceberg (and other V2 data sources)
> are waiting for in Spark 4.0 to implement their Spark procedures, and it
> would be great to fix it before the release. Running twice can lead to
> correctness issues.
>
> Thanks
> Szehon
>
> On Sun, Feb 16, 2025 at 10:36 PM Jungtaek Lim
> <kabhwan.opensou...@gmail.com> wrote:
>
>> I'm working on SPARK-51187
>> <https://issues.apache.org/jira/browse/SPARK-51187> to gracefully rename
>> the improper config we introduced in SPARK-49699
>> <https://issues.apache.org/jira/browse/SPARK-49699>. Unfortunately, the
>> config was already released in Apache Spark 3.5.4, hence we need a
>> graceful way to do this rather than blindly renaming it.
>>
>> Also, on my radar of reviews, it'd be ideal to include SPARK-50655
>> <https://issues.apache.org/jira/browse/SPARK-50655> in Apache Spark
>> 4.0.0, otherwise we will need to deal with additional work on a storage
>> format change.
>>
>> Thanks for driving the huge release!
>>
>> On Mon, Feb 17, 2025 at 2:05 PM Wenchen Fan <cloud0...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> RC1 was scheduled for Feb 15, but I'll cut it on Feb 18 so that there
>>> are 3 working days during the vote period, since Feb 15 and 16 are the
>>> weekend and Feb 17 is a holiday in the US.
>>>
>>> The RC1 vote likely won't pass because of some ongoing work, but I
>>> think it's better to kick off the release process as scheduled.
>>>
>>> The ongoing work that I'm aware of:
>>>
>>> - SPARK-38388, SPARK-51016: correctness issue caused by
>>>   non-deterministic queries
>>> - SPARK-50992: OOM issue caused by the AQE UI
>>> - SPARK-46057: SQL UDF. The SQL table function part is still WIP, but
>>>   we can probably re-target it for 4.1. cc @Allison Wang
>>>   <allison.w...@databricks.com>
>>> - SPARK-48918: unified Scala interface for Classic and Connect. A few
>>>   sub-tasks are still open, do we need to complete them in 4.0?
>>>   @Herman van Hövell tot Westerflier <herman.vanhov...@databricks.com>
>>>   @Paddy Xu
>>> - SPARK-46815: arbitrary state API v2. A few sub-tasks are still open,
>>>   do we need to complete them in 4.0? @Anish Shrigondekar
>>>   <anish.shrigonde...@databricks.com>
>>> - SPARK-24497: recursive CTE. The performance issue is hard to fix, so
>>>   we will likely retarget it for 4.1.
>>> - from_json performance regression: we should either support CSE for
>>>   Filter in whole-stage codegen (PR
>>>   <https://github.com/apache/spark/pull/49573>) or revert the codegen
>>>   support of from_json.
>>>
>>> Please reply to this email if you have other ongoing work to add to
>>> this list.
>>>
>>> Thanks,
>>> Wenchen