Hi All,

While testing the new syntax CREATE FUNCTION ... RETURNS TABLE, which was
introduced recently, I found that Spark fails with an internal error in
4.0.0-rc1, see https://issues.apache.org/jira/browse/SPARK-51289. I believe
that even if we don't fully support the new feature yet, Spark shouldn't
crash with an internal error when it is used, but should output a proper
error message instead.
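For reference, a statement of the following shape is enough to exercise the
new syntax (this is only an illustrative sketch based on the SQL UDF work in
SPARK-46057; the function name, column list, and body are made up here and
are not the exact reproduction from the JIRA):

    -- Hypothetical example only: a SQL table function returning the
    -- squares of the first n non-negative integers.
    CREATE FUNCTION squares(n INT)
    RETURNS TABLE (x INT, x_squared INT)
    RETURN SELECT id, id * id FROM range(n);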
-1 for RC1

Yours faithfully,
Maksim Gekk

On Fri, Feb 21, 2025 at 9:06 AM Szehon Ho <szehon.apa...@gmail.com> wrote:

> Hi
>
> Sorry for the late reply, we identified another serious issue with the
> newly added Call Procedure, can we add it to the list?
>
> SPARK-51273: Spark Connect Call Procedure runs the procedure twice
> <https://issues.apache.org/jira/browse/SPARK-51273>. I have a PR
> <https://github.com/apache/spark/pull/50031> to fix it.
>
> I know it's new functionality that Iceberg (and other V2 data sources)
> are waiting for in Spark 4.0 to implement their Spark procedures, and it
> would be great to fix it before the release. Running twice can lead to
> correctness issues.
>
> Thanks
> Szehon
>
> On Sun, Feb 16, 2025 at 10:36 PM Jungtaek Lim
> <kabhwan.opensou...@gmail.com> wrote:
>
>> I'm working on SPARK-51187
>> <https://issues.apache.org/jira/browse/SPARK-51187> to gracefully rename
>> the improper config we introduced in SPARK-49699
>> <https://issues.apache.org/jira/browse/SPARK-49699>. Unfortunately, the
>> config was already released in Apache Spark 3.5.4, hence we need a
>> graceful way to do this rather than blindly renaming it.
>>
>> Also, on my radar of reviews, it'd be ideal to include SPARK-50655
>> <https://issues.apache.org/jira/browse/SPARK-50655> in Apache Spark
>> 4.0.0, otherwise we will need to deal with additional work on a storage
>> format change.
>>
>> Thanks for driving the huge release!
>>
>> On Mon, Feb 17, 2025 at 2:05 PM Wenchen Fan <cloud0...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> RC1 was scheduled for Feb 15, but I'll cut it on Feb 18 so that there
>>> are 3 working days during the vote period, since Feb 15 and 16 are the
>>> weekend and Feb 17 is a holiday in the US.
>>>
>>> The RC1 vote likely won't pass because of some ongoing work, but I
>>> think it's better to kick off the release process as scheduled.
>>>
>>> The ongoing work that I'm aware of:
>>>
>>> - SPARK-38388, SPARK-51016: correctness issue caused by
>>>   non-deterministic queries
>>> - SPARK-50992: OOM issue caused by the AQE UI
>>> - SPARK-46057: SQL UDF. The SQL table function part is still WIP, but
>>>   we can probably re-target it for 4.1. cc @Allison Wang
>>>   <allison.w...@databricks.com>
>>> - SPARK-48918: unified Scala interface for Classic and Connect. A few
>>>   sub-tasks are still open, do we need to complete them in 4.0?
>>>   @Herman van Hövell tot Westerflier <herman.vanhov...@databricks.com>
>>>   @Paddy Xu
>>> - SPARK-46815: arbitrary state API v2. A few sub-tasks are still open,
>>>   do we need to complete them in 4.0? @Anish Shrigondekar
>>>   <anish.shrigonde...@databricks.com>
>>> - SPARK-24497: recursive CTE. The performance issue is hard to fix, so
>>>   we will likely retarget it for 4.1.
>>> - from_json performance regression: we should either support CSE for
>>>   Filter in whole-stage codegen (PR
>>>   <https://github.com/apache/spark/pull/49573>) or revert the codegen
>>>   support of from_json.
>>>
>>> Please reply to this email if you have other ongoing work to add to
>>> this list.
>>>
>>> Thanks,
>>> Wenchen