Re: [VOTE] Release Spark 4.1.0 (RC3)

DB Tsai Tue, 16 Dec 2025 16:34:50 -0800

+1 on fixing known issues to protect the quality of Apache Spark releases, 
rather than assuming “not a regression” means we should ship with known 
problems.


Longer term, once we move to a more frequent release cadence, if an issue comes 
from a new feature, we should default that feature off (or gate it) without 
blocking the release. But if it’s a day-zero bug that isn’t technically a 
regression, we should still address it before releasing.

> On Dec 17, 2025, at 3:19 AM, Mark Hamstra <[email protected]> wrote:
> 
> On a little higher level, not restricted to just this issue/PR, there
> is a distinct difference between "if there is no regression, then we
> can release without fixing the issue" and "if there is no regression,
> then we must release without fixing the issue". I don't believe that
> the latter has ever been established as agreed upon policy in the
> Spark project. I also don't believe that it is a good policy: there
> are issues worth taking the time to fix (or at least carefully
> discuss) even if they are not regressions.
> 
>> On Tue, Dec 16, 2025 at 5:54 AM Herman van Hovell via dev
>> <[email protected]> wrote:
>> 
>> Dongjoon,
>> 
>> I have a couple of problems with this course of action:
>> 
>> 1. You seem to be favoring speed over quality here. Even if my vote were
>>    erroneous, you should give me more than two hours to respond. This is a
>>    global community; not everyone is awake at the same time. As far as I
>>    know we try to follow a consensus-driven decision-making process here;
>>    this seems to be diametrically opposed to that.
>> 2. The problem itself is serious since it can cause driver crashes. In
>>    general I believe that we should not be in the business of shipping
>>    obviously broken things. The only thing you are doing now is increasing
>>    toil by forcing us to release a patch version almost immediately.
>> 3. The offending change was backported to a maintenance release. That is
>>    something different than it being a previously known problem.
>> 4. I am not sure I follow the PR argument. You merged my initial PR without
>>    even checking in with me. That PR fixed the issue; it just needed proper
>>    tests and some touch-ups (again, quality is important). I opened a
>>    follow-up that contains proper testing, and yes, this fails because of a
>>    change in error types. It happens; I will fix it. The statement that we
>>    don't have a fix is untrue, and the fact that you state otherwise makes
>>    me seriously doubt your judgement here. You could have asked me or
>>    someone else; you could have leaned in and checked it yourself.
>> 
>> I would like to understand why there is such a rush here.
>> 
>> Kind regards,
>> Herman
>> 
>>> On Tue, Dec 16, 2025 at 7:27 AM Dongjoon Hyun <[email protected]> wrote:
>>> 
>>> After rechecking, this vote passed.
>>> 
>>> I'll send a vote result email.
>>> 
>>> Dongjoon.
>>> 
>>> On 2025/12/16 11:03:39 Dongjoon Hyun wrote:
>>>> Hi, All.
>>>> 
>>>> I've been working with Herman's PRs so far.
>>>> 
>>>> As a kind of fact checking, I need to correct two things in the RC3 thread.
>>>> 
>>>> First, Herman claimed that he found a regression in Apache Spark 4.1.0, 
>>>> but that is not accurate because Apache Spark 4.0.1 has also included 
>>>> SPARK-53342 since 2025-09-06.
>>>> 
>>>> Second, although Herman shared a patch with us last Friday, he also 
>>>> opened another PR containing the main code change 9 hours ago. In addition, 
>>>> unfortunately, it has not passed our CI yet. It simply means that there 
>>>> is no complete patch yet in the community for either Apache Spark 4.1.0 or 
>>>> 4.0.2.
>>>> 
>>>> https://github.com/apache/spark/pull/53480
>>>> ([SPARK-54696][CONNECT] Clean-up Arrow Buffers - follow-up)
>>>> 
>>>> In short, he seems to have blocked RC3 by mistake. I'm re-checking the 
>>>> situation around the RC3 vote and `branch-4.1`.
>>>> 
>>>> Dongjoon.
>>>> 
>>>>>>> 
>>>>>>> On 2025/12/15 14:59:32 Herman van Hovell via dev wrote:
>>>>>>>> I pasted a non-existing link for the root cause. The actual link is 
>>>>>>>> here:
>>>>>>>> https://issues.apache.org/jira/browse/SPARK-53342
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Dec 15, 2025 at 10:47 AM Herman van Hovell <[email protected]> wrote:
>>>>>>>> 
>>>>>>>>> Hey Dongjoon,
>>>>>>>>> 
>>>>>>>>> Regarding your questions.
>>>>>>>>> 
>>>>>>>>>   1. If you define a large-ish local relation (which makes us cache it
>>>>>>>>>   on the server side) and keep using it, then we leak off-heap memory
>>>>>>>>>   every time it is being used. At some point the OS will OOM kill the
>>>>>>>>>   driver. While I have a repro, testing it like this in CI is not a
>>>>>>>>>   good idea. As an alternative I am working on a test that checks
>>>>>>>>>   buffer clean-up. For the record I don't appreciate the term `claim`
>>>>>>>>>   here; I am not blocking a release without genuine concern.
>>>>>>>>>   2. The root cause is
>>>>>>>>>   https://databricks.atlassian.net/browse/SPARK-53342 and not the
>>>>>>>>>   large local relations work.
>>>>>>>>>   3. A PR has been open since Friday:
>>>>>>>>>   https://github.com/apache/spark/pull/53452. I hope that I can get it
>>>>>>>>>   merged today.
>>>>>>>>>   4. I don't see a reason why.
>>>>>>>>> 
>>>>>>>>> Cheers,
>>>>>>>>> Herman
>>>>>>>>> 
>>>>>>>>> On Mon, Dec 15, 2025 at 5:47 AM Dongjoon Hyun <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>>> How can we verify the regression, Herman?
>>>>>>>>>> 
>>>>>>>>>> It's a little difficult for me to evaluate your claim so far due to the
>>>>>>>>>> lack of shared information. Specifically, there has been no update for
>>>>>>>>>> the last 3 days on "SPARK-54696 (Spark Connect LocalRelation support leak
>>>>>>>>>> off-heap memory)" after you created it.
>>>>>>>>>> 
>>>>>>>>>> Could you provide us more technical information about your Spark Connect
>>>>>>>>>> issue?
>>>>>>>>>> 
>>>>>>>>>> 1. How can we reproduce your claim? Do you have a test case?
>>>>>>>>>> 
>>>>>>>>>> 2. For the root cause, I'm wondering if you are saying literally
>>>>>>>>>> SPARK-53917 (Support large local relations) or another JIRA issue. Which
>>>>>>>>>> commit is the root cause?
>>>>>>>>>> 
>>>>>>>>>> 3. Since you have had SPARK-54696 assigned to yourself for the last 3
>>>>>>>>>> days, do you want to provide a PR soon?
>>>>>>>>>> 
>>>>>>>>>> 4. If you need more time, shall we simply revert the root cause from
>>>>>>>>>> Apache Spark 4.1.0 ?
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Dongjoon
>>>>>>>>>> 
>>>>>>>>>> On 2025/12/14 23:29:59 Herman van Hovell via dev wrote:
>>>>>>>>>>> Yes. It is a regression in Spark 4.1. The root cause is a change where we
>>>>>>>>>>> fail to clean up allocated (off-heap) buffers.
>>>>>>>>>>> 
>>>>>>>>>>> On Sun, Dec 14, 2025 at 4:25 AM Dongjoon Hyun <[email protected]> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hi, Herman.
>>>>>>>>>>>> 
>>>>>>>>>>>> Do you mean that this is a regression in Apache Spark 4.1.0?
>>>>>>>>>>>> 
>>>>>>>>>>>> If so, do you know what the root cause was?
>>>>>>>>>>>> 
>>>>>>>>>>>> Dongjoon.
>>>>>>>>>>>> 
>>>>>>>>>>>> On 2025/12/13 23:09:02 Herman van Hovell via dev wrote:
>>>>>>>>>>>>> -1. We need to get https://issues.apache.org/jira/browse/SPARK-54696
>>>>>>>>>>>>> fixed.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Sat, Dec 13, 2025 at 11:07 AM Jules Damji <[email protected]> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> +1 non-binding
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Dec 11, 2025, at 8:34 AM, [email protected] wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Please vote on releasing the following candidate as Apache Spark
>>>>>>>>>>>>>>> version 4.1.0.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The vote is open until Sun, 14 Dec 2025 09:34:31 PST and passes if a
>>>>>>>>>>>>>>> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> [ ] +1 Release this package as Apache Spark 4.1.0
>>>>>>>>>>>>>>> [ ] -1 Do not release this package because ...
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> To learn more about Apache Spark, please see https://spark.apache.org/
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The tag to be voted on is v4.1.0-rc3 (commit e221b56be7b):
>>>>>>>>>>>>>>> https://github.com/apache/spark/tree/v4.1.0-rc3
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The release files, including signatures, digests, etc. can be found at:
>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-bin/
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Signatures used for Spark RCs can be found in this file:
>>>>>>>>>>>>>>> https://downloads.apache.org/spark/KEYS
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The staging repository for this release can be found at:
>>>>>>>>>>>>>>> https://repository.apache.org/content/repositories/orgapachespark-1508/
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The documentation corresponding to this release can be found at:
>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-docs/
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The list of bug fixes going into 4.1.0 can be found at the following URL:
>>>>>>>>>>>>>>> https://issues.apache.org/jira/projects/SPARK/versions/12355581
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> FAQ
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> =========================
>>>>>>>>>>>>>>> How can I help test this release?
>>>>>>>>>>>>>>> =========================
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> If you are a Spark user, you can help us test this release by taking
>>>>>>>>>>>>>>> an existing Spark workload and running on this release candidate, then
>>>>>>>>>>>>>>> reporting any regressions.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> If you're working in PySpark you can set up a virtual env and install
>>>>>>>>>>>>>>> the current RC via "pip install
>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-bin/pyspark-4.1.0.tar.gz"
>>>>>>>>>>>>>>> and see if anything important breaks.
>>>>>>>>>>>>>>> In Java/Scala, you can add the staging repository to your project's
>>>>>>>>>>>>>>> resolvers and test with the RC (make sure to clean up the artifact cache
>>>>>>>>>>>>>>> before/after so you don't end up building with an out of date RC going
>>>>>>>>>>>>>>> forward).
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>>>>>>> To unsubscribe e-mail: [email protected]