Thank you, Holden. Technically, I recommend Apache Spark 4.1.0 for two reasons.
First, regarding the memory issue: it is not a regression in Apache Spark 4.1.0, because Herman pointed to SPARK-53342 as the root cause. Since SPARK-53342 was a data-loss issue in Spark Connect, its fix was backported to branch-4.1 on 2025-08-26 and released as part of Apache Spark 4.0.1 on 2025-09-06.

- SPARK-53342: CreateDataFrame will silently truncate data with multiple Arrow RecordBatches
- https://issues.apache.org/jira/browse/SPARK-53342
- https://github.com/apache/spark/pull/52090

There is an open backport PR to branch-3.5, but it has not passed CI on branch-3.5 yet. So only Apache Spark 4.1.0 and 4.0.1 have the fix for the data loss.

- https://github.com/apache/spark/pull/52131
  [SPARK-53342][SQL][3.5] Fix Arrow converter to handle multiple record batches in single IPC stream

Second, on top of that, `branch-4.1` contains additional security and correctness patches, which makes 4.1.0 better than 4.0.1.

In short, I believe 4.1.0 is the better choice from a security and correctness perspective among all existing live releases.

Thanks,
Dongjoon.

On 2025/12/17 01:22:05 Holden Karau wrote:
> We can also mitigate the harm with release notes stating that feature X has
> known issues and is default-off, and that we don't recommend enabling it
> (thinking about MERGE INTO specifically here). For the memory leak we can
> also document it in the release notes and say we don't recommend upgrading
> to this version for Spark Connect users?
>
> On Tue, Dec 16, 2025 at 5:14 PM Dongjoon Hyun <[email protected]> wrote:
>
> > Hi, Mark.
> >
> > Apache Spark 4.1.0 RC3 vote passed according to the ASF policy with the
> > majority rule.
> >
> > However, your comment is also a reasonable argument. I always welcome
> > high-level discussion about that independently (as we did on the mailing
> > list before). Let's use the ongoing thread like the following.
> >
> > [DISCUSS] SPIP: Accelerating Apache Spark Release Cadence
> > https://lists.apache.org/thread/31rx0xmwbhloljsowc9ksl6kk5vbb3r4
> >
> > To other people: the Apache Spark community has received 3296 patches so
> > far this year, roughly 9 commits per day. And we backport many bug fixes
> > to branch-4.1/4.0/3.5. That's why the ASF and the Apache Spark community
> > carefully pre-defined the release blockers, like the Jira issue priority
> > "Blocker" or the regression check.
> >
> > $ git log --oneline --since=2025-01-01 | wc -l
> > 3295
> >
> > I believe "Having a release is better than no release at all". Apache
> > Spark 4.1.1 is already underway on branch-4.1.
> >
> > Sincerely,
> > Dongjoon.
> >
> > On 2025/12/16 18:17:00 Mark Hamstra wrote:
> > > On a little higher level, not restricted to just this issue/PR, there
> > > is a distinct difference between "if there is no regression, then we
> > > can release without fixing the issue" and "if there is no regression,
> > > then we must release without fixing the issue". I don't believe that
> > > the latter has ever been established as agreed-upon policy in the
> > > Spark project. I also don't believe that it is a good policy: there
> > > are issues worth taking the time to fix (or at least carefully
> > > discuss) even if they are not regressions.
> > >
> > > On Tue, Dec 16, 2025 at 5:54 AM Herman van Hovell via dev
> > > <[email protected]> wrote:
> > > >
> > > > Dongjoon,
> > > >
> > > > I have a couple of problems with this course of action:
> > > >
> > > > You seem to be favoring speed over quality here. Even if my vote
> > > > were erroneous, you should give me more than two hours to respond.
> > > > This is a global community; not everyone is awake at the same time.
> > > > As far as I know we try to follow a consensus-driven decision-making
> > > > process here; this seems to be diametrically opposed to that.
> > > >
> > > > The problem itself is serious since it can cause driver crashes. In
> > > > general I believe that we should not be in the business of shipping
> > > > obviously broken things. The only thing you are doing now is
> > > > increasing toil by forcing us to release a patch version almost
> > > > immediately.
> > > >
> > > > The offending change was backported to a maintenance release. That
> > > > is something different than it being a previously known problem.
> > > >
> > > > I am not sure I follow the PR argument. You merged my initial PR
> > > > without even checking in with me. That PR fixed the issue; it just
> > > > needed proper tests and some touch-ups (again, quality is
> > > > important). I opened a follow-up that contains proper testing, and
> > > > yes, it fails because of a change in error types; it happens, I will
> > > > fix it. The statement that we don't have a fix is untrue, and the
> > > > fact that you state otherwise makes me seriously doubt your
> > > > judgement here. You could have asked me or someone else, or you
> > > > could have leaned in and checked it yourself.
> > > >
> > > > I would like to understand why there is such a rush here.
> > > >
> > > > Kind regards,
> > > > Herman
> > > >
> > > > On Tue, Dec 16, 2025 at 7:27 AM Dongjoon Hyun <[email protected]> wrote:
> > > > >
> > > > > After rechecking, this vote passed.
> > > > >
> > > > > I'll send a vote result email.
> > > > >
> > > > > Dongjoon.
> > > > >
> > > > > On 2025/12/16 11:03:39 Dongjoon Hyun wrote:
> > > > > > Hi, All.
> > > > > >
> > > > > > I've been working with Herman's PRs so far.
> > > > > >
> > > > > > As a kind of fact checking, I need to correct two things in the
> > > > > > RC3 thread.
> > > > > >
> > > > > > First, Herman claimed that he found a regression in Apache
> > > > > > Spark 4.1.0, but that is not accurate because Apache Spark
> > > > > > 4.0.1 has also had SPARK-53342 since 2025-09-06.
> > > > > >
> > > > > > Second, although Herman shared a patch with us last Friday, he
> > > > > > also made another PR containing the main code change 9 hours
> > > > > > ago. Unfortunately, it hasn't passed our CIs yet either. It
> > > > > > simply means that there is no complete patch yet in the
> > > > > > community for either Apache Spark 4.1.0 or 4.0.2.
> > > > > >
> > > > > > https://github.com/apache/spark/pull/53480
> > > > > > ([SPARK-54696][CONNECT] Clean-up Arrow Buffers - follow-up)
> > > > > >
> > > > > > In short, he seems to have blocked RC3 by mistake. I'm
> > > > > > re-checking the situation around the RC3 vote and `branch-4.1`.
> > > > > >
> > > > > > Dongjoon.
> > > > > >
> > > > > > On 2025/12/15 14:59:32 Herman van Hovell via dev wrote:
> > > > > > > I pasted a non-existing link for the root cause. The actual
> > > > > > > link is here:
> > > > > > > https://issues.apache.org/jira/browse/SPARK-53342
> > > > > > >
> > > > > > > On Mon, Dec 15, 2025 at 10:47 AM Herman van Hovell
> > > > > > > <[email protected]> wrote:
> > > > > > > >
> > > > > > > > Hey Dongjoon,
> > > > > > > >
> > > > > > > > Regarding your questions:
> > > > > > > >
> > > > > > > > 1. If you define a large-ish local relation (which makes us
> > > > > > > > cache it on the server side) and keep using it, we leak
> > > > > > > > off-heap memory every time it is used. At some point the OS
> > > > > > > > will OOM-kill the driver. While I have a repro, testing it
> > > > > > > > like this in CI is not a good idea. As an alternative I am
> > > > > > > > working on a test that checks buffer clean-up. For the
> > > > > > > > record, I don't appreciate the term `claim` here; I am not
> > > > > > > > blocking a release without genuine concern.
> > > > > > > > 2. The root cause is
> > > > > > > > https://databricks.atlassian.net/browse/SPARK-53342 and not
> > > > > > > > the large local relations work.
> > > > > > > > 3. A PR has been open since Friday:
> > > > > > > > https://github.com/apache/spark/pull/53452. I hope that I
> > > > > > > > can get it merged today.
> > > > > > > > 4. I don't see a reason why.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Herman
> > > > > > > >
> > > > > > > > On Mon, Dec 15, 2025 at 5:47 AM Dongjoon Hyun
> > > > > > > > <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > How can we verify the regression, Herman?
> > > > > > > > >
> > > > > > > > > It's a little difficult for me to evaluate your claim so
> > > > > > > > > far due to the lack of shared information. Specifically,
> > > > > > > > > there has been no update for the last 3 days on
> > > > > > > > > "SPARK-54696 (Spark Connect LocalRelation support leak
> > > > > > > > > off-heap memory)" since you created it.
> > > > > > > > >
> > > > > > > > > Could you provide us with more technical information
> > > > > > > > > about your Spark Connect issue?
> > > > > > > > >
> > > > > > > > > 1. How can we reproduce your claim? Do you have a test
> > > > > > > > > case?
> > > > > > > > > 2. For the root cause, I'm wondering if you mean
> > > > > > > > > literally SPARK-53917 (Support large local relations) or
> > > > > > > > > another JIRA issue. Which commit is the root cause?
> > > > > > > > > 3. Since SPARK-54696 has been assigned to you for the
> > > > > > > > > last 3 days, do you want to provide a PR soon?
> > > > > > > > > 4. If you need more time, shall we simply revert the root
> > > > > > > > > cause from Apache Spark 4.1.0?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Dongjoon
> > > > > > > > >
> > > > > > > > > On 2025/12/14 23:29:59 Herman van Hovell via dev wrote:
> > > > > > > > > > Yes. It is a regression in Spark 4.1. The root cause is
> > > > > > > > > > a change where we fail to clean up allocated (off-heap)
> > > > > > > > > > buffers.
> > > > > > > > > >
> > > > > > > > > > On Sun, Dec 14, 2025 at 4:25 AM Dongjoon Hyun
> > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi, Herman.
> > > > > > > > > > >
> > > > > > > > > > > Do you mean that is a regression in Apache Spark
> > > > > > > > > > > 4.1.0?
> > > > > > > > > > >
> > > > > > > > > > > If so, do you know what the root cause was?
> > > > > > > > > > >
> > > > > > > > > > > Dongjoon.
> > > > > > > > > > >
> > > > > > > > > > > On 2025/12/13 23:09:02 Herman van Hovell via dev
> > > > > > > > > > > wrote:
> > > > > > > > > > > > -1. We need to get
> > > > > > > > > > > > https://issues.apache.org/jira/browse/SPARK-54696
> > > > > > > > > > > > fixed.
> > > > > > > > > > > >
> > > > > > > > > > > > On Sat, Dec 13, 2025 at 11:07 AM Jules Damji
> > > > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > +1 non-binding
> > > > > > > > > > > > > —
> > > > > > > > > > > > > Sent from my iPhone
> > > > > > > > > > > > > Pardon the dumb thumb typos :)
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On Dec 11, 2025, at 8:34 AM, [email protected] wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Please vote on releasing the following
> > > > > > > > > > > > > > candidate as Apache Spark version 4.1.0.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The vote is open until Sun, 14 Dec 2025
> > > > > > > > > > > > > > 09:34:31 PST and passes if a majority of +1 PMC
> > > > > > > > > > > > > > votes are cast, with a minimum of 3 +1 votes.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > [ ] +1 Release this package as Apache Spark 4.1.0
> > > > > > > > > > > > > > [ ] -1 Do not release this package because ...
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > To learn more about Apache Spark, please see
> > > > > > > > > > > > > > https://spark.apache.org/
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The tag to be voted on is v4.1.0-rc3 (commit
> > > > > > > > > > > > > > e221b56be7b):
> > > > > > > > > > > > > > https://github.com/apache/spark/tree/v4.1.0-rc3
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The release files, including signatures,
> > > > > > > > > > > > > > digests, etc. can be found at:
> > > > > > > > > > > > > > https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-bin/
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Signatures used for Spark RCs can be found in
> > > > > > > > > > > > > > this file:
> > > > > > > > > > > > > > https://downloads.apache.org/spark/KEYS
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The staging repository for this release can be
> > > > > > > > > > > > > > found at:
> > > > > > > > > > > > > > https://repository.apache.org/content/repositories/orgapachespark-1508/
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The documentation corresponding to this release
> > > > > > > > > > > > > > can be found at:
> > > > > > > > > > > > > > https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-docs/
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The list of bug fixes going into 4.1.0 can be
> > > > > > > > > > > > > > found at the following URL:
> > > > > > > > > > > > > > https://issues.apache.org/jira/projects/SPARK/versions/12355581
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > FAQ
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > =========================
> > > > > > > > > > > > > > How can I help test this release?
> > > > > > > > > > > > > > =========================
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If you are a Spark user, you can help us test
> > > > > > > > > > > > > > this release by taking an existing Spark
> > > > > > > > > > > > > > workload and running it on this release
> > > > > > > > > > > > > > candidate, then reporting any regressions.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If you're working in PySpark you can set up a
> > > > > > > > > > > > > > virtual env and install the current RC via
> > > > > > > > > > > > > > "pip install
> > > > > > > > > > > > > > https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-bin/pyspark-4.1.0.tar.gz"
> > > > > > > > > > > > > > and see if anything important breaks.
> > > > > > > > > > > > > > In Java/Scala, you can add the staging
> > > > > > > > > > > > > > repository to your project's resolvers and test
> > > > > > > > > > > > > > with the RC (make sure to clean up the artifact
> > > > > > > > > > > > > > cache before/after so you don't end up building
> > > > > > > > > > > > > > with an out-of-date RC going forward).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > ---------------------------------------------------------------------
> > > > > > > > > > > > > > To unsubscribe e-mail: [email protected]
>
> --
> Twitter: https://twitter.com/holdenkarau
> Fight Health Insurance: https://www.fighthealthinsurance.com/
> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
> Pronouns: she/her

---------------------------------------------------------------------
To unsubscribe e-mail: [email protected]
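The SPARK-53342 failure mode discussed in this thread (an Arrow IPC reader that consumes only the first RecordBatch of a stream, silently dropping the rest) can be illustrated without Spark. The following is a toy sketch in plain Python; the function names are hypothetical and this is not Spark's actual converter code:

```python
# Toy model of the SPARK-53342 bug shape: a writer splits large input into
# multiple batches, and a reader that assumes "one stream = one batch"
# silently truncates the data. Hypothetical names, not Spark's code.

def make_batches(rows, max_batch_size):
    """Split rows into fixed-size batches, as an Arrow writer splits large inputs."""
    return [rows[i:i + max_batch_size] for i in range(0, len(rows), max_batch_size)]

def read_first_batch_only(batches):
    """Buggy reader: assumes the stream holds exactly one batch."""
    return list(batches[0])

def read_all_batches(batches):
    """Fixed reader: drains every batch in the stream before returning."""
    out = []
    for batch in batches:
        out.extend(batch)
    return out

rows = list(range(10))
batches = make_batches(rows, 4)                      # 3 batches: 4 + 4 + 2 rows
assert read_first_batch_only(batches) == [0, 1, 2, 3]  # silent data loss
assert read_all_batches(batches) == rows               # all rows preserved
```

Per the title of the backport PR above ("Fix Arrow converter to handle multiple record batches in single IPC stream"), the fix follows the second shape: consume every record batch in the stream rather than only the first.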
