Hi, Mark. The Apache Spark 4.1.0 RC3 vote passed according to the ASF policy, under the majority rule.
However, your comment is also a reasonable argument. I always welcome high-level discussion of that as an independent topic (as we did on the mailing list before). Let's use the ongoing thread below.

[DISCUSS] SPIP: Accelerating Apache Spark Release Cadence
https://lists.apache.org/thread/31rx0xmwbhloljsowc9ksl6kk5vbb3r4

For everyone else: the Apache Spark community has received 3296 patches so far this year, roughly 9 commits per day. We also backport many bug fixes to branch-4.1/4.0/3.5. That is why the ASF and the Apache Spark community carefully pre-defined the release blockers, such as the Jira issue priority "Blocker" or the regression check.

$ git log --oneline --since=2025-01-01 | wc -l
3295

I believe "Having a release is better than no release at all". Apache Spark 4.1.1 is already starting on branch-4.1.

Sincerely,
Dongjoon.

On 2025/12/16 18:17:00 Mark Hamstra wrote:
> On a little higher level, not restricted to just this issue/PR, there
> is a distinct difference between "if there is no regression, then we
> can release without fixing the issue" and "if there is no regression,
> then we must release without fixing the issue". I don't believe that
> the latter has ever been established as agreed upon policy in the
> Spark project. I also don't believe that it is a good policy: there
> are issues worth taking the time to fix (or at least carefully
> discuss) even if they are not regressions.
>
> On Tue, Dec 16, 2025 at 5:54 AM Herman van Hovell via dev
> <[email protected]> wrote:
> >
> > Dongjoon,
> >
> > I have a couple of problems with this course of action:
> >
> > You seem to be favoring speed over quality here. Even if my vote were
> > erroneous, you should give me more than two hours to respond. This is a
> > global community, not everyone is awake at the same time. As far as I know
> > we try to follow a consensus driven decision making process here; this
> > seems to be diametrically opposed to that.
> > The problem itself is serious since it can cause driver crashes.
> > In general I believe that we should not be in the business of shipping obviously
> > broken things. The only thing you are doing now is increase toil by forcing
> > us to release a patch version almost immediately.
> > The offending change was backported to a maintenance release. That is
> > something different than it being a previously known problem.
> > I am not sure I follow the PR argument. You merged my initial PR without
> > even checking in with me. That PR fixed the issue, it just needed proper
> > tests and some touch-ups (again quality is important). I open a follow-up
> > that contains proper testing, and yes this fails because of a change in
> > error types, it happens, I will fix it. The statement that we don't have a
> > fix is untrue, the fact that you state otherwise makes me seriously doubt
> > your judgement here. You could have asked me or someone else, you could
> > have leaned in and checked it yourself.
> >
> > I would like to understand why there is such a rush here.
> >
> > Kind regards,
> > Herman
> >
> > On Tue, Dec 16, 2025 at 7:27 AM Dongjoon Hyun <[email protected]> wrote:
> >>
> >> After rechecking, this vote passed.
> >>
> >> I'll send a vote result email.
> >>
> >> Dongjoon.
> >>
> >> On 2025/12/16 11:03:39 Dongjoon Hyun wrote:
> >> > Hi, All.
> >> >
> >> > I've been working with Herman's PRs so far.
> >> >
> >> > As a kind of fact checking, I need to correct two things in RC3 thread.
> >> >
> >> > First, Herman claimed that he found a regression of Apache Spark 4.1.0,
> >> > but actually it's not true because Apache Spark 4.0.1 also has
> >> > SPARK-53342 since 2025-09-06.
> >> >
> >> > Second, although Herman shared us a patch since last Friday, Herman also
> >> > made another PR containing the main code change 9 hours ago. In
> >> > addition, unfortunately, it also didn't pass our CIs yet. It simply
> >> > means that there is no complete patch yet in the community for both
> >> > Apache Spark 4.1.0 and 4.0.2.
> >> >
> >> > https://github.com/apache/spark/pull/53480
> >> > ([SPARK-54696][CONNECT] Clean-up Arrow Buffers - follow-up)
> >> >
> >> > In short, he seems to block RC3 as a mistake. I'm re-checking the
> >> > situation around RC3 vote and `branch-4.1` situation.
> >> >
> >> > Dongjoon.
> >> >
> >> > > > >
> >> > > > > On 2025/12/15 14:59:32 Herman van Hovell via dev wrote:
> >> > > > > > I pasted a non-existing link for the root cause. The actual link is here:
> >> > > > > > https://issues.apache.org/jira/browse/SPARK-53342
> >> > > > > >
> >> > > > > >
> >> > > > > > On Mon, Dec 15, 2025 at 10:47 AM Herman van Hovell <[email protected]> wrote:
> >> > > > > >
> >> > > > > > > Hey Dongjoon,
> >> > > > > > >
> >> > > > > > > Regarding your questions.
> >> > > > > > >
> >> > > > > > > 1. If you define a large-ish local relation (which makes us cache it
> >> > > > > > > on the serverside) and keep using it, then leak off-heap memory every time
> >> > > > > > > it is being used. At some point the OS will OOM kill the driver. While I
> >> > > > > > > have a repro, testing it like this in CI is not a good idea. As an
> >> > > > > > > alternative I am working on a test that checks buffer clean-up. For the
> >> > > > > > > record I don't appreciate the term `claim` here; I am not blocking a
> >> > > > > > > release without genuine concern.
> >> > > > > > > 2. The root cause is
> >> > > > > > > https://databricks.atlassian.net/browse/SPARK-53342 and not the large
> >> > > > > > > local relations work.
> >> > > > > > > 3. A PR has been open since Friday:
> >> > > > > > > https://github.com/apache/spark/pull/53452. I hope that I can get it
> >> > > > > > > merged today.
> >> > > > > > > 4. I don't see a reason why.
> >> > > > > > >
> >> > > > > > > Cheers,
> >> > > > > > > Herman
> >> > > > > > >
> >> > > > > > > On Mon, Dec 15, 2025 at 5:47 AM Dongjoon Hyun <[email protected]> wrote:
> >> > > > > > >
> >> > > > > > >> How can we verify the regression, Herman?
> >> > > > > > >>
> >> > > > > > >> It's a little difficult for me to evaluate your claim so far due to the
> >> > > > > > >> lack of the shared information. Specifically, there is no update for last 3
> >> > > > > > >> days on "SPARK-54696 (Spark Connect LocalRelation support leak off-heap
> >> > > > > > >> memory)" after you created it.
> >> > > > > > >>
> >> > > > > > >> Could you provide us more technical information about your Spark Connect
> >> > > > > > >> issue?
> >> > > > > > >>
> >> > > > > > >> 1. How can we reproduce your claim? Do you have a test case?
> >> > > > > > >>
> >> > > > > > >> 2. For the root cause, I'm wondering if you are saying literally
> >> > > > > > >> SPARK-53917 (Support large local relations) or another JIRA issue. Which
> >> > > > > > >> commit is the root cause?
> >> > > > > > >>
> >> > > > > > >> 3. Since you assigned SPARK-54696 to yourself for last 3 days, do you
> >> > > > > > >> want to provide a PR soon?
> >> > > > > > >>
> >> > > > > > >> 4. If you need more time, shall we simply revert the root cause from
> >> > > > > > >> Apache Spark 4.1.0 ?
> >> > > > > > >>
> >> > > > > > >> Thanks,
> >> > > > > > >> Dongjoon
> >> > > > > > >>
> >> > > > > > >> On 2025/12/14 23:29:59 Herman van Hovell via dev wrote:
> >> > > > > > >> > Yes. It is a regression in Spark 4.1. The root cause is a change where we
> >> > > > > > >> > fail to clean-up allocated (off-heap) buffers.
> >> > > > > > >> >
> >> > > > > > >> > On Sun, Dec 14, 2025 at 4:25 AM Dongjoon Hyun <[email protected]> wrote:
> >> > > > > > >> >
> >> > > > > > >> > > Hi, Herman.
> >> > > > > > >> > >
> >> > > > > > >> > > Do you mean that is a regression at Apache Spark 4.1.0?
> >> > > > > > >> > >
> >> > > > > > >> > > If then, do you know what was the root cause?
> >> > > > > > >> > >
> >> > > > > > >> > > Dongjoon.
> >> > > > > > >> > >
> >> > > > > > >> > > On 2025/12/13 23:09:02 Herman van Hovell via dev wrote:
> >> > > > > > >> > > > -1. We need to get https://issues.apache.org/jira/browse/SPARK-54696 fixed.
> >> > > > > > >> > > >
> >> > > > > > >> > > > On Sat, Dec 13, 2025 at 11:07 AM Jules Damji <[email protected]> wrote:
> >> > > > > > >> > > >
> >> > > > > > >> > > > > +1 non-binding
> >> > > > > > >> > > > > —
> >> > > > > > >> > > > > Sent from my iPhone
> >> > > > > > >> > > > > Pardon the dumb thumb typos :)
> >> > > > > > >> > > > >
> >> > > > > > >> > > > > > On Dec 11, 2025, at 8:34 AM, [email protected] wrote:
> >> > > > > > >> > > > > >
> >> > > > > > >> > > > > > Please vote on releasing the following candidate as Apache Spark version 4.1.0.
> >> > > > > > >> > > > > >
> >> > > > > > >> > > > > > The vote is open until Sun, 14 Dec 2025 09:34:31 PST and passes if a majority +1 PMC votes are cast, with
> >> > > > > > >> > > > > > a minimum of 3 +1 votes.
> >> > > > > > >> > > > > >
> >> > > > > > >> > > > > > [ ] +1 Release this package as Apache Spark 4.1.0
> >> > > > > > >> > > > > > [ ] -1 Do not release this package because ...
> >> > > > > > >> > > > > >
> >> > > > > > >> > > > > > To learn more about Apache Spark, please see https://spark.apache.org/
> >> > > > > > >> > > > > >
> >> > > > > > >> > > > > > The tag to be voted on is v4.1.0-rc3 (commit e221b56be7b):
> >> > > > > > >> > > > > > https://github.com/apache/spark/tree/v4.1.0-rc3
> >> > > > > > >> > > > > >
> >> > > > > > >> > > > > > The release files, including signatures, digests, etc. can be found at:
> >> > > > > > >> > > > > > https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-bin/
> >> > > > > > >> > > > > >
> >> > > > > > >> > > > > > Signatures used for Spark RCs can be found in this file:
> >> > > > > > >> > > > > > https://downloads.apache.org/spark/KEYS
> >> > > > > > >> > > > > >
> >> > > > > > >> > > > > > The staging repository for this release can be found at:
> >> > > > > > >> > > > > > https://repository.apache.org/content/repositories/orgapachespark-1508/
> >> > > > > > >> > > > > >
> >> > > > > > >> > > > > > The documentation corresponding to this release can be found at:
> >> > > > > > >> > > > > > https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-docs/
> >> > > > > > >> > > > > >
> >> > > > > > >> > > > > > The list of bug fixes going into 4.1.0 can be found at the following URL:
> >> > > > > > >> > > > > > https://issues.apache.org/jira/projects/SPARK/versions/12355581
> >> > > > > > >> > > > > >
> >> > > > > > >> > > > > > FAQ
> >> > > > > > >> > > > > >
> >> > > > > > >> > > > > > =========================
> >> > > > > > >> > > > > > How can I help test this release?
> >> > > > > > >> > > > > > =========================
> >> > > > > > >> > > > > >
> >> > > > > > >> > > > > > If you are a Spark user, you can help us test this release by taking
> >> > > > > > >> > > > > > an existing Spark workload and running on this release candidate, then
> >> > > > > > >> > > > > > reporting any regressions.
> >> > > > > > >> > > > > >
> >> > > > > > >> > > > > > If you're working in PySpark you can set up a virtual env and install
> >> > > > > > >> > > > > > the current RC via "pip install https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-bin/pyspark-4.1.0.tar.gz"
> >> > > > > > >> > > > > > and see if anything important breaks.
> >> > > > > > >> > > > > > In the Java/Scala, you can add the staging repository to your project's resolvers and test
> >> > > > > > >> > > > > > with the RC (make sure to clean up the artifact cache before/after so
> >> > > > > > >> > > > > > you don't end up building with an out of date RC going forward).

---------------------------------------------------------------------
To unsubscribe e-mail: [email protected]
