Thank you all for your new feedback on RC3. 

I am concluding this RC3 vote as not passed again and preparing RC4.

RC4 is RC3 plus the following patches, which have landed on branch-4.1 so far.
Please let me know if you need more patches.

SPARK-54696 Clean-up ArrowBuffers in Connect
SPARK-54686 Relax DSv2 table checks in temp views to allow new top-level columns
SPARK-53991 Enforce KLL_SKETCH_AGG_GET_RANK/QUANTILE arguments are foldable
SPARK-54692 Add python_worker_logs tvf doc to API reference
SPARK-54683 Unify geo and time types blocking
SPARK-54689 Make `org.apache.spark.sql.pipelines` internal package and make `EstimatorUtils` private
SPARK-54695 StandaloneDynamicAllocationSuite.syncExecutors should ensure executors have fully setup

Dongjoon Hyun.

On 2025/12/15 14:59:32 Herman van Hovell via dev wrote:
> I pasted a non-existing link for the root cause. The actual link is here:
> https://issues.apache.org/jira/browse/SPARK-53342
> 
> 
> On Mon, Dec 15, 2025 at 10:47 AM Herman van Hovell <[email protected]> wrote:
> 
> > Hey Dongjoon,
> >
> > Regarding your questions.
> >
> >    1. If you define a large-ish local relation (which makes us cache it
> >    on the server side) and keep using it, then we leak off-heap memory
> >    every time it is used. At some point the OS will OOM kill the driver.
> >    While I have a repro, testing it like this in CI is not a good idea. As
> >    an alternative I am working on a test that checks buffer clean-up. For
> >    the record I don't appreciate the term `claim` here; I am not blocking a
> >    release without genuine concern.
> >    2. The root cause is
> >    https://databricks.atlassian.net/browse/SPARK-53342 and not the large
> >    local relations work.
> >    3. A PR has been open since Friday:
> >    https://github.com/apache/spark/pull/53452. I hope that I can get it
> >    merged today.
> >    4. I don't see a reason why.
> >
> > Cheers,
> > Herman
> >
> > On Mon, Dec 15, 2025 at 5:47 AM Dongjoon Hyun <[email protected]> wrote:
> >
> >> How can we verify the regression, Herman?
> >>
> >> It's a little difficult for me to evaluate your claim so far due to the
> >> lack of shared information. Specifically, there has been no update for the
> >> last 3 days on "SPARK-54696 (Spark Connect LocalRelation support leak
> >> off-heap memory)" after you created it.
> >>
> >> Could you provide us more technical information about your Spark Connect
> >> issue?
> >>
> >> 1. How can we reproduce your claim? Do you have a test case?
> >>
> >> 2. For the root cause, I'm wondering if you are saying literally
> >> SPARK-53917 (Support large local relations) or another JIRA issue. Which
> >> commit is the root cause?
> >>
> >> 3. Since you have had SPARK-54696 assigned to yourself for the last 3 days,
> >> do you want to provide a PR soon?
> >>
> >> 4. If you need more time, shall we simply revert the root cause from
> >> Apache Spark 4.1.0?
> >>
> >> Thanks,
> >> Dongjoon
> >>
> >> On 2025/12/14 23:29:59 Herman van Hovell via dev wrote:
> >> > Yes. It is a regression in Spark 4.1. The root cause is a change where we
> >> > fail to clean up allocated (off-heap) buffers.
> >> >
> >> > On Sun, Dec 14, 2025 at 4:25 AM Dongjoon Hyun <[email protected]> wrote:
> >> >
> >> > > Hi, Herman.
> >> > >
> >> > > Do you mean that it is a regression in Apache Spark 4.1.0?
> >> > >
> >> > > If so, do you know what the root cause was?
> >> > >
> >> > > Dongjoon.
> >> > >
> >> > > On 2025/12/13 23:09:02 Herman van Hovell via dev wrote:
> >> > > > -1. We need to get https://issues.apache.org/jira/browse/SPARK-54696 fixed.
> >> > > >
> >> > > > On Sat, Dec 13, 2025 at 11:07 AM Jules Damji <[email protected]> wrote:
> >> > > >
> >> > > > > +1 non-binding
> >> > > > > —
> >> > > > > Sent from my iPhone
> >> > > > > Pardon the dumb thumb typos :)
> >> > > > >
> >> > > > > > On Dec 11, 2025, at 8:34 AM, [email protected] wrote:
> >> > > > > >
> >> > > > > > Please vote on releasing the following candidate as Apache Spark
> >> > > > > > version 4.1.0.
> >> > > > > >
> >> > > > > > The vote is open until Sun, 14 Dec 2025 09:34:31 PST and passes if a
> >> > > > > > majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
> >> > > > > >
> >> > > > > > [ ] +1 Release this package as Apache Spark 4.1.0
> >> > > > > > [ ] -1 Do not release this package because ...
> >> > > > > >
> >> > > > > > To learn more about Apache Spark, please see https://spark.apache.org/
> >> > > > > >
> >> > > > > > The tag to be voted on is v4.1.0-rc3 (commit e221b56be7b):
> >> > > > > > https://github.com/apache/spark/tree/v4.1.0-rc3
> >> > > > > >
> >> > > > > > The release files, including signatures, digests, etc. can be found at:
> >> > > > > > https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-bin/
> >> > > > > >
> >> > > > > > Signatures used for Spark RCs can be found in this file:
> >> > > > > > https://downloads.apache.org/spark/KEYS
> >> > > > > >
> >> > > > > > The staging repository for this release can be found at:
> >> > > > > > https://repository.apache.org/content/repositories/orgapachespark-1508/
> >> > > > > >
> >> > > > > > The documentation corresponding to this release can be found at:
> >> > > > > > https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-docs/
> >> > > > > >
> >> > > > > > The list of bug fixes going into 4.1.0 can be found at the following URL:
> >> > > > > > https://issues.apache.org/jira/projects/SPARK/versions/12355581
> >> > > > > >
> >> > > > > > FAQ
> >> > > > > >
> >> > > > > > =========================
> >> > > > > > How can I help test this release?
> >> > > > > > =========================
> >> > > > > >
> >> > > > > > If you are a Spark user, you can help us test this release by taking
> >> > > > > > an existing Spark workload and running on this release candidate, then
> >> > > > > > reporting any regressions.
> >> > > > > >
> >> > > > > > If you're working in PySpark you can set up a virtual env and install
> >> > > > > > the current RC via "pip install
> >> > > > > > https://dist.apache.org/repos/dist/dev/spark/v4.1.0-rc3-bin/pyspark-4.1.0.tar.gz"
> >> > > > > > and see if anything important breaks.
> >> > > > > > In Java/Scala, you can add the staging repository to your project's
> >> > > > > > resolvers and test with the RC (make sure to clean up the artifact cache
> >> > > > > > before/after so you don't end up building with an out of date RC going
> >> > > > > > forward).
> >> > > > > >
> >> > > > > >
> >> ---------------------------------------------------------------------
> >> > > > > > To unsubscribe e-mail: [email protected]
> >> > > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > >
> >> > >
> >> > >
> >> > >
> >> >
> >>
> >>
> >>
> 

