" *- distributing libraries with CVE is not a good development practice*"
This version of Spark is only a minor upgrade of a maintained branch, and we
now have a newer release, 4.0, for users who need that.

Some time ago I updated FasterXML Jackson to fix a CVE:
https://github.com/apache/spark/pull/40933
It took almost a year before someone had to change it because it broke their
system: https://github.com/apache/spark/pull/49163

So I recommend not bumping the Parquet version.
Users who need a newer version of Parquet can upgrade Spark.

On Wed, 28 May 2025 at 04:47, Rozov, Vlad <vro...@amazon.com.invalid> wrote:

> I’ll go with the community vote.
>
> My take:
>
> - the backport is already available, so work was already done (if the
> issue is to open PR with the backport, I can help with that)
> - there is no downside of upgrading parquet dependency to 1.15.2 as 4.0.0
> uses upgraded dependency
> - between 1.13 and 1.15 there are bug fixes that Spark users will benefit
> from. I guess that a similar argument applies to the ORC upgrade (
> https://github.com/apache/spark/pull/50813).
> - there are confirmed performance improvements that directly impact Spark
> - 4.0.0 was just released and it will take some time before it is fully
> tested and adopted for production deployments
> - distributing libraries with CVE is not a good development practice
>
> Thank you,
>
> Vlad
>
> On May 27, 2025, at 4:21 PM, Hyukjin Kwon <gurwls...@apache.org> wrote:
>
> I am fine with backporting if we know that the CVEs actually affect Spark.
> Let's check if one of CVEs actually affects Spark, and create a backport if
> so.
> Improvements are generally not backported to old branches.
>
> On Wed, 28 May 2025 at 01:17, Rozov, Vlad <vro...@amazon.com.invalid>
> wrote:
>
>> Hi Dongjoon,
>>
>> > I guess you wanted to propose an Apache Parquet 1.15.2 backport instead.
>> Correct, that was my question: "Should parquet version be upgraded to
>> 1.15.1 or 1.15.2? There are 10 CVEs in the current 1.13.1 and even though
>> they may not impact Spark there are other improvements (better performance)
>> that will benefit Spark users.”
>>
>> IMO, it will be beneficial for Spark 3.5.x users to have parquet
>> dependency upgraded to 1.15.2. Even if Spark is not directly impacted by
>> the Parquet CVE-2025-46762 and CVE-2025-30065, as Spark distribution
>> installs vulnerable libraries, it may trigger scanner alerts on end user
>> systems.
>>
>> Should your CR be backported to 3.5 branch and included into the next
>> 3.5.7 release? Another option was to undo the revert and bump parquet
>> version from 1.15.1 to 1.15.2.
>>
>> Thank you,
>>
>> Vlad
>>
>> > On May 26, 2025, at 9:16 AM, Dongjoon Hyun <dongj...@apache.org> wrote:
>> >
>> > To Vlad. This is not correct.
>> >
>> >> the revert can now be undone.
>> >
>> > FYI, Parquet 1.15.1 was reverted not only because of the deadlock report, but
>> > also because I was informed that 1.15.1 turned out to be insufficient and
>> > 1.15.2 was in progress in the Apache Parquet community.
>> >
>> > I guess you wanted to propose an Apache Parquet 1.15.2 backport instead.
>> > For the record, I made both the revert commit and the following SPARK-51950
>> > PR. As of now, I don't see any valid reason to revert the revert commit
>> > of 1.15.1.
>> >
>> > [SPARK-51950][BUILD] Upgrade Parquet to 1.15.2
>> > https://github.com/apache/spark/pull/50755
>> >
>> > Dongjoon.
>> >
>> > On 2025/05/26 03:08:32 "Rozov, Vlad" wrote:
>> >> There is an existing PR that was reverted due to a deadlock. As the
>> >> deadlock is now fixed, the revert can now be undone.
>> >>
>> >> https://github.com/apache/spark/pull/50528
>> >> https://github.com/apache/spark/commit/eb6cc4c9ee17406cd665991489b6619f5c7689ab
>> >> https://github.com/apache/spark/pull/50810
>> >>
>> >> Thank you,
>> >>
>> >> Vlad
>> >>
>> >> On May 25, 2025, at 6:05 PM, Hyukjin Kwon <gurwls...@apache.org> wrote:
>> >>
>> >> We should probably avoid backporting it for improvements, but if there is
>> >> a CVE that directly affects Spark, let's upgrade.
>> >>
>> >> On Mon, 26 May 2025 at 00:27, Rozov, Vlad <vro...@amazon.com.invalid> wrote:
>> >> Should parquet version be upgraded to 1.15.1 or 1.15.2? There are 10
>> >> CVEs in the current 1.13.1 and even though they may not impact Spark there
>> >> are other improvements (better performance) that will benefit Spark users.
>> >>
>> >> Thank you,
>> >>
>> >> Vlad
>> >>
>> >> On May 24, 2025, at 8:02 PM, Hyukjin Kwon <gurwls...@apache.org> wrote:
>> >>
>> >> Oh let me check. Thanks for letting me know.
>> >>
>> >> On Sun, May 25, 2025 at 12:00 PM Dongjoon Hyun <dongj...@apache.org> wrote:
>> >> I saw 38 commits to make this work. Thank you for driving this,
>> Hyukjin.
>> >>
>> >> BTW, your key seems to be new and is not in
>> https://dist.apache.org/repos/dist/dev/spark/KEYS yet. Could you
>> double-check?
>> >>
>> >> $ curl -LO https://dist.apache.org/repos/dist/dev/spark/KEYS
>> >> $ gpg --import KEYS
>> >> $ gpg --verify spark-3.5.6-bin-hadoop3.tgz.asc
>> >> gpg: assuming signed data in 'spark-3.5.6-bin-hadoop3.tgz'
>> >> gpg: Signature made Thu May 22 23:49:54 2025 PDT
>> >> gpg:                using RSA key 0FE4571297AB84440673665669600C8338F65970
>> >> gpg:                issuer "gurwls...@apache.org"
>> >> gpg: Can't check signature: No public key
>> >>
>> >> Dongjoon.
>> >>
>> >> On 2025/05/23 17:56:25 Allison Wang wrote:
>> >>> +1
>> >>>
>> >>> On Fri, May 23, 2025 at 10:15 AM Hyukjin Kwon <gurwls...@apache.org> wrote:
>> >>>
>> >>>> Oh, it's actually both a test and a real release. Let me know if you have
>> >>>> any concerns!
>> >>>>
>> >>>> On Fri, May 23, 2025 at 11:25 PM Mridul Muralidharan <mri...@gmail.com>
>> >>>> wrote:
>> >>>>
>> >>>>> Hi Hyukjin,
>> >>>>>
>> >>>>>  This thread is to test the automated release, right ?
>> >>>>> Not to actually release it ?
>> >>>>>
>> >>>>> Regards,
>> >>>>> Mridul
>> >>>>>
>> >>>>> On Fri, May 23, 2025 at 8:26 AM Ruifeng Zheng <ruife...@apache.org>
>> >>>>> wrote:
>> >>>>>
>> >>>>>> +1
>> >>>>>>
>> >>>>>> On Fri, May 23, 2025 at 5:27 PM Hyukjin Kwon <gurwls...@apache.org>
>> >>>>>> wrote:
>> >>>>>>
>> >>>>>>> Please vote on releasing the following candidate as Apache Spark
>> >>>>>>> version 3.5.6.
>> >>>>>>>
>> >>>>>>> The vote is open until May 27 (PST) and passes if a majority +1 PMC
>> >>>>>>> votes are cast, with a minimum of 3 +1 votes.
>> >>>>>>>
>> >>>>>>> [ ] +1 Release this package as Apache Spark 3.5.6
>> >>>>>>> [ ] -1 Do not release this package because ...
>> >>>>>>>
>> >>>>>>> To learn more about Apache Spark, please see https://spark.apache.org/
>> >>>>>>>
>> >>>>>>> The tag to be voted on is v3.5.6-rc5 (commit
>> >>>>>>> 303c18c74664f161b9b969ac343784c088b47593):
>> >>>>>>> https://github.com/apache/spark/tree/303c18c74664f161b9b969ac343784c088b47593
>> >>>>>>>
>> >>>>>>> The release files, including signatures, digests, etc. can be found at:
>> >>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.5.6-rc1-bin/
>> >>>>>>>
>> >>>>>>> Signatures used for Spark RCs can be found in this file:
>> >>>>>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>> >>>>>>>
>> >>>>>>> The staging repository for this release can be found at:
>> >>>>>>> https://repository.apache.org/content/repositories/orgapachespark-1495/
>> >>>>>>>
>> >>>>>>> The documentation corresponding to this release can be found at:
>> >>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.5.6-rc1-docs/
>> >>>>>>>
>> >>>>>>> The list of bug fixes going into 3.5.6 can be found at the following
>> >>>>>>> URL:
>> >>>>>>> https://issues.apache.org/jira/projects/SPARK/versions/12355703
>> >>>>>>>
>> >>>>>>> FAQ
>> >>>>>>>
>> >>>>>>> =========================
>> >>>>>>> How can I help test this release?
>> >>>>>>> =========================
>> >>>>>>>
>> >>>>>>> If you are a Spark user, you can help us test this release by taking
>> >>>>>>> an existing Spark workload and running on this release candidate, then
>> >>>>>>> reporting any regressions.
>> >>>>>>>
>> >>>>>>> If you're working in PySpark you can set up a virtual env and install
>> >>>>>>> the current RC via
>> >>>>>>> "pip install https://dist.apache.org/repos/dist/dev/spark/v3.5.6-rc1-bin/pyspark-3.5.6.tar.gz"
>> >>>>>>> and see if anything important breaks.
>> >>>>>>> In Java/Scala, you can add the staging repository to your project's
>> >>>>>>> resolvers and test
>> >>>>>>> with the RC (make sure to clean up the artifact cache before/after so
>> >>>>>>> you don't end up building with an out-of-date RC going forward).
>> >>>>>>>
>> >>>>>>
>> >>>
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>> >>
>> >>
>> >>
>> >>
>> >
>> >
>>
>>
>

-- 
Bjørn Jørgensen
Vestre Aspehaug 4, 6010 Ålesund
Norge

+47 480 94 297
