I don't think PARQUET-2432 itself is at fault. It appears to have triggered a deadlock like the one in https://github.com/apache/spark/pull/50594. I'd suggest that we fix forward if possible.

Thanks,
Manu

On Mon, Apr 21, 2025 at 11:19 PM Rozov, Vlad <vro...@amazon.com.invalid> wrote:

> The deadlock is reproducible without Parquet. Please see
> https://github.com/apache/spark/pull/50594.
>
> Thank you,
>
> Vlad
>
> On Apr 21, 2025, at 1:59 AM, Cheng Pan <pan3...@gmail.com> wrote:
>
> The deadlock is introduced by PARQUET-2432(1.14.0), if we decide
> downgrade, the latest workable version is Parquet 1.13.1.
>
> Thanks,
> Cheng Pan
>
>
>
> On Apr 21, 2025, at 16:53, Wenchen Fan <cloud0...@gmail.com> wrote:
>
> +1 to downgrade to Parquet 1.15.0 for Spark 4.0. According to
> https://github.com/apache/spark/pull/50583#issuecomment-2815243571 , the
> Parquet CVE does not affect Spark.
>
> On Mon, Apr 21, 2025 at 2:45 PM Hyukjin Kwon <gurwls...@apache.org> wrote:
>
>> That's nice but we need to wait for them to release, and upgrade right?
>> Let's revert the parquet upgrade out of 4.0 branch since we're not directly
>> affected by the CVE anyway.
>>
>> On Mon, 21 Apr 2025 at 15:42, Yuming Wang <yumw...@apache.org> wrote:
>>
>>> It seems this patch(https://github.com/apache/parquet-java/pull/3196)
>>> can avoid deadlock issue if using Parquet 1.15.1.
>>>
>>> On Wed, Apr 16, 2025 at 5:39 PM Niranjan Jayakar
>>> <n...@databricks.com.invalid> wrote:
>>>
>>>> I found another bug introduced in 4.0 that breaks Spark connect client
>>>> x server compatibility: https://github.com/apache/spark/pull/50604.
>>>>
>>>> Once merged, this should be included in the next RC.
>>>>
>>>> On Thu, Apr 10, 2025 at 5:21 PM Wenchen Fan <cloud0...@gmail.com>
>>>> wrote:
>>>>
>>>>> Please vote on releasing the following candidate as Apache Spark
>>>>> version 4.0.0.
>>>>>
>>>>> The vote is open until April 15 (PST) and passes if a majority +1 PMC
>>>>> votes are cast, with a minimum of 3 +1 votes.
>>>>>
>>>>> [ ] +1 Release this package as Apache Spark 4.0.0
>>>>> [ ] -1 Do not release this package because ...
>>>>>
>>>>> To learn more about Apache Spark, please see https://spark.apache.org/
>>>>>
>>>>> The tag to be voted on is v4.0.0-rc4 (commit
>>>>> e0801d9d8e33cd8835f3e3beed99a3588c16b776)
>>>>> https://github.com/apache/spark/tree/v4.0.0-rc4
>>>>>
>>>>> The release files, including signatures, digests, etc. can be found at:
>>>>> https://dist.apache.org/repos/dist/dev/spark/v4.0.0-rc4-bin/
>>>>>
>>>>> Signatures used for Spark RCs can be found in this file:
>>>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>>>
>>>>> The staging repository for this release can be found at:
>>>>> https://repository.apache.org/content/repositories/orgapachespark-1480/
>>>>>
>>>>> The documentation corresponding to this release can be found at:
>>>>> https://dist.apache.org/repos/dist/dev/spark/v4.0.0-rc4-docs/
>>>>>
>>>>> The list of bug fixes going into 4.0.0 can be found at the following
>>>>> URL:
>>>>> https://issues.apache.org/jira/projects/SPARK/versions/12353359
>>>>>
>>>>> This release is using the release script of the tag v4.0.0-rc4.
>>>>>
>>>>> FAQ
>>>>>
>>>>> =========================
>>>>> How can I help test this release?
>>>>> =========================
>>>>>
>>>>> If you are a Spark user, you can help us test this release by taking
>>>>> an existing Spark workload and running on this release candidate, then
>>>>> reporting any regressions.
>>>>>
>>>>> If you're working in PySpark you can set up a virtual env and install
>>>>> the current RC and see if anything important breaks, in the Java/Scala
>>>>> you can add the staging repository to your projects resolvers and test
>>>>> with the RC (make sure to clean up the artifact cache before/after so
>>>>> you don't end up building with a out of date RC going forward).