Thanks to Fokko for identifying the root cause. I checked the checksum and signature and ran the unit tests. I also checked the Iceberg tests.
+1 (non-binding).

On Tue, Sep 2, 2025 at 7:58 AM Fokko Driesprong <fo...@apache.org> wrote:

> +1 (binding)
>
> All green <https://github.com/apache/iceberg/pull/13941>. Checked the
> signatures, checksum and licenses. Thanks Gang for running this release!
>
> Kind regards,
> Fokko
>
> On Tue, Sep 2, 2025 at 15:03, Fokko Driesprong <fo...@apache.org> wrote:
>
> > Ok, ran the bisect:
> >
> > ➜ parquet-java git:(d5f86d7c) ✗ git bisect bad
> >
> > d5f86d7c0e9894510e8af6dfd37444843e6d1bc4 is the first bad commit
> >
> > commit d5f86d7c0e9894510e8af6dfd37444843e6d1bc4
> > Author: Gang Wu <ust...@gmail.com>
> > Date: Tue Jan 21 16:18:19 2025 +0800
> >
> >     GH-3133: Fix SizeStatistics to handle omitted histogram (#3134)
> >
> > .../apache/parquet/column/statistics/SizeStatistics.java | 6 ++++--
> > .../parquet/column/statistics/TestSizeStatistics.java | 16 ++++++++++++++++
> > .../format/converter/ParquetMetadataConverter.java | 10 ++++++++--
> >
> > And this makes sense to me :) I've created a PR against Trino
> > <https://github.com/trinodb/trino/pull/26511>, and got everything passing
> > with some help of Yuya <https://github.com/trinodb/trino/pull/26530>. I see
> > some more tests failing at Iceberg
> > <https://github.com/apache/iceberg/pull/13941>, which I'll dig into
> > before casting my vote.
> >
> > Kind regards,
> > Fokko
> >
> >
> > On Tue, Sep 2, 2025 at 14:30, Fokko Driesprong <fo...@apache.org> wrote:
> >
> >> Hey Rahul, Aihua,
> >>
> >> I was looking into the same thing.
> >>
> >> The PR that you're referring to was already included since 1.15.0
> >> <https://github.com/apache/parquet-java/commits/apache-parquet-1.15.0>.
> >> Iceberg currently uses Parquet 1.15.2
> >> <https://github.com/apache/iceberg/blob/76ff67c658066bd7d05ce4ce54a1d6340ee0a899/gradle/libs.versions.toml#L80>.
> >> I don't see anything obvious in the changelog
> >> <https://github.com/apache/parquet-java/releases/tag/apache-parquet-1.16.0-rc2>
> >> that might have caused the increase in size. Let me do a git bisect to
> >> find out the PR that introduced the change.
> >>
> >> Kind regards,
> >> Fokko
> >>
> >> On Tue, Sep 2, 2025 at 14:11, Rahul Sharma
> >> <rahul.sha...@databricks.com.invalid> wrote:
> >>
> >>> Hi Aihua,
> >>>
> >>> Regarding the Iceberg failure, which parquet-java version is the test
> >>> passing for? I suspect that the failure might be related to
> >>> size-statistics. Could you try running the test with
> >>> `parquet.size.statistics.enabled=false`? This flag was added in this PR
> >>> <https://github.com/apache/parquet-java/pull/3060>.
> >>>
> >>> Thanks,
> >>> Rahul
> >>>
> >>>
> >>> On Tue, Sep 2, 2025 at 3:07 AM Aihua Xu <aihu...@gmail.com> wrote:
> >>>
> >>> > Checked checksum and signature and ran unit tests.
> >>> >
> >>> > I'm also running the tests against Iceberg. Notice one failure
> >>> > <https://github.com/apache/iceberg/blob/main/spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java#L308>
> >>> > that is from Iceberg format version 3 that is writing row lineage.
> >>> > Seems the file size increases after the version upgrade and I haven’t
> >>> > yet pinpointed the exact change causing it. But I don't think that is
> >>> > a blocker for this release though.
> >>> >
> >>> > org.opentest4j.AssertionFailedError: [Did not have the expected number of files]
> >>> > expected: 20
> >>> > but was: 21
> >>> > at org.apache.iceberg.spark.actions.TestRewriteDataFilesAction.shouldHaveFiles(TestRewriteDataFilesAction.java:2144)
> >>> > at org.apache.iceberg.spark.actions.TestRewriteDataFilesAction.testBinPackAfterPartitionChange(TestRewriteDataFilesAction.java:321)
> >>> >
> >>> >
> >>> > On Mon, Sep 1, 2025 at 12:16 AM Gábor Szádovszky <ga...@apache.org> wrote:
> >>> >
> >>> > > I've checked tarball content, checksum, and signature. Executed unit
> >>> > > tests, and also some of our internal tests. All passed.
> >>> > >
> >>> > > +1 (binding)
> >>> > >
> >>> > > On Sat, Aug 30, 2025 at 8:47, Gang Wu <ust...@gmail.com> wrote:
> >>> > >
> >>> > > > Hi everyone,
> >>> > > >
> >>> > > > I propose the following RC to be released as the official Apache
> >>> > > > Parquet Java 1.16.0 release.
> >>> > > >
> >>> > > > The commit id is 402c3810c372d29603e181771acebfecc71bef61
> >>> > > > * This corresponds to the tag: apache-parquet-1.16.0-rc2
> >>> > > > * https://github.com/apache/parquet-java/tree/402c3810c372d29603e181771acebfecc71bef61
> >>> > > >
> >>> > > > The release tarball, signature, and checksums are here:
> >>> > > > * https://dist.apache.org/repos/dist/dev/parquet/apache-parquet-1.16.0-rc2
> >>> > > >
> >>> > > > You can find the KEYS file here:
> >>> > > > * https://downloads.apache.org/parquet/KEYS
> >>> > > >
> >>> > > > You can find the changelog here:
> >>> > > > * https://github.com/apache/parquet-java/releases/tag/apache-parquet-1.16.0-rc2
> >>> > > >
> >>> > > > Binary artifacts are staged in Nexus here:
> >>> > > > * https://repository.apache.org/content/groups/staging/org/apache/parquet/
> >>> > > >
> >>> > > > Please download, verify, and test.
> >>> > > >
> >>> > > > Please vote in the next 72 hours.
> >>> > > >
> >>> > > > [ ] +1 Release this as Apache Parquet Java 1.16.0
> >>> > > > [ ] +0
> >>> > > > [ ] -1 Do not release this because...
> >>> > > >
> >>> > > > Thanks,
> >>> > > > Gang
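For anyone repeating the checksum step mentioned in the votes above, here is a minimal Python sketch of one way to do it. It is not the project's official verification procedure. It assumes the checksum is a SHA-512 hex digest stored in a separate file whose first whitespace-separated token is the digest (the `sha512sum` layout); the function names and the `artifact` / `checksum_file` paths are hypothetical. It does not cover the signature check against the KEYS file.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as fh:
        # Read fixed-size chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_expected(text: str) -> str:
    """Extract the hex digest from the text of a checksum file.

    Assumption: the digest is the first whitespace-separated token,
    as in `<hex>  <filename>`. Other layouts are not handled.
    """
    return text.split()[0].lower()


def verify(artifact: Path, checksum_file: Path) -> bool:
    """Return True if the artifact's SHA-512 matches the digest in the checksum file."""
    expected = parse_expected(checksum_file.read_text())
    return sha512_of(artifact) == expected
```

Calling `verify(Path("release.tar.gz"), Path("release.tar.gz.sha512"))` with the downloaded files (hypothetical names) returns `True` only when the digests match; a `False` result means the download should not be trusted.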