Hi Everyone,

I would like to report a performance regression we've identified in Spark 
queries on Iceberg tables stored in cloud storage (tested with GCS), which I 
believe should be addressed in the 1.11.0 release.

The current SerializableFileIOWithSize drops the file length, which causes a 
performance regression due to excessive metadata calls to cloud storage: 
https://github.com/apache/iceberg/issues/16283. The fix overrides InputFile 
newInputFile(String path, long length) to preserve the file length and avoid 
the unnecessary metadata calls: https://github.com/apache/iceberg/pull/16284
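
To illustrate the mechanism, here is a minimal, self-contained sketch. It is 
not the code from the PR: the InputFile and FileIO interfaces below are 
simplified stand-ins, and CountingIO, DroppingWrapper and PreservingWrapper 
are hypothetical names made up for this example. It assumes the two-argument 
overload falls back to the one-argument overload by default, so a wrapper that 
does not override it silently loses the length.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LengthPreservingDemo {
  // Simplified stand-ins for the real interfaces (assumption, not the Iceberg API).
  public interface InputFile {
    long getLength();
  }

  public interface FileIO {
    InputFile newInputFile(String path);

    // Default falls back to the one-argument overload and discards the length.
    default InputFile newInputFile(String path, long length) {
      return newInputFile(path);
    }
  }

  // Simulated cloud storage IO that counts metadata lookups.
  public static class CountingIO implements FileIO {
    public final AtomicInteger metadataCalls = new AtomicInteger();

    @Override
    public InputFile newInputFile(String path) {
      return () -> {
        metadataCalls.incrementAndGet(); // stands in for a remote metadata request
        return 1024L;
      };
    }

    @Override
    public InputFile newInputFile(String path, long length) {
      return () -> length; // length is already known, no lookup needed
    }
  }

  // Wrapper that only forwards the one-argument overload: the length is dropped.
  public static class DroppingWrapper implements FileIO {
    private final FileIO delegate;

    public DroppingWrapper(FileIO delegate) {
      this.delegate = delegate;
    }

    @Override
    public InputFile newInputFile(String path) {
      return delegate.newInputFile(path);
    }
  }

  // Wrapper that also forwards the two-argument overload: the length is preserved.
  public static class PreservingWrapper implements FileIO {
    private final FileIO delegate;

    public PreservingWrapper(FileIO delegate) {
      this.delegate = delegate;
    }

    @Override
    public InputFile newInputFile(String path) {
      return delegate.newInputFile(path);
    }

    @Override
    public InputFile newInputFile(String path, long length) {
      return delegate.newInputFile(path, length);
    }
  }

  public static void main(String[] args) {
    CountingIO a = new CountingIO();
    new DroppingWrapper(a).newInputFile("gs://bucket/data.parquet", 1024L).getLength();
    System.out.println("dropping wrapper metadata calls: " + a.metadataCalls.get()); // 1

    CountingIO b = new CountingIO();
    new PreservingWrapper(b).newInputFile("gs://bucket/data.parquet", 1024L).getLength();
    System.out.println("preserving wrapper metadata calls: " + b.metadataCalls.get()); // 0
  }
}
```

In this sketch each getLength() call through the dropping wrapper costs one 
simulated metadata request, while the preserving wrapper costs none. With many 
data files per scan, that per-file request is what adds up.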


On 2026/05/08 15:27:05 Péter Váry wrote:
> Just to clarify:
> 
> The following PRs are already merged to 1.11.0:
> 
>    - https://github.com/apache/iceberg/pull/14297 - Spark: Support writing
>    shredded variant in Iceberg-Spark
>    - https://github.com/apache/iceberg/pull/15512 - Spark: fix delete from
>    branch for canDeleteWhere where it does not resolve to the correct branch -
>    WAP fix
>    - https://github.com/apache/iceberg/pull/15475 - Flink: Add Nanosecond
>    Precision Support for Flink-Iceberg Integration
> 
> 
> The missing ones are the ones backporting those to other engine versions:
> 
>    - For: 14297 <https://github.com/apache/iceberg/pull/14297>:
>       - 16241 <https://github.com/apache/iceberg/pull/16241> - Backport for
>       variant shredding in Spark 4.0
>    - For: 15512 <https://github.com/apache/iceberg/pull/15512>:
>       - 16245 <https://github.com/apache/iceberg/pull/16245> - Spark:
>       backport PR #15512 to v3.4, v3.5, v4.0 for WAP branch delete fix
>    - For: 15475 <https://github.com/apache/iceberg/pull/15475>:
>       - #16183 <https://github.com/apache/iceberg/pull/16183>,  #16239
>       <https://github.com/apache/iceberg/pull/16239>, #16240
>       <https://github.com/apache/iceberg/pull/16240> - Backport for Nano
>       timestamps for Flink 2.0/1.20
> 
> 
> So the PRs needed on 1.11.0 are:
> https://github.com/apache/iceberg/pull/16241
> https://github.com/apache/iceberg/pull/16245
> https://github.com/apache/iceberg/pull/16183
> https://github.com/apache/iceberg/pull/16239
> https://github.com/apache/iceberg/pull/16240
> https://github.com/apache/iceberg/pull/16186
> 
> Aihua Xu <[email protected]> wrote (on Fri, May 8, 2026, 17:13):
> 
> > Thank you all for the feedback and for verifying the release candidate.
> > Based on the issues identified above, we will include the following fixes
> > and cut RC2 with a new vote:
> >
> > https://github.com/apache/iceberg/pull/14297
> > https://github.com/apache/iceberg/pull/15512
> > https://github.com/apache/iceberg/pull/15475
> > https://github.com/apache/iceberg/pull/16186
> >
> > Please let me know if you have any questions or identified additional
> > issues.
> >
> > Thanks,
> > Aihua
> >
> > On Thu, May 7, 2026 at 10:09 PM Aihua Xu <[email protected]> wrote:
> >
> >> I also looked into this. There is a configuration
> >> gcs.analytics-core.enabled to enable/disable GCS Analytics Core. The
> >> current implementation always requires runtime dependency of GCS Analytics
> >> Core even if the configuration is off. Ideally we can lazy load such
> >> dependency so the dependency is only required when the feature is
> >> explicitly enabled. But since GCP is likely to enable GCS Analytics Core by
> >> default, I feel it's reasonable for downstream projects using non-bundle
> >> jars to add this dependency.
> >>
> >>
> >> On Thu, May 7, 2026 at 6:54 PM Steven Wu <[email protected]> wrote:
> >>
> >>> Looked a little more.
> >>>
> >>> So Iceberg's cloud modules consistently use compileOnly for vendor SDKs
> >>> and rely on either the bundle artifact or downstream coordination for
> >>> runtime. So, both changes are expected for downstream consumers using the
> >>> non-bundle jars. Maybe we don't need to change anything.
> >>>
> >>> iceberg-gcp module
> >>>
> >>> compileOnly platform(libs.google.libraries.bom)
> >>> compileOnly "com.google.cloud:google-cloud-storage"
> >>> compileOnly "com.google.cloud:google-cloud-kms"
> >>> compileOnly(libs.gcs.analytics.core)
> >>>
> >>>
> >>> On Thu, May 7, 2026 at 6:16 PM Steven Wu <[email protected]> wrote:
> >>>
> >>>> Yuya, thanks for reporting the discovery.
> >>>>
> >>>> Azure: I approved your PR and can merge it soon:
> >>>> https://github.com/apache/iceberg/pull/16186
> >>>> GCP: the new dependency is marked as compileOnly in PR 14333
> >>>> <https://github.com/apache/iceberg/pull/14333>, as it is an opt-in
> >>>> feature. We need to either change the dep to implementation or update the
> >>>> code similar to the Azure fix above.
> >>>>
> >>>>
> >>>> On Thu, May 7, 2026 at 4:07 PM Yuya Ebihara <
> >>>> [email protected]> wrote:
> >>>>
> >>>>> Hi Aihua,
> >>>>>
> >>>>> Thanks for leading the release!
> >>>>>
> >>>>> Just a quick reminder about two dependency-related items from a
> >>>>> downstream perspective:
> >>>>> * Azure module users will require azure-security-keyvault-keys, even
> >>>>> when table encryption is not used, as noted in
> >>>>> https://github.com/apache/iceberg/pull/16186
> >>>>> * GCS module users will require gcs-analytics-core
> >>>>>
> >>>>> I ran into CI failures with 1.11.0 in Trino because the project does
> >>>>> not use the azure-bundle or gcp-bundle modules.
> >>>>> The CI passed once we explicitly added these two dependencies.
> >>>>>
> >>>>> Thanks,
> >>>>> Yuya Ebihara
> >>>>>
> >>>>> On Fri, May 8, 2026 at 4:58 AM Péter Váry <[email protected]>
> >>>>> wrote:
> >>>>>
> >>>>>> First of all, thanks to everyone for the effort put into preparing
> >>>>>> this release!
> >>>>>>
> >>>>>> I would like to highlight that RC1 is built from a branch where the
> >>>>>> following features have not been backported to all engine versions:
> >>>>>> - Spark: Support writing shredded variant in Iceberg-Spark (
> >>>>>> https://github.com/apache/iceberg/pull/14297) - Available in Spark
> >>>>>> 4.1, but not in Spark 4.0
> >>>>>> - Spark: fix delete from branch for canDeleteWhere where it does not
> >>>>>> resolve to the correct branch (
> >>>>>> https://github.com/apache/iceberg/pull/15512) - Available in Spark
> >>>>>> 4.1, but not in Spark 4.0, 3.5, or 3.4
> >>>>>> - Flink: Add Nanosecond Precision Support for Flink-Iceberg
> >>>>>> Integration (https://github.com/apache/iceberg/pull/15475) -
> >>>>>> Available in Flink 2.1, but not in Flink 2.0 or 1.20
> >>>>>>
> >>>>>> It is up to the community to decide whether these missing backports
> >>>>>> should be considered release blockers. Most of the corresponding PRs have
> >>>>>> already been merged to main (except #15512), and including them in the
> >>>>>> release should be relatively straightforward.
> >>>>>>
> >>>>>> From my perspective, I would prefer not to release with these gaps.
> >>>>>> That said, I understand the urgency and the need for a release, and I am
> >>>>>> happy to go with the community’s decision.
> >>>>>>
> >>>>>> Peter
> >>>>>>
> >>>>>> Aihua Xu <[email protected]> wrote (on Thu, May 7, 2026, 18:26):
> >>>>>>
> >>>>>>> Hi Everyone,
> >>>>>>>
> >>>>>>> I propose that we release the following RC as the official Apache
> >>>>>>> Iceberg 1.11.0 release.
> >>>>>>>
> >>>>>>> The commit ID is 0f657edf12dc29f8487a679bfdd4210e9588d014
> >>>>>>> * This corresponds to the tag: apache-iceberg-1.11.0-rc1
> >>>>>>> *
> >>>>>>> https://github.com/apache/iceberg/commits/apache-iceberg-1.11.0-rc1
> >>>>>>> *
> >>>>>>> https://github.com/apache/iceberg/tree/0f657edf12dc29f8487a679bfdd4210e9588d014
> >>>>>>>
> >>>>>>> The release tarball, signature, and checksums are here:
> >>>>>>> *
> >>>>>>> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.11.0-rc1
> >>>>>>>
> >>>>>>> You can find the KEYS file here:
> >>>>>>> * https://downloads.apache.org/iceberg/KEYS
> >>>>>>>
> >>>>>>> Convenience binary artifacts are staged on Nexus. The Maven
> >>>>>>> repository URL is:
> >>>>>>> *
> >>>>>>> https://repository.apache.org/content/repositories/orgapacheiceberg-1278/
> >>>>>>>
> >>>>>>> Please download, verify, and test.
> >>>>>>>
> >>>>>>> Instructions for verifying a release can be found here:
> >>>>>>> * https://iceberg.apache.org/how-to-release/#how-to-verify-a-release
> >>>>>>>
> >>>>>>> Please vote in the next 72 hours.
> >>>>>>>
> >>>>>>> [ ] +1 Release this as Apache Iceberg 1.11.0
> >>>>>>> [ ] +0
> >>>>>>> [ ] -1 Do not release this because...
> >>>>>>>
> >>>>>>> Only PMC members have binding votes, but other community members are
> >>>>>>> encouraged to cast
> >>>>>>> non-binding votes. This vote will pass if there are 3 binding +1
> >>>>>>> votes and more binding
> >>>>>>> +1 votes than -1 votes.
> >>>>>>>
> >>>>>>>
> 
