Re: Iceberg 1.10.0 release update - August 2025

Steven Wu Thu, 07 Aug 2025 14:56:25 -0700

edited the subject line as we are into August.

We are still waiting for the following two changes for the 1.10.0 release:
* Anton's fix for the data frame join using the same snapshot, which will
introduce a slight behavior change in Spark 4.0.
* unknown type support.



On Fri, Aug 1, 2025 at 6:56 AM Alexandre Dutra <[email protected]> wrote:

> Hi Steven,
>
> A small regression with S3 signing has been reported to me. The fix is
> simple:
>
> https://github.com/apache/iceberg/pull/13718
>
> Would it still be possible to have it in 1.10, please?
>
> Thanks,
> Alex
>
>
> On Thu, Jul 31, 2025 at 7:19 PM Steven Wu <[email protected]> wrote:
> >
> > Currently, the 1.10.0 milestone has no open PRs
> > https://github.com/apache/iceberg/milestone/54
> >
> > The variant PRs were merged this week and last week. There are still some
> variant testing related PRs, which are probably not blockers for the 1.10.0
> release.
> > * Spark variant read: https://github.com/apache/iceberg/pull/13219
> > * use short strings: https://github.com/apache/iceberg/pull/13284
> >
> > We are still waiting for the following two changes
> > * Anton's fix for the data frame join using the same snapshot, which
> will introduce a slight behavior change in Spark 4.0.
> > * unknown type support. Fokko raised a discussion thread on a blocking
> issue.
> >
> > Did I miss anything else?
> >
> >
> >
> > On Sat, Jul 26, 2025 at 5:52 AM Fokko Driesprong <[email protected]>
> wrote:
> >>
> >> Hey all,
> >>
> >> The read path for the UnknownType needs some community discussion. I've
> raised a separate thread. PTAL
> >>
> >> Kind regards from Belgium,
> >> Fokko
> >>
> >> On Sat, Jul 26, 2025 at 12:58 AM Ryan Blue <[email protected]> wrote:
> >>>
> >>> I thought that we said we wanted to get support out for v3 features in
> this release unless there is some reasonable blocker, like Spark not having
> geospatial types. To me, I think that means we should aim to get variant
> and unknown done so that we have a complete implementation with a major
> engine. And it should not be particularly difficult to get unknown done so
> I'd opt to get it in.
> >>>
> >>> On Fri, Jul 25, 2025 at 11:28 AM Steven Wu <[email protected]>
> wrote:
> >>>>
> >>>> > I believe we also wanted to get in at least the read path for
> UnknownType. Fokko has a WIP PR for that.
> >>>> I thought in the community sync the consensus was that this is not a
> blocker, because it is a new feature implementation. If it is ready, it
> will be included.
> >>>>
> >>>> On Fri, Jul 25, 2025 at 9:43 AM Kevin Liu <[email protected]>
> wrote:
> >>>>>
> >>>>> I think Fokko's OOO. Should we help with that PR?
> >>>>>
> >>>>> On Fri, Jul 25, 2025 at 9:38 AM Eduard Tudenhöfner <
> [email protected]> wrote:
> >>>>>>
> >>>>>> I believe we also wanted to get in at least the read path for
> UnknownType. Fokko has a WIP PR for that.
> >>>>>>
> >>>>>> On Fri, Jul 25, 2025 at 6:13 PM Steven Wu <[email protected]>
> wrote:
> >>>>>>>
> >>>>>>> 3. Spark: fix data frame join based on different versions of the
> same table that may lead to weird results. Anton is working on a fix. It
> requires a small behavior change (table state may be stale up to refresh
> interval). Hence it is better to include it in the 1.10.0 release where
> Spark 4.0 is first supported.
> >>>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks this is very
> close and will prioritize the review.
> >>>>>>>
> >>>>>>> We still have the above two issues pending. 3 doesn't have a PR
> yet. PR for 4 is not associated with the milestone yet.
> >>>>>>>
> >>>>>>> On Fri, Jul 25, 2025 at 9:02 AM Kevin Liu <[email protected]>
> wrote:
> >>>>>>>>
> >>>>>>>> Thanks everyone for the review. The 2 PRs are both merged.
> >>>>>>>> Looks like there's only 1 PR left in the 1.10 milestone :)
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Kevin Liu
> >>>>>>>>
> >>>>>>>> On Thu, Jul 24, 2025 at 7:44 PM Manu Zhang <
> [email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>> Thanks Kevin. The first change is not in the versioned doc so it
> can be released anytime.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Manu
> >>>>>>>>>
> >>>>>>>>> On Fri, Jul 25, 2025 at 4:21 AM Kevin Liu <[email protected]>
> wrote:
> >>>>>>>>>>
> >>>>>>>>>> The 3 PRs above are merged. Thanks everyone for the review.
> >>>>>>>>>>
> >>>>>>>>>> I've added 2 more PRs to the 1.10 milestone. These are both
> nice-to-haves.
> >>>>>>>>>> - docs: add subpage for REST Catalog Spec in "Specification"
> #13521
> >>>>>>>>>> - REST-Fixture: Ensure strict mode on jdbc catalog for rest
> fixture #13599
> >>>>>>>>>>
> >>>>>>>>>> The first one changes the link for "REST Catalog Spec" on the
> left nav of https://iceberg.apache.org/spec/ from the swagger.io link to
> a dedicated page for IRC.
> >>>>>>>>>> The second one fixes the default behavior of
> `iceberg-rest-fixture` image to align with the general expectation when
> creating a table in a catalog.
> >>>>>>>>>>
> >>>>>>>>>> Please take a look. I would like to have both of these as part
> of the 1.10 release.
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Kevin Liu
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Jul 23, 2025 at 1:31 PM Kevin Liu <
> [email protected]> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Here are the 3 PRs to add corresponding tests.
> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13648
> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13649
> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13650
> >>>>>>>>>>>
> >>>>>>>>>>> I've tagged them with the 1.10 milestone, waiting for CI to
> complete :)
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Kevin Liu
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Jul 23, 2025 at 1:08 PM Steven Wu <
> [email protected]> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Kevin, thanks for checking that. I will take a look at your
> backport PRs. Can you add them to the 1.10.0 milestone?
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Jul 23, 2025 at 12:27 PM Kevin Liu <
> [email protected]> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks again for driving this Steven! We're very close!!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> As mentioned in the community sync today, I wanted to verify
> feature parity between Spark 3.5 and Spark 4.0 for this release.
> >>>>>>>>>>>>> I was able to verify that Spark 3.5 and Spark 4.0 have
> feature parity for this upcoming release. More details in the other devlist
> thread https://lists.apache.org/thread/7x7xcm3y87y81c4grq4nn9gdjd4jm05f
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> Kevin Liu
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, Jul 23, 2025 at 12:17 PM Steven Wu <
> [email protected]> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Another update on the release.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The existing blocker PRs are almost done.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> During today's community sync, we identified the following
> issues/PRs to be included in the 1.10.0 release.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 1. backport of PR 13100 to the main branch. I have created a
> cherry-pick PR for that. There is a one line difference compared to the
> original PR due to the removal of the deprecated RemoveSnapshot class in
> main branch for 1.10.0 target. Amogh has suggested using RemoveSnapshots
> with a single snapshot id, which should be supported by all REST catalog
> servers.
> >>>>>>>>>>>>>> 2. Flink compaction doesn't support row lineage. Fail the
> compaction for V3 tables. I created a PR for that. Will backport after it
> is merged.
> >>>>>>>>>>>>>> 3. Spark: fix data frame join based on different versions of
> the same table that may lead to weird results. Anton is working on a fix.
> It requires a small behavior change (table state may be stale up to refresh
> interval). Hence it is better to include it in the 1.10.0 release where
> Spark 4.0 is first supported.
> >>>>>>>>>>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks this is
> very close and will prioritize the review.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>> steven
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The 1.10.0 milestone can be found here.
> >>>>>>>>>>>>>> https://github.com/apache/iceberg/milestone/54
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 9:15 AM Steven Wu <
> [email protected]> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Ajantha/Robin, thanks for the note. We can include the PR
> in the 1.10.0 milestone.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 3:20 AM Robin Moffatt
> <[email protected]> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks Ajantha. Just to confirm, from a Confluent point
> of view, we will not be able to publish the connector on Confluent Hub
> until this CVE[1] is fixed.
> >>>>>>>>>>>>>>>> Since we would not publish a snapshot build, if the fix
> doesn't make it into 1.10 then we'd have to wait for 1.11 (or a dot release
> of 1.10) to be able to include the connector on Confluent Hub.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks, Robin.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> [1]
> https://github.com/apache/iceberg/issues/10745#issuecomment-3074300861
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Wed, 16 Jul 2025 at 04:03, Ajantha Bhat <
> [email protected]> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I have approached Confluent people to help us publish
> the OSS Kafka Connect Iceberg sink plugin.
> >>>>>>>>>>>>>>>>> It seems we have a CVE from a dependency that blocks us
> from publishing the plugin.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Please include the below PR for 1.10.0 release which
> fixes that.
> >>>>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/13561
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> - Ajantha
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Tue, Jul 15, 2025 at 10:48 AM Steven Wu <
> [email protected]> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> > Engines may model operations as deleting/inserting
> rows or as modifications to rows that preserve row ids.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Manu, I agree this sentence probably lacks some
> context. The first half (as deleting/inserting rows) is probably about the
> row lineage handling with equality deletes, which is described in another
> place.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> "Row lineage does not track lineage for rows updated
> via Equality Deletes, because engines using equality deletes avoid reading
> existing data before writing changes and can't provide the original row ID
> for the new rows. These updates are always treated as if the existing row
> was completely removed and a unique new row was added."
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 5:49 PM Manu Zhang <
> [email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Thanks Steven, I missed that part but the following
> sentence is a bit hard to understand (maybe just me)
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Engines may model operations as deleting/inserting
> rows or as modifications to rows that preserve row ids.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Can you please help to explain?
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Tue, Jul 15, 2025 at 4:41 AM Steven Wu <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Manu
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> The spec already covers the row lineage carry over
> (for replace)
> >>>>>>>>>>>>>>>>>>>> https://iceberg.apache.org/spec/#row-lineage
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> "When an existing row is moved to a different data
> file for any reason, writers should write _row_id and
> _last_updated_sequence_number according to the following rules:"
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>>>> Steven
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 1:38 PM Steven Wu <
> [email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> another update on the release.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> We have one open PR left for the 1.10.0 milestone
> (with 25 closed PRs). Amogh is actively working on the last blocker PR.
> >>>>>>>>>>>>>>>>>>>>> Spark 4.0: Preserve row lineage information on
> compaction
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> I will publish a release candidate after the above
> blocker is merged and backported.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>>>>> Steven
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On Mon, Jul 7, 2025 at 11:56 PM Manu Zhang <
> [email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Hi Amogh,
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Is it defined in the table spec that the "replace"
> operation should carry over existing lineage info instead of assigning
> new IDs? If not, we'd better define it in the spec first, because all engines
> and implementations need to follow it.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Tue, Jul 8, 2025 at 11:44 AM Amogh Jahagirdar <
> [email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> One other area I think we need to make sure works
> with row lineage before release is data file compaction. At the moment, it
> looks like compaction will read the records from the data files without
> projecting the lineage fields. What this means is that on write of the new
> compacted data files we'd be losing the lineage information. There's no
> data change in a compaction but we do need to make sure the lineage info
> from carried over records is materialized in the newly compacted files so
> they don't get new IDs or inherit the new file sequence number. I'm working
> on addressing this as well, but I'd call this out as a blocker as well.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>> Robin Moffatt
> >>>>>>>>>>>>>>>> Sr. Principal Advisor, Streaming Data Technologies
>
