Currently, the 1.10.0 milestone has no open PRs: https://github.com/apache/iceberg/milestone/54
The variant PR was merged over the past two weeks. There are still some variant-testing-related PRs, which are probably not blockers for the 1.10.0 release:
* Spark variant read: https://github.com/apache/iceberg/pull/13219
* use short strings: https://github.com/apache/iceberg/pull/13284

We are still waiting on the following two changes:
* Anton's fix for the data frame join using the same snapshot, which will introduce a slight behavior change in Spark 4.0.
* Unknown type support. Fokko raised a discussion thread on a blocking issue.

Did I miss anything else?

On Sat, Jul 26, 2025 at 5:52 AM Fokko Driesprong <fo...@apache.org> wrote: > Hey all, > > The read path for the UnknownType needs some community discussion. I've > raised a separate thread > <https://lists.apache.org/thread/gq9lyndb574ptq7vkz83zgkp1lx7vp5x>. PTAL > > Kind regards from Belgium, > Fokko > > On Sat, Jul 26, 2025 at 00:58 Ryan Blue <rdb...@gmail.com> wrote: > >> I thought that we said we wanted to get support out for v3 features in >> this release unless there is some reasonable blocker, like Spark not having >> geospatial types. To me, I think that means we should aim to get variant >> and unknown done so that we have a complete implementation with a major >> engine. And it should not be particularly difficult to get unknown done so >> I'd opt to get it in. >> >> On Fri, Jul 25, 2025 at 11:28 AM Steven Wu <stevenz...@gmail.com> wrote: >> >>> > I believe we also wanted to get in at least the read path for >>> UnknownType. Fokko has a WIP PR >>> <https://github.com/apache/iceberg/pull/13445> for that. >>> I thought in the community sync the consensus is that this is not a >>> blocker, because it is a new feature implementation. If it is ready, it >>> will be included. >>> >>> On Fri, Jul 25, 2025 at 9:43 AM Kevin Liu <kevinjq...@apache.org> wrote: >>> >>>> I think Fokko's OOO. Should we help with that PR? 
>>>> >>>> On Fri, Jul 25, 2025 at 9:38 AM Eduard Tudenhöfner < >>>> etudenhoef...@apache.org> wrote: >>>> >>>>> I believe we also wanted to get in at least the read path for >>>>> UnknownType. Fokko has a WIP PR >>>>> <https://github.com/apache/iceberg/pull/13445> for that. >>>>> >>>>> On Fri, Jul 25, 2025 at 6:13 PM Steven Wu <stevenz...@gmail.com> >>>>> wrote: >>>>> >>>>>> 3. Spark: fix data frame join based on different versions of the same >>>>>> table that may lead to weird results. Anton is working on a fix. It >>>>>> requires a small behavior change (table state may be stale up to refresh >>>>>> interval). Hence it is better to include it in the 1.10.0 release where >>>>>> Spark 4.0 is first supported. >>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks this is very >>>>>> close and will prioritize the review. >>>>>> >>>>>> We still have the above two issues pending. 3 doesn't have a PR yet. >>>>>> PR for 4 is not associated with the milestone yet. >>>>>> >>>>>> On Fri, Jul 25, 2025 at 9:02 AM Kevin Liu <kevinjq...@apache.org> >>>>>> wrote: >>>>>> >>>>>>> Thanks everyone for the review. The 2 PRs are both merged. >>>>>>> Looks like there's only 1 PR left in the 1.10 milestone >>>>>>> <https://github.com/apache/iceberg/milestone/54> :) >>>>>>> >>>>>>> Best, >>>>>>> Kevin Liu >>>>>>> >>>>>>> On Thu, Jul 24, 2025 at 7:44 PM Manu Zhang <owenzhang1...@gmail.com> >>>>>>> wrote: >>>>>>> >>>>>>>> Thanks Kevin. The first change is not in the versioned doc so it >>>>>>>> can be released anytime. >>>>>>>> >>>>>>>> Regards, >>>>>>>> Manu >>>>>>>> >>>>>>>> On Fri, Jul 25, 2025 at 4:21 AM Kevin Liu <kevinjq...@apache.org> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> The 3 PRs above are merged. Thanks everyone for the review. >>>>>>>>> >>>>>>>>> I've added 2 more PRs to the 1.10 milestone. These are both >>>>>>>>> nice-to-haves. 
>>>>>>>>> - docs: add subpage for REST Catalog Spec in "Specification" >>>>>>>>> #13521 <https://github.com/apache/iceberg/pull/13521> >>>>>>>>> - REST-Fixture: Ensure strict mode on jdbc catalog for rest >>>>>>>>> fixture #13599 <https://github.com/apache/iceberg/pull/13599> >>>>>>>>> >>>>>>>>> The first one changes the link for "REST Catalog Spec" on the left >>>>>>>>> nav of https://iceberg.apache.org/spec/ from the swagger.io link >>>>>>>>> to a dedicated page for IRC. >>>>>>>>> The second one fixes the default behavior of >>>>>>>>> `iceberg-rest-fixture` image to align with the general expectation >>>>>>>>> when >>>>>>>>> creating a table in a catalog. >>>>>>>>> >>>>>>>>> Please take a look. I would like to have both of these as part of >>>>>>>>> the 1.10 release. >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Kevin Liu >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, Jul 23, 2025 at 1:31 PM Kevin Liu <kevinjq...@apache.org> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Here are the 3 PRs to add corresponding tests. >>>>>>>>>> https://github.com/apache/iceberg/pull/13648 >>>>>>>>>> https://github.com/apache/iceberg/pull/13649 >>>>>>>>>> https://github.com/apache/iceberg/pull/13650 >>>>>>>>>> >>>>>>>>>> I've tagged them with the 1.10 milestone, waiting for CI to >>>>>>>>>> complete :) >>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> Kevin Liu >>>>>>>>>> >>>>>>>>>> On Wed, Jul 23, 2025 at 1:08 PM Steven Wu <stevenz...@gmail.com> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Kevin, thanks for checking that. I will take a look at your >>>>>>>>>>> backport PRs. Can you add them to the 1.10.0 milestone? >>>>>>>>>>> >>>>>>>>>>> On Wed, Jul 23, 2025 at 12:27 PM Kevin Liu < >>>>>>>>>>> kevinjq...@apache.org> wrote: >>>>>>>>>>> >>>>>>>>>>>> Thanks again for driving this Steven! We're very close!! >>>>>>>>>>>> >>>>>>>>>>>> As mentioned in the community sync today, I wanted to verify >>>>>>>>>>>> feature parity between Spark 3.5 and Spark 4.0 for this release. 
>>>>>>>>>>>> I was able to verify that Spark 3.5 and Spark 4.0 have feature >>>>>>>>>>>> parity for this upcoming release. More details in the other >>>>>>>>>>>> devlist thread >>>>>>>>>>>> https://lists.apache.org/thread/7x7xcm3y87y81c4grq4nn9gdjd4jm05f >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Kevin Liu >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Jul 23, 2025 at 12:17 PM Steven Wu < >>>>>>>>>>>> stevenz...@gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Another update on the release. >>>>>>>>>>>>> >>>>>>>>>>>>> The existing blocker PRs are almost done. >>>>>>>>>>>>> >>>>>>>>>>>>> During today's community sync, we identified the following >>>>>>>>>>>>> issues/PRs to be included in the 1.10.0 release. >>>>>>>>>>>>> >>>>>>>>>>>>> 1. backport of PR 13100 to the main branch. I have created >>>>>>>>>>>>> a cherry-pick PR >>>>>>>>>>>>> <https://github.com/apache/iceberg/pull/13647> for that. >>>>>>>>>>>>> There is a one line difference compared to the original PR due >>>>>>>>>>>>> to the >>>>>>>>>>>>> removal of the deprecated RemoveSnapshot class in main branch >>>>>>>>>>>>> for 1.10.0 >>>>>>>>>>>>> target. Amogh has suggested using RemoveSnapshots with a >>>>>>>>>>>>> single snapshot >>>>>>>>>>>>> id, which should be supported by all REST catalog servers. >>>>>>>>>>>>> 2. Flink compaction doesn't support row lineage. Fail the >>>>>>>>>>>>> compaction for V3 tables. I created a PR >>>>>>>>>>>>> <https://github.com/apache/iceberg/pull/13646> for that. >>>>>>>>>>>>> Will backport after it is merged. >>>>>>>>>>>>> 3. Spark: fix data frame join based on different versions >>>>>>>>>>>>> of the same table that may lead to weird results. Anton is >>>>>>>>>>>>> working on a >>>>>>>>>>>>> fix. It requires a small behavior change (table state may be >>>>>>>>>>>>> stale up to >>>>>>>>>>>>> refresh interval). Hence it is better to include it in the >>>>>>>>>>>>> 1.10.0 release >>>>>>>>>>>>> where Spark 4.0 is first supported. >>>>>>>>>>>>> 4. Variant support in core and Spark 4.0. 
Ryan thinks this >>>>>>>>>>>>> is very close and will prioritize the review. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> steven >>>>>>>>>>>>> >>>>>>>>>>>>> The 1.10.0 milestone can be found here. >>>>>>>>>>>>> https://github.com/apache/iceberg/milestone/54 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Wed, Jul 16, 2025 at 9:15 AM Steven Wu < >>>>>>>>>>>>> stevenz...@gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Ajantha/Robin, thanks for the note. We can include the PR in >>>>>>>>>>>>>> the 1.10.0 milestone. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 3:20 AM Robin Moffatt >>>>>>>>>>>>>> <ro...@confluent.io.invalid> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks Ajantha. Just to confirm, from a Confluent point of >>>>>>>>>>>>>>> view, we will not be able to publish the connector on Confluent >>>>>>>>>>>>>>> Hub until >>>>>>>>>>>>>>> this CVE[1] is fixed. >>>>>>>>>>>>>>> Since we would not publish a snapshot build, if the fix >>>>>>>>>>>>>>> doesn't make it into 1.10 then we'd have to wait for 1.11 (or a >>>>>>>>>>>>>>> dot release >>>>>>>>>>>>>>> of 1.10) to be able to include the connector on Confluent Hub. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, Robin. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>> https://github.com/apache/iceberg/issues/10745#issuecomment-3074300861 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Wed, 16 Jul 2025 at 04:03, Ajantha Bhat < >>>>>>>>>>>>>>> ajanthab...@gmail.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I have approached Confluent people >>>>>>>>>>>>>>>> <https://github.com/apache/iceberg/issues/10745#issuecomment-3058281281> >>>>>>>>>>>>>>>> to help us publish the OSS Kafka Connect Iceberg sink plugin. >>>>>>>>>>>>>>>> It seems we have a CVE from dependency that blocks us from >>>>>>>>>>>>>>>> publishing the plugin. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Please include the below PR for 1.10.0 release which fixes >>>>>>>>>>>>>>>> that. 
>>>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/13561 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - Ajantha >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Tue, Jul 15, 2025 at 10:48 AM Steven Wu < >>>>>>>>>>>>>>>> stevenz...@gmail.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> > Engines may model operations as deleting/inserting rows >>>>>>>>>>>>>>>>> or as modifications to rows that preserve row ids. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Manu, I agree this sentence probably lacks some context. >>>>>>>>>>>>>>>>> The first half (as deleting/inserting rows) is probably >>>>>>>>>>>>>>>>> about the row lineage handling with equality deletes, which >>>>>>>>>>>>>>>>> is described in >>>>>>>>>>>>>>>>> another place. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> "Row lineage does not track lineage for rows updated via >>>>>>>>>>>>>>>>> Equality >>>>>>>>>>>>>>>>> Deletes >>>>>>>>>>>>>>>>> <https://iceberg.apache.org/spec/#equality-delete-files>, >>>>>>>>>>>>>>>>> because engines using equality deletes avoid reading existing >>>>>>>>>>>>>>>>> data before >>>>>>>>>>>>>>>>> writing changes and can't provide the original row ID for the >>>>>>>>>>>>>>>>> new rows. >>>>>>>>>>>>>>>>> These updates are always treated as if the existing row was >>>>>>>>>>>>>>>>> completely >>>>>>>>>>>>>>>>> removed and a unique new row was added." >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 5:49 PM Manu Zhang < >>>>>>>>>>>>>>>>> owenzhang1...@gmail.com> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks Steven, I missed that part but the following >>>>>>>>>>>>>>>>>> sentence is a bit hard to understand (maybe just me) >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Engines may model operations as deleting/inserting rows >>>>>>>>>>>>>>>>>> or as modifications to rows that preserve row ids. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Can you please help to explain? 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Steven Wu <stevenz...@gmail.com>于2025年7月15日 周二04:41写道: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Manu >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The spec already covers the row lineage carry over (for >>>>>>>>>>>>>>>>>>> replace) >>>>>>>>>>>>>>>>>>> https://iceberg.apache.org/spec/#row-lineage >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> "When an existing row is moved to a different data file >>>>>>>>>>>>>>>>>>> for any reason, writers should write _row_id and >>>>>>>>>>>>>>>>>>> _last_updated_sequence_number according to the >>>>>>>>>>>>>>>>>>> following rules:" >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> Steven >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 1:38 PM Steven Wu < >>>>>>>>>>>>>>>>>>> stevenz...@gmail.com> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> another update on the release. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> We have one open PR left for the 1.10.0 milestone >>>>>>>>>>>>>>>>>>>> <https://github.com/apache/iceberg/milestone/54> (with >>>>>>>>>>>>>>>>>>>> 25 closed PRs). Amogh is actively working on the last >>>>>>>>>>>>>>>>>>>> blocker PR. >>>>>>>>>>>>>>>>>>>> Spark 4.0: Preserve row lineage information on >>>>>>>>>>>>>>>>>>>> compaction >>>>>>>>>>>>>>>>>>>> <https://github.com/apache/iceberg/pull/13555> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I will publish a release candidate after the above >>>>>>>>>>>>>>>>>>>> blocker is merged and backported. 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> Steven >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Mon, Jul 7, 2025 at 11:56 PM Manu Zhang < >>>>>>>>>>>>>>>>>>>> owenzhang1...@gmail.com> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Hi Amogh, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Is it defined in the table spec that "replace" >>>>>>>>>>>>>>>>>>>>> operation should carry over existing lineage info >>>>>>>>>>>>>>>>>>>>> insteading of assigning >>>>>>>>>>>>>>>>>>>>> new IDs? If not, we'd better firstly define it in spec >>>>>>>>>>>>>>>>>>>>> because all engines >>>>>>>>>>>>>>>>>>>>> and implementations need to follow it. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Tue, Jul 8, 2025 at 11:44 AM Amogh Jahagirdar < >>>>>>>>>>>>>>>>>>>>> 2am...@gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> One other area I think we need to make sure works >>>>>>>>>>>>>>>>>>>>>> with row lineage before release is data file compaction. >>>>>>>>>>>>>>>>>>>>>> At >>>>>>>>>>>>>>>>>>>>>> the moment, >>>>>>>>>>>>>>>>>>>>>> <https://github.com/apache/iceberg/blob/main/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/SparkBinPackFileRewriteRunner.java#L44> >>>>>>>>>>>>>>>>>>>>>> it >>>>>>>>>>>>>>>>>>>>>> looks like compaction will read the records from the >>>>>>>>>>>>>>>>>>>>>> data files without >>>>>>>>>>>>>>>>>>>>>> projecting the lineage fields. What this means is that >>>>>>>>>>>>>>>>>>>>>> on write of the new >>>>>>>>>>>>>>>>>>>>>> compacted data files we'd be losing the lineage >>>>>>>>>>>>>>>>>>>>>> information. There's no >>>>>>>>>>>>>>>>>>>>>> data change in a compaction but we do need to make sure >>>>>>>>>>>>>>>>>>>>>> the lineage info >>>>>>>>>>>>>>>>>>>>>> from carried over records is materialized in the newly >>>>>>>>>>>>>>>>>>>>>> compacted files so >>>>>>>>>>>>>>>>>>>>>> they don't get new IDs or inherit the new file sequence >>>>>>>>>>>>>>>>>>>>>> number. 
I'm working >>>>>>>>>>>>>>>>>>>>>> on addressing this as well, but I'd call this out as a >>>>>>>>>>>>>>>>>>>>>> blocker as well. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> *Robin Moffatt* >>>>>>>>>>>>>>> *Sr. Principal Advisor, Streaming Data Technologies* >>>>>>>>>>>>>>> >>>>>>>>>>>>>>
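
[Editor's note] The row-lineage carry-over rule quoted in the thread above can be sketched as follows. This is an illustrative model only, not Iceberg's actual compaction code: the field names `_row_id` and `_last_updated_sequence_number` and the inheritance rules (a null `_row_id` inherits the source file's `first_row_id` plus the row's position; a null `_last_updated_sequence_number` inherits the source file's data sequence number) come from the Iceberg v3 spec's row-lineage section, while the helper function, dict shapes, and key names are invented for this sketch.

```python
# Illustrative sketch (not Iceberg's implementation) of the v3 row-lineage
# rule: when a row is moved to a new data file (e.g. by compaction), its
# lineage fields must be materialized and carried over, not re-assigned,
# so the row does not get a new ID or inherit the new file's sequence number.

def carry_over_lineage(row, source_file):
    """Return the lineage fields a writer should persist for a row moved
    out of source_file into a newly written data file.

    row: dict with 'pos' (position in the source file) and optional
         '_row_id' / '_last_updated_sequence_number' values.
    source_file: dict with 'first_row_id' and 'data_sequence_number'.
    """
    # A null _row_id is inherited as first_row_id + position; once
    # materialized it must never change for the lifetime of the row.
    row_id = row.get("_row_id")
    if row_id is None:
        row_id = source_file["first_row_id"] + row["pos"]

    # A null _last_updated_sequence_number inherits the source data file's
    # sequence number. Compaction is not a data change, so the value is
    # carried over rather than bumped to the new file's sequence number.
    last_updated = row.get("_last_updated_sequence_number")
    if last_updated is None:
        last_updated = source_file["data_sequence_number"]

    return {"_row_id": row_id, "_last_updated_sequence_number": last_updated}
```

This models why the compaction fix matters: a rewrite that reads rows without projecting these two fields cannot apply the carry-over rule, and the rewritten rows would silently be assigned fresh lineage.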