Hi, I have perhaps managed to deadlock this process :) I'd appreciate some help untangling it. The recap is in my previous email (below).
thanks, Robin.

On Thu, 29 Jan 2026 at 06:20, Robin Moffatt <[email protected]> wrote:

> Hi Kevin,
>
> Just recapping so that I'm clear, cos I'm getting confused :)
> I have two related PRs:
>
> #15124: Add Flink Quickstart docker image
> #15062: Add Flink quickstart (which includes the Dockerfile too)
>
> I can see a few routes forward:
>
> 1. Merge #15062, fast-follow with #15124 once we're happy with the publish
> script (I've not seen anything raised about it yet tho?)
> 2. Merge #15124 minus publish script, and then #15062 still relying on
> local image build (not sure what this would achieve vs the option above
> tho?)
> 3. Merge #15124 including publish script, then #15062 using the published
> image not the local build
>
> Either way, one thing that needs resolving is the Dockerfile location:
> flink/quickstart (#15062) vs docker/iceberg-flink-quickstart (#15124).
>
> LMK if I've missed an angle here.
>
> thanks, Robin
>
> On Wed, 28 Jan 2026 at 15:57, Kevin Liu <[email protected]> wrote:
>
>> Thanks for working on this, Robin! It looks like the complexity here is
>> publishing the docker image. What do you think about isolating that part?
>> (Just move the publish script out of #15124) We can start
>> with the Dockerfile definition, which allows us to build locally. This
>> should unblock us from merging the getting started docs in #15062
>> Thoughts?
>>
>> Best,
>> Kevin Liu
>>
>> On Wed, Jan 28, 2026 at 5:57 AM Robin Moffatt via dev <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> Thanks for the discussion and input.
>>> It sounds like there are no major blockers. Could someone please review
>>> https://github.com/apache/iceberg/pull/15124 ?
>>>
>>> thanks,
>>>
>>> Robin.
>>>
>>> On Mon, 26 Jan 2026 at 16:36, Kevin Liu <[email protected]> wrote:
>>>
>>>> Hey folks,
>>>>
>>>> We have a Dockerfile defined in pyiceberg [1] that uses the Spark base
>>>> image and installs all the necessary jars.
>>>> This is used for our integration
>>>> test setup [2] and is inspired by databricks/docker-spark-iceberg [3].
>>>> We've made many improvements such as upgrading to Spark 4, supporting Spark
>>>> Connect, and better image build caching.
>>>>
>>>> This is already self-contained and can be reused by other subprojects.
>>>> In fact, iceberg-rust already uses it [4] and I try to keep them in sync.
>>>> I think it would be beneficial for the project to publish this image
>>>> and something similar for Flink.
>>>>
>>>> Let me know what you think.
>>>>
>>>> Best,
>>>> Kevin Liu
>>>>
>>>> [1] https://github.com/apache/iceberg-python/blob/6de6d6acad440885788fb1a24c04ed647b92af0e/dev/spark/Dockerfile
>>>> [2] https://github.com/apache/iceberg-python/blob/6de6d6acad440885788fb1a24c04ed647b92af0e/dev/docker-compose-integration.yml#L20-L21
>>>> [3] https://github.com/databricks/docker-spark-iceberg/blob/cf617dc29e8672792e76b9bcf6017af52f570020/spark/Dockerfile
>>>> [4] https://github.com/apache/iceberg-rust/blob/330f21da894948fc10b57d541cb2d6f32c8bdbb8/crates/integration_tests/testdata/spark/Dockerfile
>>>>
>>>> On Mon, Jan 26, 2026 at 10:27 AM Steven Wu <[email protected]> wrote:
>>>>
>>>>> > Since the integration code for both Spark and Flink lives in our
>>>>> > repository, it might make sense to also store the Docker images and the
>>>>> > corresponding scripts there.
>>>>>
>>>>> I agree with Peter here.
>>>>>
>>>>> The previous thread has some concerns if the Iceberg project should
>>>>> host those docker images. Not sure if the opinions have changed.
>>>>>
>>>>> On Mon, Jan 26, 2026 at 2:43 AM Robin Moffatt via dev <[email protected]> wrote:
>>>>>
>>>>>> Thanks Ajantha, I'd not seen that thread.
>>>>>> Having looked at it, am I understanding the view to be that ideally
>>>>>> Flink would publish a Docker image that included the Iceberg
>>>>>> dependencies?
>>>>>>
>>>>>> However we do this, I feel that the user coming to run the Flink
>>>>>> quickstart should not have to build their own Docker image; this adds
>>>>>> unnecessary friction that is easily alleviated.
>>>>>>
>>>>>> If I've understood the situation correctly, then I'm happy to discuss
>>>>>> this idea with the Flink community; please let me know before I do so.
>>>>>>
>>>>>> thanks, Robin.
>>>>>>
>>>>>> On Fri, 23 Jan 2026 at 16:50, Ajantha Bhat <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Robin and Peter,
>>>>>>>
>>>>>>> I discussed community-maintained Docker images previously:
>>>>>>> https://lists.apache.org/thread/xl1cwq7vmnh6zgfd2vck2nq7dfd33ncq
>>>>>>>
>>>>>>> The consensus was to publish only the REST fixture Docker image
>>>>>>> <https://hub.docker.com/r/apache/iceberg-rest-fixture> (now at
>>>>>>> 100K+ total downloads) and use Docker images published by the main
>>>>>>> engines in the quickstart, instead of maintaining these images ourselves.
>>>>>>> See the thread above for more details.
>>>>>>>
>>>>>>> With respect to adding a Flink quickstart page, I’m in favor of
>>>>>>> adding it and relying on the Docker images provided by Flink rather than
>>>>>>> maintaining our own images.
>>>>>>> - Ajantha
>>>>>>>
>>>>>>> On Fri, Jan 23, 2026 at 9:43 PM Péter Váry <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Robin,
>>>>>>>> It would be nice to separate them. I expect that we will have some
>>>>>>>> extra stuff to do with the docker image. For example make sure that we
>>>>>>>> have ci in place to build it.
>>>>>>>> Thanks,
>>>>>>>> Peter
>>>>>>>>
>>>>>>>> On Fri, Jan 23, 2026, 16:55 Robin Moffatt via dev <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Thanks for the positive reception of this idea.
>>>>>>>>> I've drafted a PR [1] and would appreciate input :)
>>>>>>>>>
>>>>>>>>> Also, should I keep this and the quickstart PR [2] as separate
>>>>>>>>> PRs, or combine them?
>>>>>>>>>
>>>>>>>>> thanks, Robin.
>>>>>>>>>
>>>>>>>>> [1] https://github.com/apache/iceberg/pull/15124
>>>>>>>>> [2] https://github.com/apache/iceberg/pull/15062
>>>>>>>>>
>>>>>>>>> On Fri, 23 Jan 2026 at 13:58, Jean-Baptiste Onofré <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> This is a great idea.
>>>>>>>>>>
>>>>>>>>>> If we are moving forward with an "official" Docker image
>>>>>>>>>> published by the project, we must ensure it is fully compliant with
>>>>>>>>>> ASF requirements regarding LICENSE/NOTICE files, etc. While this may seem
>>>>>>>>>> straightforward, it is a detail that is often overlooked.
>>>>>>>>>>
>>>>>>>>>> I would be happy to help with this process.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> JB
>>>>>>>>>>
>>>>>>>>>> On Fri, Jan 23, 2026 at 1:52 PM Maximilian Michels <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hey Robin,
>>>>>>>>>>>
>>>>>>>>>>> +1 That's a great idea. It's often a bit painful for new users
>>>>>>>>>>> to get all the dependencies in the right place.
>>>>>>>>>>>
>>>>>>>>>>> +1 for building upon the official Flink Docker images:
>>>>>>>>>>> https://hub.docker.com/r/apache/flink
>>>>>>>>>>>
>>>>>>>>>>> -Max
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jan 23, 2026 at 12:27 PM Péter Váry <[email protected]> wrote:
>>>>>>>>>>> >
>>>>>>>>>>> > Hi Robin,
>>>>>>>>>>> >
>>>>>>>>>>> > I would love to see the Flink quickstart image in the Iceberg repo.
>>>>>>>>>>> >
>>>>>>>>>>> > Ajantha was working on the Spark side: https://github.com/apache/iceberg/issues/13519
>>>>>>>>>>> > The conclusion was:
>>>>>>>>>>> >>
>>>>>>>>>>> >> we should both remove the vendor reference and bring this back up to date.
>>>>>>>>>>> >> My preference would be to rely on the Spark image
>>>>>>>>>>> >> <https://hub.docker.com/r/apache/spark> provided by the Apache
>>>>>>>>>>> >> Spark project, similar to what we do for the Hive
>>>>>>>>>>> >> <https://iceberg.apache.org/hive-quickstart/> quickstart. We
>>>>>>>>>>> >> should be able to load all the Iceberg-specific JARs through the
>>>>>>>>>>> >> spark.jars.packages configuration
>>>>>>>>>>> >> <https://spark.apache.org/docs/3.5.1/configuration.html>.
>>>>>>>>>>> >
>>>>>>>>>>> > Ajantha also added the link to the old dev list thread:
>>>>>>>>>>> > https://lists.apache.org/thread/4kknk8mvnffbmhdt63z8t4ps0mt1jbf4
>>>>>>>>>>> >
>>>>>>>>>>> > Thanks for working on this,
>>>>>>>>>>> > Peter
>>>>>>>>>>> >
>>>>>>>>>>> > Robin Moffatt via dev <[email protected]> wrote (on Thu, 22 Jan 2026, 19:23):
>>>>>>>>>>> >>
>>>>>>>>>>> >> Hi,
>>>>>>>>>>> >>
>>>>>>>>>>> >> Following discussion on the Flink quickstart PR [1], what do
>>>>>>>>>>> >> people think about adding an official quickstart Docker image for
>>>>>>>>>>> >> Flink to the project?
>>>>>>>>>>> >> At the moment the Spark quickstart uses
>>>>>>>>>>> >> tabulario/spark-iceberg so perhaps that could be brought into the
>>>>>>>>>>> >> project too.
>>>>>>>>>>> >>
>>>>>>>>>>> >> thanks, Robin.
>>>>>>>>>>> >>
>>>>>>>>>>> >> 1: https://github.com/apache/iceberg/pull/15062
