Hi Kevin,

Just recapping so that I'm clear, cos I'm getting confused :)

I have two related PRs:

#15124: Add Flink Quickstart docker image
#15062: Add Flink quickstart (which includes the Dockerfile too)

I can see a few routes forward:

1. Merge #15062, fast-follow with #15124 once we're happy with the publish script (I've not seen anything raised about it yet tho?)
2. Merge #15124 minus publish script, and then #15062 still relying on local image build (not sure what this would achieve vs the option above tho?)
3. Merge #15124 including publish script, then #15062 using the published image not the local build

Either way, one thing that needs resolving is the Dockerfile location: flink/quickstart (#15062) vs docker/iceberg-flink-quickstart (#15124).

LMK if I've missed an angle here.

thanks, Robin

On Wed, 28 Jan 2026 at 15:57, Kevin Liu <[email protected]> wrote:

> Thanks for working on this, Robin! It looks like the complexity here is
> publishing the docker image. What do you think about isolating that part?
> (Just move the publish script out of #15124) We can start
> with the Dockerfile definition, which allows us to build locally. This
> should unblock us from merging the getting started docs in #15062
> Thoughts?
>
> Best,
> Kevin Liu
>
> On Wed, Jan 28, 2026 at 5:57 AM Robin Moffatt via dev <
> [email protected]> wrote:
>
>> Hi,
>>
>> Thanks for the discussion and input.
>> It sounds like there are no major blockers. Could someone please review
>> https://github.com/apache/iceberg/pull/15124 ?
>>
>> thanks,
>>
>> Robin.
>>
>> On Mon, 26 Jan 2026 at 16:36, Kevin Liu <[email protected]> wrote:
>>
>>> Hey folks,
>>>
>>> We have a Dockerfile defined in pyiceberg [1] that uses the Spark base
>>> image and installs all the necessary jars. This is used for our integration
>>> test setup [2] and is inspired by databricks/docker-spark-iceberg [3].
>>> We've made many improvements such as upgrading to Spark 4, supporting Spark
>>> Connect, and better image build caching.
>>>
>>> This is already self-contained and can be reused by other subprojects.
>>> In fact, iceberg-rust already uses it [4] and I try to keep them in sync.
>>> I think it would be beneficial for the project to publish this image and
>>> something similar for Flink.
>>>
>>> Let me know what you think.
>>>
>>> Best,
>>> Kevin Liu
>>>
>>> [1] https://github.com/apache/iceberg-python/blob/6de6d6acad440885788fb1a24c04ed647b92af0e/dev/spark/Dockerfile
>>> [2] https://github.com/apache/iceberg-python/blob/6de6d6acad440885788fb1a24c04ed647b92af0e/dev/docker-compose-integration.yml#L20-L21
>>> [3] https://github.com/databricks/docker-spark-iceberg/blob/cf617dc29e8672792e76b9bcf6017af52f570020/spark/Dockerfile
>>> [4] https://github.com/apache/iceberg-rust/blob/330f21da894948fc10b57d541cb2d6f32c8bdbb8/crates/integration_tests/testdata/spark/Dockerfile
>>>
>>> On Mon, Jan 26, 2026 at 10:27 AM Steven Wu <[email protected]> wrote:
>>>
>>>> > Since the integration code for both Spark and Flink lives in our
>>>> > repository, it might make sense to also store the Docker images and the
>>>> > corresponding scripts there.
>>>>
>>>> I agree with Peter here.
>>>>
>>>> The previous thread has some concerns if the Iceberg project should
>>>> host those docker images. Not sure if the opinions have changed.
>>>>
>>>> On Mon, Jan 26, 2026 at 2:43 AM Robin Moffatt via dev <
>>>> [email protected]> wrote:
>>>>
>>>>> Thanks Ajantha, I'd not seen that thread.
>>>>> Having looked at it, am I understanding the view to be that ideally
>>>>> Flink would publish a Docker image that included the Iceberg dependencies?
>>>>>
>>>>> However we do this, I feel that the user coming to run the Flink
>>>>> quickstart should not have to build their own Docker image; this adds
>>>>> unnecessary friction that is easily alleviated.
>>>>>
>>>>> If I've understood the situation correctly, then I'm happy to discuss
>>>>> this idea with the Flink community; please let me know before I do so.
>>>>>
>>>>> thanks, Robin.
>>>>>
>>>>> On Fri, 23 Jan 2026 at 16:50, Ajantha Bhat <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi Robin and Peter,
>>>>>>
>>>>>> I discussed community-maintained Docker images previously:
>>>>>> https://lists.apache.org/thread/xl1cwq7vmnh6zgfd2vck2nq7dfd33ncq
>>>>>>
>>>>>> The consensus was to publish only the REST fixture Docker image
>>>>>> <https://hub.docker.com/r/apache/iceberg-rest-fixture> (now at 100K+
>>>>>> total downloads) and use Docker images published by the main engines in
>>>>>> the quickstart, instead of maintaining these images ourselves.
>>>>>> See the thread above for more details.
>>>>>>
>>>>>> With respect to adding a Flink quickstart page, I’m in favor of
>>>>>> adding it and relying on the Docker images provided by Flink rather than
>>>>>> maintaining our own images.
>>>>>> - Ajantha
>>>>>>
>>>>>> On Fri, Jan 23, 2026 at 9:43 PM Péter Váry <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi Robin,
>>>>>>> It would be nice to separate them. I expect that we will have some
>>>>>>> extra stuff to do with the docker image. For example make sure that we
>>>>>>> have ci in place to build it.
>>>>>>> Thanks,
>>>>>>> Peter
>>>>>>>
>>>>>>> On Fri, Jan 23, 2026, 16:55 Robin Moffatt via dev <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Thanks for the positive reception of this idea.
>>>>>>>> I've drafted a PR [1] and would appreciate input :)
>>>>>>>>
>>>>>>>> Also, should I keep this and the quickstart PR [2] as separate PRs,
>>>>>>>> or combine them?
>>>>>>>>
>>>>>>>> thanks, Robin.
>>>>>>>>
>>>>>>>> [1] https://github.com/apache/iceberg/pull/15124
>>>>>>>> [2] https://github.com/apache/iceberg/pull/15062
>>>>>>>>
>>>>>>>> On Fri, 23 Jan 2026 at 13:58, Jean-Baptiste Onofré <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> This is a great idea.
>>>>>>>>>
>>>>>>>>> If we are moving forward with an "official" Docker image published
>>>>>>>>> by the project, we must ensure it is fully compliant with ASF
>>>>>>>>> requirements regarding LICENSE/NOTICE files, etc. While this may seem
>>>>>>>>> straightforward, it is a detail that is often overlooked.
>>>>>>>>>
>>>>>>>>> I would be happy to help with this process.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> JB
>>>>>>>>>
>>>>>>>>> On Fri, Jan 23, 2026 at 1:52 PM Maximilian Michels <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hey Robin,
>>>>>>>>>>
>>>>>>>>>> +1 That's a great idea. It's often a bit painful for new users to
>>>>>>>>>> get all the dependencies in the right place.
>>>>>>>>>>
>>>>>>>>>> +1 for building upon the official Flink Docker images:
>>>>>>>>>> https://hub.docker.com/r/apache/flink
>>>>>>>>>>
>>>>>>>>>> -Max
>>>>>>>>>>
>>>>>>>>>> On Fri, Jan 23, 2026 at 12:27 PM Péter Váry <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>> >
>>>>>>>>>> > Hi Robin,
>>>>>>>>>> >
>>>>>>>>>> > I would love to see the Flink quickstart image in the Iceberg
>>>>>>>>>> > repo.
>>>>>>>>>> >
>>>>>>>>>> > Ajantha was working on the Spark side:
>>>>>>>>>> > https://github.com/apache/iceberg/issues/13519
>>>>>>>>>> > The conclusion was:
>>>>>>>>>> >>
>>>>>>>>>> >> we should both remove the vendor reference and bring this back
>>>>>>>>>> >> up to date. My preference would be to rely on the Spark image
>>>>>>>>>> >> <https://hub.docker.com/r/apache/spark> provided by the Apache
>>>>>>>>>> >> Spark project, similar to what we do for the Hive
>>>>>>>>>> >> <https://iceberg.apache.org/hive-quickstart/> quickstart. We
>>>>>>>>>> >> should be able to load all the Iceberg-specific JARs through the
>>>>>>>>>> >> spark.jars.packages configuration
>>>>>>>>>> >> <https://spark.apache.org/docs/3.5.1/configuration.html>.
>>>>>>>>>> >
>>>>>>>>>> > Ajantha also added the link to the old dev list thread:
>>>>>>>>>> > https://lists.apache.org/thread/4kknk8mvnffbmhdt63z8t4ps0mt1jbf4
>>>>>>>>>> >
>>>>>>>>>> > Thanks for working on this,
>>>>>>>>>> > Peter
>>>>>>>>>> >
>>>>>>>>>> > On Thu, 22 Jan 2026 at 19:23, Robin Moffatt via dev
>>>>>>>>>> > <[email protected]> wrote:
>>>>>>>>>> >>
>>>>>>>>>> >> Hi,
>>>>>>>>>> >>
>>>>>>>>>> >> Following discussion on the Flink quickstart PR [1], what do
>>>>>>>>>> >> people think about adding an official quickstart Docker image for
>>>>>>>>>> >> Flink to the project?
>>>>>>>>>> >> At the moment the Spark quickstart uses
>>>>>>>>>> >> tabulario/spark-iceberg so perhaps that could be brought into the
>>>>>>>>>> >> project too.
>>>>>>>>>> >>
>>>>>>>>>> >> thanks, Robin.
>>>>>>>>>> >>
>>>>>>>>>> >> 1: https://github.com/apache/iceberg/pull/15062
