Hi Peter,

Thanks for the direction. I'll remove the publish step so that we can get
the quickstart published, and then work on publishing the image afterwards.

Do you think the Dockerfile is best kept in flink/quickstart, or
docker/iceberg-flink-quickstart?

thanks, Robin

On Thu, 5 Feb 2026 at 16:10, Péter Váry <[email protected]> wrote:

> I think we have two options:
>
>    1. Remove the image publication from this PR
>    (https://github.com/apache/iceberg/pull/15124) for now, and proceed
>    with adding the Docker image and updating the documentation.
>    2. Alternatively, we could discuss publishing the Flink quickstart
>    image at the next Iceberg Community Sync and use that as an opportunity to
>    simplify both the documentation and the overall user experience.
>
>
>
>
> On Wed, 4 Feb 2026 at 18:52, Robin Moffatt via dev <[email protected]> wrote:
>
>> Hi,
>>
>> I have perhaps managed to deadlock this process :) I'd appreciate some
>> help untangling it. The recap is in my previous email (below).
>>
>> thanks, Robin.
>>
>> On Thu, 29 Jan 2026 at 06:20, Robin Moffatt <[email protected]> wrote:
>>
>>> Hi Kevin,
>>>
>>> Just recapping so that I'm clear, cos I'm getting confused :)
>>> I have two related PRs:
>>>
>>> #15124: Add Flink Quickstart docker image
>>> #15062: Add Flink quickstart (which includes the Dockerfile too)
>>>
>>> I can see a few routes forward:
>>>
>>> 1. Merge #15062, fast-follow with #15124 once we're happy with the
>>> publish script (I've not seen anything raised about it yet tho?)
>>> 2. Merge #15124 minus publish script, and then #15062 still relying on
>>> local image build (not sure what this would achieve vs the option above
>>> tho?)
>>> 3. Merge #15124 including publish script, then #15062 using the
>>> published image not the local build
>>>
>>> Either way, one thing that needs resolving is the Dockerfile location:
>>> flink/quickstart (#15062) vs docker/iceberg-flink-quickstart (#15124).
>>>
>>> LMK if I've missed an angle here.
>>>
>>> thanks, Robin
>>>
>>> On Wed, 28 Jan 2026 at 15:57, Kevin Liu <[email protected]> wrote:
>>>
>>>> Thanks for working on this, Robin! It looks like the complexity here is
>>>> publishing the Docker image. What do you think about isolating that part
>>>> (just moving the publish script out of #15124)? We can start
>>>> with the Dockerfile definition, which allows us to build locally. This
>>>> should unblock us from merging the getting started docs in #15062.
>>>> Thoughts?
>>>>
>>>> Best,
>>>> Kevin Liu
>>>>
>>>> On Wed, Jan 28, 2026 at 5:57 AM Robin Moffatt via dev <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Thanks for the discussion and input.
>>>>> It sounds like there are no major blockers. Could someone please
>>>>> review https://github.com/apache/iceberg/pull/15124 ?
>>>>>
>>>>> thanks,
>>>>>
>>>>> Robin.
>>>>>
>>>>> On Mon, 26 Jan 2026 at 16:36, Kevin Liu <[email protected]> wrote:
>>>>>
>>>>>> Hey folks,
>>>>>>
>>>>>> We have a Dockerfile defined in pyiceberg [1] that uses the Spark
>>>>>> base image and installs all the necessary jars. This is used for our
>>>>>> integration test setup [2] and is inspired by
>>>>>> databricks/docker-spark-iceberg [3]. We've made many improvements such as
>>>>>> upgrading to Spark 4, supporting Spark Connect, and better image build
>>>>>> caching.
>>>>>>
>>>>>> This is already self-contained and can be reused by other
>>>>>> subprojects. In fact, iceberg-rust already uses it [4] and I try to keep
>>>>>> them in sync.
>>>>>> I think it would be beneficial for the project to publish this image
>>>>>> and something similar for Flink.
>>>>>>
>>>>>> Let me know what you think.
>>>>>>
>>>>>> Best,
>>>>>> Kevin Liu
>>>>>>
>>>>>>
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/apache/iceberg-python/blob/6de6d6acad440885788fb1a24c04ed647b92af0e/dev/spark/Dockerfile
>>>>>> [2]
>>>>>> https://github.com/apache/iceberg-python/blob/6de6d6acad440885788fb1a24c04ed647b92af0e/dev/docker-compose-integration.yml#L20-L21
>>>>>> [3]
>>>>>> https://github.com/databricks/docker-spark-iceberg/blob/cf617dc29e8672792e76b9bcf6017af52f570020/spark/Dockerfile
>>>>>> [4]
>>>>>> https://github.com/apache/iceberg-rust/blob/330f21da894948fc10b57d541cb2d6f32c8bdbb8/crates/integration_tests/testdata/spark/Dockerfile
>>>>>>
>>>>>> On Mon, Jan 26, 2026 at 10:27 AM Steven Wu <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> > Since the integration code for both Spark and Flink lives in our
>>>>>>> repository, it might make sense to also store the Docker images and the
>>>>>>> corresponding scripts there.
>>>>>>>
>>>>>>> I agree with Peter here.
>>>>>>>
>>>>>>> The previous thread raised some concerns about whether the Iceberg
>>>>>>> project should host those Docker images. I'm not sure if opinions
>>>>>>> have changed.
>>>>>>>
>>>>>>> On Mon, Jan 26, 2026 at 2:43 AM Robin Moffatt via dev <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Thanks Ajantha, I'd not seen that thread.
>>>>>>>> Having looked at it, am I right in understanding that the preferred
>>>>>>>> outcome is for Flink to publish a Docker image that includes the
>>>>>>>> Iceberg dependencies?
>>>>>>>>
>>>>>>>> However we do this, I feel that the user coming to run the Flink
>>>>>>>> quickstart should not have to build their own Docker image; this adds
>>>>>>>> unnecessary friction that is easily alleviated.
>>>>>>>>
>>>>>>>> If I've understood the situation correctly, then I'm happy to
>>>>>>>> discuss this idea with the Flink community; please let me know
>>>>>>>> before I do so.
>>>>>>>>
>>>>>>>> thanks, Robin.
>>>>>>>>
>>>>>>>> On Fri, 23 Jan 2026 at 16:50, Ajantha Bhat <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Robin and Peter,
>>>>>>>>>
>>>>>>>>> I discussed community-maintained Docker images previously:
>>>>>>>>> https://lists.apache.org/thread/xl1cwq7vmnh6zgfd2vck2nq7dfd33ncq
>>>>>>>>>
>>>>>>>>> The consensus was to publish only the REST fixture Docker image
>>>>>>>>> <https://hub.docker.com/r/apache/iceberg-rest-fixture> (now at
>>>>>>>>> 100K+ total downloads) and use Docker images published by the main
>>>>>>>>> engines in the quickstart, instead of maintaining these images
>>>>>>>>> ourselves. See the thread above for more details.
>>>>>>>>>
>>>>>>>>> With respect to adding a Flink quickstart page, I’m in favor of
>>>>>>>>> adding it and relying on the Docker images provided by Flink rather
>>>>>>>>> than maintaining our own images.
>>>>>>>>> - Ajantha
>>>>>>>>>
>>>>>>>>> On Fri, Jan 23, 2026 at 9:43 PM Péter Váry <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Robin,
>>>>>>>>>> It would be nice to separate them. I expect that we will have
>>>>>>>>>> some extra work to do for the Docker image. For example, we need
>>>>>>>>>> to make sure that we have CI in place to build it.
>>>>>>>>>> Thanks,
>>>>>>>>>> Peter
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Jan 23, 2026, 16:55 Robin Moffatt via dev <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks for the positive reception of this idea.
>>>>>>>>>>> I've drafted a PR [1] and would appreciate input :)
>>>>>>>>>>>
>>>>>>>>>>> Also, should I keep this and the quickstart PR [2] as separate
>>>>>>>>>>> PRs, or combine them?
>>>>>>>>>>>
>>>>>>>>>>> thanks, Robin.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [1] https://github.com/apache/iceberg/pull/15124
>>>>>>>>>>> [2] https://github.com/apache/iceberg/pull/15062
>>>>>>>>>>>
>>>>>>>>>>> On Fri, 23 Jan 2026 at 13:58, Jean-Baptiste Onofré <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> This is a great idea.
>>>>>>>>>>>>
>>>>>>>>>>>> If we are moving forward with an "official" Docker image
>>>>>>>>>>>> published by the project, we must ensure it is fully compliant
>>>>>>>>>>>> with ASF requirements regarding LICENSE/NOTICE files, etc. While
>>>>>>>>>>>> this may seem straightforward, it is a detail that is often
>>>>>>>>>>>> overlooked.
>>>>>>>>>>>>
>>>>>>>>>>>> I would be happy to help with this process.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> JB
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jan 23, 2026 at 1:52 PM Maximilian Michels <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hey Robin,
>>>>>>>>>>>>>
>>>>>>>>>>>>> +1 That's a great idea. It's often a bit painful for new users
>>>>>>>>>>>>> to get all the dependencies in the right place.
>>>>>>>>>>>>>
>>>>>>>>>>>>> +1 for building upon the official Flink Docker images:
>>>>>>>>>>>>> https://hub.docker.com/r/apache/flink
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Max
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jan 23, 2026 at 12:27 PM Péter Váry <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Hi Robin,
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > I would love to see the Flink quickstart image in the
>>>>>>>>>>>>> > Iceberg repo.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Ajantha was working on the Spark side:
>>>>>>>>>>>>> https://github.com/apache/iceberg/issues/13519
>>>>>>>>>>>>> > The conclusion was:
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> we should both remove the vendor reference and bring this
>>>>>>>>>>>>> >> back up to date. My preference would be to rely on the Spark
>>>>>>>>>>>>> >> image <https://hub.docker.com/r/apache/spark> provided by the
>>>>>>>>>>>>> >> Apache Spark project, similar to what we do for the Hive
>>>>>>>>>>>>> >> <https://iceberg.apache.org/hive-quickstart/> quickstart. We
>>>>>>>>>>>>> >> should be able to load all the Iceberg-specific JARs through
>>>>>>>>>>>>> >> the spark.jars.packages configuration
>>>>>>>>>>>>> >> <https://spark.apache.org/docs/3.5.1/configuration.html>.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Ajantha also added the link to the old dev list thread:
>>>>>>>>>>>>> https://lists.apache.org/thread/4kknk8mvnffbmhdt63z8t4ps0mt1jbf4
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Thanks for working on this,
>>>>>>>>>>>>> > Peter
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > On Thu, 22 Jan 2026 at 19:23, Robin Moffatt via dev
>>>>>>>>>>>>> > <[email protected]> wrote:
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> Hi,
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> Following discussion on the Flink quickstart PR [1], what
>>>>>>>>>>>>> >> do people think about adding an official quickstart Docker
>>>>>>>>>>>>> >> image for Flink to the project?
>>>>>>>>>>>>> >> At the moment the Spark quickstart uses
>>>>>>>>>>>>> >> tabulario/spark-iceberg so perhaps that could be brought into
>>>>>>>>>>>>> >> the project too.
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> thanks, Robin.
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> 1: https://github.com/apache/iceberg/pull/15062
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>
>>>>>
>>>
>>>
>>
