If there are no objections, then I would prefer it in
docker/iceberg-flink-quickstart.

On Fri, 6 Feb 2026 at 11:13, Robin Moffatt via dev <[email protected]>
wrote:

> Hi Peter,
>
> Thanks for the direction. I'll remove the publish step so that we can get
> the quickstart published, and then work on publishing the image afterwards.
>
> Do you think the Dockerfile is best kept in flink/quickstart, or
> docker/iceberg-flink-quickstart ?
>
> thanks, Robin
>
> On Thu, 5 Feb 2026 at 16:10, Péter Váry <[email protected]>
> wrote:
>
>> I think we have two options:
>>
>>    1. Remove the image publication from this PR (
>>    https://github.com/apache/iceberg/pull/15124) for now, and proceed
>>    with adding the Docker image and updating the documentation.
>>    2. Alternatively, we could discuss publishing the Flink quickstart
>>    image at the next Iceberg Community Sync and use that as an opportunity to
>>    simplify both the documentation and the overall user experience.
>>
>>
>>
>>
>> On Wed, 4 Feb 2026 at 18:52, Robin Moffatt via dev <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> I have perhaps managed to deadlock this process :) I'd appreciate some
>>> help untangling it. The recap is in my previous email (below).
>>>
>>> thanks, Robin.
>>>
>>> On Thu, 29 Jan 2026 at 06:20, Robin Moffatt <[email protected]> wrote:
>>>
>>>> Hi Kevin,
>>>>
>>>> Just recapping so that I'm clear, cos I'm getting confused :)
>>>> I have two related PRs:
>>>>
>>>> #15124: Add Flink Quickstart docker image
>>>> #15062: Add Flink quickstart (which includes the Dockerfile too)
>>>>
>>>> I can see a few routes forward:
>>>>
>>>> 1. Merge #15062, fast-follow with #15124 once we're happy with the
>>>> publish script (I've not seen anything raised about it yet tho?)
>>>> 2. Merge #15124 minus publish script, and then #15062 still relying on
>>>> local image build (not sure what this would achieve vs the option above
>>>> tho?)
>>>> 3. Merge #15124 including publish script, then #15062 using the
>>>> published image not the local build
>>>>
>>>> Either way, one thing that needs resolving is the Dockerfile location:
>>>> flink/quickstart (#15062) vs docker/iceberg-flink-quickstart (#15124).
>>>>
>>>> LMK if I've missed an angle here.
>>>>
>>>> thanks, Robin
>>>>
>>>> On Wed, 28 Jan 2026 at 15:57, Kevin Liu <[email protected]> wrote:
>>>>
>>>>> Thanks for working on this, Robin! It looks like the complexity here
>>>>> is publishing the docker image. What do you think about isolating that
>>>>> part? (Just move the publish script out of #15124) We can start
>>>>> with the Dockerfile definition, which allows us to build locally. This
>>>>> should unblock us from merging the getting started docs in #15062.
>>>>> Thoughts?
>>>>>
>>>>> Best,
>>>>> Kevin Liu
>>>>>
>>>>> On Wed, Jan 28, 2026 at 5:57 AM Robin Moffatt via dev <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Thanks for the discussion and input.
>>>>>> It sounds like there are no major blockers. Could someone please
>>>>>> review https://github.com/apache/iceberg/pull/15124 ?
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Robin.
>>>>>>
>>>>>> On Mon, 26 Jan 2026 at 16:36, Kevin Liu <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hey folks,
>>>>>>>
>>>>>>> We have a Dockerfile defined in pyiceberg [1] that uses the Spark
>>>>>>> base image and installs all the necessary jars. This is used for our
>>>>>>> integration test setup [2] and is inspired by
>>>>>>> databricks/docker-spark-iceberg [3]. We've made many improvements such as
>>>>>>> upgrading to Spark 4, supporting Spark Connect, and better image build
>>>>>>> caching.
>>>>>>>
>>>>>>> This is already self-contained and can be reused by other
>>>>>>> subprojects. In fact, iceberg-rust already uses it [4] and I try to keep
>>>>>>> them in sync.
>>>>>>> I think it would be beneficial for the project to publish this image
>>>>>>> and something similar for Flink.
>>>>>>>
>>>>>>> Let me know what you think.
>>>>>>>
>>>>>>> Best,
>>>>>>> Kevin Liu
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> [1]
>>>>>>> https://github.com/apache/iceberg-python/blob/6de6d6acad440885788fb1a24c04ed647b92af0e/dev/spark/Dockerfile
>>>>>>> [2]
>>>>>>> https://github.com/apache/iceberg-python/blob/6de6d6acad440885788fb1a24c04ed647b92af0e/dev/docker-compose-integration.yml#L20-L21
>>>>>>> [3]
>>>>>>> https://github.com/databricks/docker-spark-iceberg/blob/cf617dc29e8672792e76b9bcf6017af52f570020/spark/Dockerfile
>>>>>>> [4]
>>>>>>> https://github.com/apache/iceberg-rust/blob/330f21da894948fc10b57d541cb2d6f32c8bdbb8/crates/integration_tests/testdata/spark/Dockerfile
>>>>>>>
>>>>>>> On Mon, Jan 26, 2026 at 10:27 AM Steven Wu <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> > Since the integration code for both Spark and Flink lives in our
>>>>>>>> > repository, it might make sense to also store the Docker images and the
>>>>>>>> > corresponding scripts there.
>>>>>>>>
>>>>>>>> I agree with Peter here.
>>>>>>>>
>>>>>>>> The previous thread raised some concerns about whether the Iceberg
>>>>>>>> project should host those Docker images. I'm not sure if opinions have
>>>>>>>> changed.
>>>>>>>>
>>>>>>>> On Mon, Jan 26, 2026 at 2:43 AM Robin Moffatt via dev <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Thanks Ajantha, I'd not seen that thread.
>>>>>>>>> Having looked at it, am I understanding the view to be that
>>>>>>>>> ideally Flink would publish a Docker image that included the Iceberg
>>>>>>>>> dependencies?
>>>>>>>>>
>>>>>>>>> However we do this, I feel that the user coming to run the Flink
>>>>>>>>> quickstart should not have to build their own Docker image; this adds
>>>>>>>>> unnecessary friction that is easily alleviated.
>>>>>>>>>
>>>>>>>>> If I've understood the situation correctly, then I'm happy to
>>>>>>>>> discuss this idea with the Flink community; please let me know before
>>>>>>>>> I do so.
>>>>>>>>>
>>>>>>>>> thanks, Robin.
>>>>>>>>>
>>>>>>>>> On Fri, 23 Jan 2026 at 16:50, Ajantha Bhat <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Robin and Peter,
>>>>>>>>>>
>>>>>>>>>> I discussed community-maintained Docker images previously:
>>>>>>>>>> https://lists.apache.org/thread/xl1cwq7vmnh6zgfd2vck2nq7dfd33ncq
>>>>>>>>>>
>>>>>>>>>> The consensus was to publish only the REST fixture Docker image
>>>>>>>>>> <https://hub.docker.com/r/apache/iceberg-rest-fixture> (now at
>>>>>>>>>> 100K+ total downloads) and use Docker images published by the main
>>>>>>>>>> engines in the quickstart, instead of maintaining these images ourselves.
>>>>>>>>>> See the thread above for more details.
>>>>>>>>>>
>>>>>>>>>> With respect to adding a Flink quickstart page, I’m in favor of
>>>>>>>>>> adding it and relying on the Docker images provided by Flink rather than
>>>>>>>>>> maintaining our own images.
>>>>>>>>>> - Ajantha
>>>>>>>>>>
>>>>>>>>>> On Fri, Jan 23, 2026 at 9:43 PM Péter Váry <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Robin,
>>>>>>>>>>> It would be nice to separate them. I expect that we will have
>>>>>>>>>>> some extra work to do for the Docker image, for example making sure
>>>>>>>>>>> that we have CI in place to build it.
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Peter
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jan 23, 2026, 16:55 Robin Moffatt via dev <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the positive reception of this idea.
>>>>>>>>>>>> I've drafted a PR [1] and would appreciate input :)
>>>>>>>>>>>>
>>>>>>>>>>>> Also, should I keep this and the quickstart PR [2] as separate
>>>>>>>>>>>> PRs, or combine them?
>>>>>>>>>>>>
>>>>>>>>>>>> thanks, Robin.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://github.com/apache/iceberg/pull/15124
>>>>>>>>>>>> [2] https://github.com/apache/iceberg/pull/15062
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, 23 Jan 2026 at 13:58, Jean-Baptiste Onofré <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is a great idea.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If we are moving forward with an "official" Docker image
>>>>>>>>>>>>> published by the project, we must ensure it is fully compliant with ASF
>>>>>>>>>>>>> requirements regarding LICENSE/NOTICE files, etc. While this may seem
>>>>>>>>>>>>> straightforward, it is a detail that is often overlooked.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would be happy to help with this process.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> JB
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jan 23, 2026 at 1:52 PM Maximilian Michels <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hey Robin,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +1 That's a great idea. It's often a bit painful for new
>>>>>>>>>>>>>> users to get
>>>>>>>>>>>>>> all the dependencies in the right place.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +1 for building upon the official Flink Docker images:
>>>>>>>>>>>>>> https://hub.docker.com/r/apache/flink
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Max
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Jan 23, 2026 at 12:27 PM Péter Váry <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Hi Robin,
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > I would love to see the Flink quickstart image in the
>>>>>>>>>>>>>> > Iceberg repo.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Ajantha was working on the Spark side:
>>>>>>>>>>>>>> > https://github.com/apache/iceberg/issues/13519
>>>>>>>>>>>>>> > The conclusion was:
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> we should both remove the vendor reference and bring this
>>>>>>>>>>>>>> >> back up to date. My preference would be to rely on the Spark image
>>>>>>>>>>>>>> >> <https://hub.docker.com/r/apache/spark> provided by the
>>>>>>>>>>>>>> >> Apache Spark project, similar to what we do for the Hive
>>>>>>>>>>>>>> >> <https://iceberg.apache.org/hive-quickstart/> quickstart. We
>>>>>>>>>>>>>> >> should be able to load all the Iceberg-specific JARs through the
>>>>>>>>>>>>>> >> spark.jars.packages configuration
>>>>>>>>>>>>>> >> <https://spark.apache.org/docs/3.5.1/configuration.html>.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Ajantha also added the link to the old dev list thread:
>>>>>>>>>>>>>> > https://lists.apache.org/thread/4kknk8mvnffbmhdt63z8t4ps0mt1jbf4
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Thanks for working on this,
>>>>>>>>>>>>>> > Peter
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > On Thu, 22 Jan 2026 at 19:23, Robin Moffatt via dev <[email protected]>
>>>>>>>>>>>>>> > wrote:
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Hi,
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Following discussion on the Flink quickstart PR [1], what
>>>>>>>>>>>>>> >> do people think about adding an official quickstart Docker image
>>>>>>>>>>>>>> >> for Flink to the project?
>>>>>>>>>>>>>> >> At the moment the Spark quickstart uses
>>>>>>>>>>>>>> >> tabulario/spark-iceberg so perhaps that could be brought into
>>>>>>>>>>>>>> >> the project too.
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> thanks, Robin.
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> 1: https://github.com/apache/iceberg/pull/15062
