One important aspect to consider is where the integration code actually lives. Both the Spark and Flink integrations are maintained directly in the Iceberg repository, which means the Iceberg community is responsible for keeping these connectors working. If we moved the Docker image creation into the Spark or Flink projects, we would introduce a circular dependency that would make release coordination much more complicated.
For example, imagine Spark releases version 4.2. At that point, no Iceberg integration exists yet. Support for Spark 4.2 would then land in a later Iceberg release, say Iceberg 1.12.0. With that release we can publish the iceberg-1.12.0-spark-4.2-quickstart image, aligned with our own release cycle. If the Spark project were responsible for publishing the image instead, they would need a separate, additional release cycle just for the Docker image, which doesn't fit naturally into their workflow.

Given this, my suggestion is that Iceberg should publish the quickstart Docker images for the integrations we own, like Spark and Flink. For integrations where we don't own the code, such as Trino and Hive, the respective projects should continue to publish their own images.

Sung Yun <[email protected]> wrote (on Fri, Feb 20, 2026, 3:29):

> Hi Kevin, thanks for raising this.
>
> I agree this discussion is warranted. In the previous thread [1] we
> largely deferred making a decision on whether the Iceberg community should
> publish Docker images beyond the REST TCK image, so I think it makes sense
> to revisit it now.
>
> While it's tempting to help out the community in every possible way, I
> think it's important to stay focused on what the project/subprojects are
> best positioned to own. In a way, I'm concerned that publishing engine
> specific Iceberg images as supported artifacts could create a long term
> maintenance burden, since we don't maintain those engines ourselves.
>
> From my perspective, the key question is on what criteria we should use
> when deciding whether to publish a Docker image, and I think the clearest
> line is whether it supports Iceberg subprojects (or other OSS projects) in
> testing their integration with Iceberg, where we can reasonably expect it
> to support it to a high standard.
>
> I'm curious to hear others' thoughts on this topic.
>
> Cheers,
> Sung
>
> [1] https://lists.apache.org/thread/xl1cwq7vmnh6zgfd2vck2nq7dfd33ncq
>
> On 2026/02/19 21:06:56 Kevin Liu wrote:
> > Hi everyone,
> >
> > I want to continue the discussion on which Docker images the Iceberg
> > project should publish. This has come up several times [1][2][3][4] and
> > I'd like to continue the discussion here.
> >
> > So far, the main outcome has been the publication of
> > apache/iceberg-rest-fixture [5] (100K+ downloads), following a consensus
> > [2] to limit community-maintained images to the REST fixture and rely on
> > upstream engine projects for quickstarts. A separate thread and issue
> > [3][6] proposed replacing the tabulario/spark-iceberg quickstart image
> > with the official apache/spark image. Most recently, a proposal to add a
> > Flink quickstart image [4] has reopened the broader question.
> >
> > One concrete case for expanding scope: both iceberg-python and
> > iceberg-rust currently maintain their own Spark+Iceberg Docker images for
> > integration testing, and we already try to keep them in sync manually
> > [7][8]. This is exactly the kind of duplication that centralizing under
> > the main iceberg repo would solve; just as we did with
> > apache/iceberg-rest-fixture. Publishing a shared apache/iceberg-spark
> > image would give all subprojects a single, well-maintained image to
> > depend on, and reduce the maintenance burden across the ecosystem. We can
> > do the same for the Flink+Iceberg setup.
> >
> > Given the traction the REST fixture image has seen, I think it's worth
> > revisiting the scope of what we publish. I'd love to hear updated views
> > from the community.
> >
> > Thanks,
> > Kevin Liu
> >
> > [1] https://lists.apache.org/thread/dr6nsvd8jm2gr2nn5vf7nkpr0pc5d03h
> > [2] https://lists.apache.org/thread/xl1cwq7vmnh6zgfd2vck2nq7dfd33ncq
> > [3] https://lists.apache.org/thread/4kknk8mvnffbmhdt63z8t4ps0mt1jbf4
> > [4] https://lists.apache.org/thread/grlgvl9fslcxrsnxyb7qqh7vjd4kkqo3
> > [5] https://hub.docker.com/r/apache/iceberg-rest-fixture
> > [6] https://github.com/apache/iceberg/issues/13519
> > [7] https://github.com/apache/iceberg-python/tree/main/dev/spark
> > [8] https://github.com/apache/iceberg-rust/tree/main/dev/spark
> >
