Hi Kevin, thanks for raising this. I agree this discussion is warranted. In the previous thread [1] we largely deferred making a decision on whether the Iceberg community should publish Docker images beyond the REST TCK image, so I think it makes sense to revisit it now.
While it's tempting to help out the community in every possible way, I think it's important to stay focused on what the project and its subprojects are best positioned to own. I'm concerned that publishing engine-specific Iceberg images as supported artifacts could create a long-term maintenance burden, since we don't maintain those engines ourselves.

From my perspective, the key question is what criteria we should use when deciding whether to publish a Docker image. I think the clearest line is whether an image supports Iceberg subprojects (or other OSS projects) in testing their integration with Iceberg, and whether we can reasonably expect to maintain it to a high standard.

I'm curious to hear others' thoughts on this topic.

Cheers,
Sung

[1] https://lists.apache.org/thread/xl1cwq7vmnh6zgfd2vck2nq7dfd33ncq

On 2026/02/19 21:06:56 Kevin Liu wrote:
> Hi everyone,
>
> I want to continue the discussion on which Docker images the Iceberg
> project should publish. This has come up several times [1][2][3][4] and I'd
> like to continue the discussion here.
>
> So far, the main outcome has been the publication of
> apache/iceberg-rest-fixture [5] (100K+ downloads), following a consensus
> [2] to limit community-maintained images to the REST fixture and rely on
> upstream engine projects for quickstarts. A separate thread and issue
> [3][6] proposed replacing the tabulario/spark-iceberg quickstart image with
> the official apache/spark image. Most recently, a proposal to add a Flink
> quickstart image [4] has reopened the broader question.
>
> One concrete case for expanding scope: both iceberg-python and iceberg-rust
> currently maintain their own Spark+Iceberg Docker images for integration
> testing, and we already try to keep them in sync manually [7][8]. This is
> exactly the kind of duplication that centralizing under the main iceberg
> repo would solve, just as we did with apache/iceberg-rest-fixture.
> Publishing a shared apache/iceberg-spark image would give all subprojects a
> single, well-maintained image to depend on, and reduce the maintenance
> burden across the ecosystem. We can do the same for the Flink+Iceberg setup.
>
> Given the traction the REST fixture image has seen, I think it's worth
> revisiting the scope of what we publish. I'd love to hear updated views
> from the community.
>
> Thanks,
> Kevin Liu
>
> [1] https://lists.apache.org/thread/dr6nsvd8jm2gr2nn5vf7nkpr0pc5d03h
> [2] https://lists.apache.org/thread/xl1cwq7vmnh6zgfd2vck2nq7dfd33ncq
> [3] https://lists.apache.org/thread/4kknk8mvnffbmhdt63z8t4ps0mt1jbf4
> [4] https://lists.apache.org/thread/grlgvl9fslcxrsnxyb7qqh7vjd4kkqo3
> [5] https://hub.docker.com/r/apache/iceberg-rest-fixture
> [6] https://github.com/apache/iceberg/issues/13519
> [7] https://github.com/apache/iceberg-python/tree/main/dev/spark
> [8] https://github.com/apache/iceberg-rust/tree/main/dev/spark
