Hi everyone,

I want to continue the discussion on which Docker images the Iceberg project should publish. This has come up several times [1][2][3][4], and I'd like to revisit it in this thread.

So far, the main outcome has been the publication of apache/iceberg-rest-fixture [5] (100K+ downloads), following a consensus [2] to limit community-maintained images to the REST fixture and rely on upstream engine projects for quickstarts. A separate thread and issue [3][6] proposed replacing the tabulario/spark-iceberg quickstart image with the official apache/spark image. Most recently, a proposal to add a Flink quickstart image [4] has reopened the broader question.

One concrete case for expanding scope: both iceberg-python and iceberg-rust currently maintain their own Spark+Iceberg Docker images for integration testing, and we already try to keep them in sync manually [7][8]. This is exactly the kind of duplication that centralizing under the main iceberg repo would solve, just as we did with apache/iceberg-rest-fixture. Publishing a shared apache/iceberg-spark image would give all subprojects a single, well-maintained image to depend on and reduce the maintenance burden across the ecosystem. We can do the same for the Flink+Iceberg setup.

Given the traction the REST fixture image has seen, I think it's worth revisiting the scope of what we publish. I'd love to hear updated views from the community.

Thanks,
Kevin Liu

[1] https://lists.apache.org/thread/dr6nsvd8jm2gr2nn5vf7nkpr0pc5d03h
[2] https://lists.apache.org/thread/xl1cwq7vmnh6zgfd2vck2nq7dfd33ncq
[3] https://lists.apache.org/thread/4kknk8mvnffbmhdt63z8t4ps0mt1jbf4
[4] https://lists.apache.org/thread/grlgvl9fslcxrsnxyb7qqh7vjd4kkqo3
[5] https://hub.docker.com/r/apache/iceberg-rest-fixture
[6] https://github.com/apache/iceberg/issues/13519
[7] https://github.com/apache/iceberg-python/tree/main/dev/spark
[8] https://github.com/apache/iceberg-rust/tree/main/dev/spark
