So the point Khalid was trying to make is that there are legitimate reasons you might use different container images for the driver pod vs the executor pod. It has nothing to do with Docker versions.
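Purely as an illustration (the registry path and image tags below are made up), pointing the driver and executors at different images would look something like this:

spark-submit --verbose \
  --conf spark.kubernetes.driver.container.image=registry.example.com/spark-driver:3.2.0-slim \
  --conf spark.kubernetes.executor.container.image=registry.example.com/spark-executor:3.2.0-mkl \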
Since the bulk of the actual work happens on the executors, you may want additional libraries, tools or software in that image that your job code can call. The same software may be entirely unnecessary on the driver, allowing you to use a smaller image for the driver than for the executors. As a practical example, for an ML use case you might want the optional Intel MKL or OpenBLAS dependencies, which can significantly bloat the size of your container image (by hundreds of megabytes) and would only be needed by the executor pods.

Rob

From: Mich Talebzadeh <mich.talebza...@gmail.com>
Date: Wednesday, 8 December 2021 at 17:42
To: Khalid Mammadov <khalidmammad...@gmail.com>
Cc: "user @spark" <u...@spark.apache.org>, Spark dev list <dev@spark.apache.org>
Subject: Re: docker image distribution in Kubernetes cluster

Thanks Khalid for your notes.

I have not come across a use case where the docker version on the driver and executors needs to be different. My thinking is that spark.kubernetes.executor.container.image is the correct reference, since in Kubernetes "container" is the correct terminology, and both driver and executors are Spark specific.

cheers

On Wed, 8 Dec 2021 at 11:47, Khalid Mammadov <khalidmammad...@gmail.com> wrote:

Hi Mich,

IMO, it's done to provide the most flexibility. Some users can have a limited/restricted version of the image, or one with additional software that they use on the executors during processing.

So, in your case you only need to provide the first one, since the other two configs will be copied from it.

Regards
Khalid

On Wed, 8 Dec 2021, 10:41 Mich Talebzadeh, <mich.talebza...@gmail.com> wrote:

Just a correction: the Spark 3.2 documentation states the following.

spark.kubernetes.container.image
  Default: (none)
  Meaning: Container image to use for the Spark application. This is usually of the form example.com/repo/spark:v1.0.0. This configuration is required and must be provided by the user, unless explicit images are provided for each different container type.
  Since: 2.3.0

spark.kubernetes.driver.container.image
  Default: (value of spark.kubernetes.container.image)
  Meaning: Custom container image to use for the driver.
  Since: 2.3.0

spark.kubernetes.executor.container.image
  Default: (value of spark.kubernetes.container.image)
  Meaning: Custom container image to use for executors.
  Since: 2.3.0

So both the driver and executor images are mapped to the container image. In my opinion they are redundant and will potentially add confusion, so should they be removed?
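If those defaults are right, supplying only the base property should be enough and the driver and executor images will simply fall back to it, i.e. something like this (untested sketch, rest of the options as before):

spark-submit --verbose \
  --conf spark.kubernetes.container.image=${IMAGEGCP} \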
On Wed, 8 Dec 2021 at 10:15, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Hi,

We have three conf parameters to distribute the docker image with spark-submit in a Kubernetes cluster. These are:

spark-submit --verbose \
  --conf spark.kubernetes.driver.docker.image=${IMAGEGCP} \
  --conf spark.kubernetes.executor.docker.image=${IMAGEGCP} \
  --conf spark.kubernetes.container.image=${IMAGEGCP} \

When the above is run, it shows:

(spark.kubernetes.driver.docker.image,eu.gcr.io/axial-glow-224522/spark-py:3.1.1-scala_2.12-8-jre-slim-buster-addedpackages)
(spark.kubernetes.executor.docker.image,eu.gcr.io/axial-glow-224522/spark-py:3.1.1-scala_2.12-8-jre-slim-buster-addedpackages)
(spark.kubernetes.container.image,eu.gcr.io/axial-glow-224522/spark-py:3.1.1-scala_2.12-8-jre-slim-buster-addedpackages)

You will notice that I am using the same docker image for the driver, the executors and the container. In Spark 3.2 (and in fact in recent Spark versions) I cannot see any reference to the driver or executor docker images. Are these deprecated? It appears that Spark still accepts them?

Thanks
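For comparison, the equivalent submission using the property names from the documentation quoted above would presumably be:

spark-submit --verbose \
  --conf spark.kubernetes.driver.container.image=${IMAGEGCP} \
  --conf spark.kubernetes.executor.container.image=${IMAGEGCP} \
  --conf spark.kubernetes.container.image=${IMAGEGCP} \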