[jira] [Commented] (SPARK-23638) Spark on k8s: spark.kubernetes.initContainer.image has no effect
[ https://issues.apache.org/jira/browse/SPARK-23638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440067#comment-16440067 ]

Yinan Li commented on SPARK-23638:
----------------------------------

Can this be closed?

> Spark on k8s: spark.kubernetes.initContainer.image has no effect
> ----------------------------------------------------------------
>
>                 Key: SPARK-23638
>                 URL: https://issues.apache.org/jira/browse/SPARK-23638
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 2.3.0
>         Environment: K8s server: Ubuntu 16.04
> Submission client: macOS Sierra 10.12.x
> Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean", BuildDate:"2018-02-07T12:22:21Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"darwin/amd64"}
> Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.3", GitCommit:"f0efb3cb883751c5ffdbe6d515f3cb4fbe7b7acd", GitTreeState:"clean", BuildDate:"2017-11-08T18:27:48Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
>            Reporter: maheshvra
>            Priority: Major
>
> Hi all - I am trying to use an initContainer to download remote dependencies. To begin with, I ran a test with an initContainer that basically does "echo hello world". However, when I triggered the pod deployment via spark-submit, I did not see any trace of initContainer execution in my Kubernetes cluster.
>
> {code:java}
> SPARK_DRIVER_MEMORY: 1g
> SPARK_DRIVER_CLASS: com.bigdata.App
> SPARK_DRIVER_ARGS: -c /opt/spark/work-dir/app/main/environments/int -w ./../../workflows/workflow_main.json -e prod -n features -v off
> SPARK_DRIVER_BIND_ADDRESS:
> SPARK_JAVA_OPT_0: -Dspark.submit.deployMode=cluster
> SPARK_JAVA_OPT_1: -Dspark.driver.blockManager.port=7079
> SPARK_JAVA_OPT_2: -Dspark.app.name=fg-am00-raw12
> SPARK_JAVA_OPT_3: -Dspark.kubernetes.container.image=docker.com/cmapp/fg-am00-raw:1.0.0
> SPARK_JAVA_OPT_4: -Dspark.app.id=spark-4fa9a5ce1b1d401fa9c1e413ff030d44
> SPARK_JAVA_OPT_5: -Dspark.jars=/opt/spark/jars/aws-java-sdk-1.7.4.jar,/opt/spark/jars/hadoop-aws-2.7.3.jar,/opt/spark/jars/guava-14.0.1.jar,/opt/spark/jars/SparkApp.jar,/opt/spark/jars/datacleanup-component-1.0-SNAPSHOT.jar
> SPARK_JAVA_OPT_6: -Dspark.driver.port=7078
> SPARK_JAVA_OPT_7: -Dspark.kubernetes.initContainer.image=docker.com/cmapp/custombusybox:1.0.0
> SPARK_JAVA_OPT_8: -Dspark.kubernetes.executor.podNamePrefix=fg-am00-raw12-b1c8112b8536304ab0fc64fcc41e0615
> SPARK_JAVA_OPT_9: -Dspark.kubernetes.driver.pod.name=fg-am00-raw12-b1c8112b8536304ab0fc64fcc41e0615-driver
> SPARK_JAVA_OPT_10: -Dspark.driver.host=fg-am00-raw12-b1c8112b8536304ab0fc64fcc41e0615-driver-svc.experimental.svc
> SPARK_JAVA_OPT_11: -Dspark.executor.instances=5
> SPARK_JAVA_OPT_12: -Dspark.hadoop.fs.s3a.server-side-encryption-algorithm=AES256
> SPARK_JAVA_OPT_13: -Dspark.kubernetes.namespace=experimental
> SPARK_JAVA_OPT_14: -Dspark.kubernetes.authenticate.driver.serviceAccountName=experimental-service-account
> SPARK_JAVA_OPT_15: -Dspark.master=k8s://https://bigdata
> {code}
>
> Further, I did not see a spec.initContainers section in the generated pod.
> Please see the details below:
>
> {code:java}
> {
>   "kind": "Pod",
>   "apiVersion": "v1",
>   "metadata": {
>     "name": "fg-am00-raw12-b1c8112b8536304ab0fc64fcc41e0615-driver",
>     "namespace": "experimental",
>     "selfLink": "/api/v1/namespaces/experimental/pods/fg-am00-raw12-b1c8112b8536304ab0fc64fcc41e0615-driver",
>     "uid": "adc5a50a-2342-11e8-87dc-12c5b3954044",
>     "resourceVersion": "299054",
>     "creationTimestamp": "2018-03-09T02:36:32Z",
>     "labels": {
>       "spark-app-selector": "spark-4fa9a5ce1b1d401fa9c1e413ff030d44",
>       "spark-role": "driver"
>     },
>     "annotations": {
>       "spark-app-name": "fg-am00-raw12"
>     }
>   },
>   "spec": {
>     "volumes": [
>       {
>         "name": "experimental-service-account-token-msmth",
>         "secret": {
>           "secretName": "experimental-service-account-token-msmth",
>           "defaultMode": 420
>         }
>       }
>     ],
>     "containers": [
>       {
>         "name": "spark-kubernetes-driver",
>         "image": "docker.com/cmapp/fg-am00-raw:1.0.0",
>         "args": ["driver"],
>         "env": [
>           {
>             "name": "SPARK_DRIVER_MEMORY",
>             "value": "1g"
>           },
>           {
>             "name": "SPARK_DRIVER_CLASS",
>             "value": "com.myapp.App"
>           },
>           {
>             "name": "SPARK_DRIVER_ARGS",
>             "value": "-c /opt/spark/work-dir/app/main/environments/int -w ./../../workflows/workflow_main.json -e prod -n features -v off"
>           },
>           {
>             "name": "SPARK_DRIVER_BIND_ADDRESS",
>             "valueFrom": {
>               "fieldRef": {
>                 "apiVersion": "v1",
>                 "fieldPath": "status.podIP"
>               }
>             }
>           },
>           {
>             "name": "SPARK_MOUNTED_CLASSPATH",
>             "value":
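The absence (or presence) of the section in question can be checked directly against the live pod. The following is a CLI sketch, assuming kubectl access to the reporter's cluster and the driver pod name shown above:

```shell
# Print only the spec.initContainers section of the driver pod;
# empty output means no init-container was added by the submission client.
kubectl -n experimental get pod \
  fg-am00-raw12-b1c8112b8536304ab0fc64fcc41e0615-driver \
  -o jsonpath='{.spec.initContainers}'
```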
[ https://issues.apache.org/jira/browse/SPARK-23638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402070#comment-16402070 ]

Yinan Li commented on SPARK-23638:
----------------------------------

The Kubernetes-specific submission client only adds an init-container to the driver and executor pods if there are remote dependencies to download. Otherwise, it won't add one, regardless of whether you specify {{spark.kubernetes.initContainer.image}}.
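A minimal sketch (not Spark's actual code) of the decision described in this comment: a dependency needs downloading only when its URI scheme is something other than a plain path, {{file}}, or {{local}}. All of the reporter's {{spark.jars}} entries are plain local paths, so no init-container is generated.

```python
from urllib.parse import urlparse

def needs_init_container(jars):
    """Rough model of the check: an init-container is added only when at
    least one dependency must be fetched, i.e. its URI scheme is not
    '' (plain path), 'file', or 'local'."""
    return any(urlparse(j).scheme not in ("", "file", "local") for j in jars)

# All of the reporter's spark.jars are plain local paths -> no init-container.
local_only = ["/opt/spark/jars/aws-java-sdk-1.7.4.jar",
              "/opt/spark/jars/SparkApp.jar"]
print(needs_init_container(local_only))   # False

# A single remote jar (hypothetical URL) flips the decision.
print(needs_init_container(local_only + ["https://repo.example.com/extra.jar"]))  # True
```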
[ https://issues.apache.org/jira/browse/SPARK-23638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394615#comment-16394615 ]

Anton Okolnychyi commented on SPARK-23638:
------------------------------------------

Could you also share your spark-submit command? It looks like you are specifying a custom Docker image for the init container (as {{spark.kubernetes.initContainer.image}} is different from {{spark.kubernetes.container.image}}). Are you sure you need a custom Docker image for the init container? In general, if you have a remote jar in --jars and specify {{spark.kubernetes.container.image}}, Spark will create an init container for you and you do not need to reason about it.
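For illustration, a hedged sketch of a submission that would actually trigger the init-container on Spark 2.3: the jar URL is a placeholder, and the master URL, namespace, and image are taken from the reporter's configuration above.

```shell
# Hypothetical submission: the https:// jar is a remote dependency, so the
# submission client adds an init-container to download it before the driver
# starts. With only local paths in --jars, no init-container is created.
spark-submit \
  --master k8s://https://bigdata \
  --deploy-mode cluster \
  --conf spark.kubernetes.namespace=experimental \
  --conf spark.kubernetes.container.image=docker.com/cmapp/fg-am00-raw:1.0.0 \
  --jars https://repo.example.com/jars/datacleanup-component-1.0-SNAPSHOT.jar \
  --class com.bigdata.App \
  local:///opt/spark/jars/SparkApp.jar
```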