[ https://issues.apache.org/jira/browse/SPARK-35715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pardhu Madipalli updated SPARK-35715:
-------------------------------------
Description:

When we provide a local file as a dependency using the "--files" option, the file is not copied to the work directories of the executors.
h5. Example 1:
{code:java}
$SPARK_HOME/bin/spark-submit --master k8s://https://<ip-address-k8s> \
 --deploy-mode cluster \
 --name spark-pi \
 --class org.apache.spark.examples.SparkPi \
 --conf spark.executor.instances=1 \
 --conf spark.kubernetes.container.image=<spark-3.1.2-image> \
 --conf spark.kubernetes.driver.pod.name=sparkdriverpod \
 --files local:///etc/xattr.conf \
 local:///opt/spark/examples/jars/spark-examples_2.12-3.1.2.jar 1000000
{code}
h6. Content of the Spark executor work-dir:
{code:java}
~$ kubectl exec -n default spark-pi-22de6279f6bec01c-exec-1 ls /opt/spark/work-dir/
spark-examples_2.12-3.1.2.jar
{code}
Note that the file _/etc/xattr.conf_ is *NOT* copied to _/opt/spark/work-dir/_.
----
If we use the "--jars" option instead of "--files", the file is copied as expected.
h5. Example 2:
{code:java}
$SPARK_HOME/bin/spark-submit --master k8s://https://<ip-address-k8s> \
 --deploy-mode cluster \
 --name spark-pi \
 --class org.apache.spark.examples.SparkPi \
 --conf spark.executor.instances=1 \
 --conf spark.kubernetes.container.image=<spark-3.1.2-image> \
 --conf spark.kubernetes.driver.pod.name=sparkdriverpod \
 --jars local:///etc/xattr.conf \
 local:///opt/spark/examples/jars/spark-examples_2.12-3.1.2.jar 1000000
{code}
h6. Content of the Spark executor work-dir:
{code:java}
~$ kubectl exec -n default spark-pi-22de6279f6bec01c-exec-1 ls /opt/spark/work-dir/
spark-examples_2.12-3.1.2.jar
xattr.conf
{code}
Note that the file _/etc/xattr.conf_ *IS* copied to _/opt/spark/work-dir/_.

I tested this with versions *3.1.2* and *3.0.2*; the behaviour is the same in both.
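For reference, a minimal sketch (not part of the original report) of how a job could confirm, from inside the executors, whether a "--files" dependency is actually reachable. It resolves the file through {{SparkFiles.get}}; the object name {{FilesCheck}} is illustrative, and {{xattr.conf}} is assumed to be the file passed with "--files" as in Example 1.
{code:scala}
import java.io.File

import org.apache.spark.SparkFiles
import org.apache.spark.sql.SparkSession

// Illustrative check: each task reports where SparkFiles expects the
// --files dependency to live and whether a file actually exists there.
object FilesCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("files-check").getOrCreate()
    val sc = spark.sparkContext

    val results = sc
      .parallelize(1 to 4, numSlices = 4)
      .map { _ =>
        // SparkFiles.get resolves names of files added via --files or
        // SparkContext.addFile against the executor-side SparkFiles root.
        val path = SparkFiles.get("xattr.conf")
        (path, new File(path).exists())
      }
      .collect()

    results.foreach { case (path, exists) => println(s"$path exists=$exists") }
    spark.stop()
  }
}
{code}
Submitting this with the same {{--files local:///etc/xattr.conf}} option as in Example 1 shows whether tasks can see the file at the resolved path, independently of inspecting the executor pod with kubectl.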
> Option "--files" with local:// prefix is not honoured for Spark on kubernetes
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-35715
>                 URL: https://issues.apache.org/jira/browse/SPARK-35715
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.0.2, 3.1.2
>            Reporter: Pardhu Madipalli
>            Priority: Major