[ https://issues.apache.org/jira/browse/SPARK-31726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17480756#comment-17480756 ]
koert kuipers commented on SPARK-31726:
---------------------------------------

[~beregon87] about --jars: are you seeing that the jars are also not available on the driver, or not added to the classpath, or both?

i ran a simple test where i added a jar from s3, e.g. --jars s3a://some/jar.jar, and was surprised to find the driver could not find a class in that jar (on kubernetes with cluster deploy mode). this would be a more serious bug, given that the description of --jars clearly says the jars should be on the driver classpath:
Comma-separated list of jars to include on the driver and executor classpaths.

now with --files it's too bad the driver doesn't get the files, but at least it does what it says on the tin (which does not include a promise to get the files to the driver):
Comma-separated list of files to be placed in the working directory of each executor.

> Make spark.files available in driver with cluster deploy mode on kubernetes
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-31726
>                 URL: https://issues.apache.org/jira/browse/SPARK-31726
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes, Spark Core
>    Affects Versions: 3.0.0
>            Reporter: koert kuipers
>            Priority: Minor
>
> currently on yarn with cluster deploy mode --files makes the files available
> for driver and executors and also puts them on classpath for driver and
> executors.
> on k8s with cluster deploy mode --files makes the files available on
> executors but they are not on classpath. it does not make the files available
> on driver and they are not on driver classpath.
> it would be nice if the k8s behavior was consistent with yarn, or at least
> made the files available on driver. once the files are available there is a
> simple workaround to get them on classpath using
> spark.driver.extraClassPath="./"
> background:
> we recently started testing kubernetes for spark. our main platform is yarn
> on which we use client deploy mode.
> our first experience was that client deploy mode was difficult to use on
> k8s (we don't launch from inside a pod).
> so we switched to cluster deploy mode, which seems to behave well on k8s. but
> then we realized that our programs rely on reading files on classpath
> (application.conf, log4j.properties etc.) that are on the client but now are
> no longer on the driver (since driver is no longer on client). an easy fix
> for this seems to be to ship the files using --files to make them available
> on driver, but we could not get this to work.
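
The thread distinguishes between a file being available in the driver's working directory and a file or class being visible on the driver's classpath. One way to check which of these holds is to run a small probe inside the driver JVM. The following is a minimal sketch only: the class name ClasspathProbe and the placeholder class com.example.SomeClass are hypothetical, application.conf is the file name mentioned in the issue description, and nothing here is part of Spark's API.

```java
import java.io.File;

public class ClasspathProbe {

    // Use the context class loader when one is set, otherwise the system class loader.
    private static ClassLoader loader() {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        return cl != null ? cl : ClassLoader.getSystemClassLoader();
    }

    /** True if a regular file with this name exists in the JVM's working directory. */
    public static boolean inWorkingDir(String fileName) {
        return new File(System.getProperty("user.dir"), fileName).isFile();
    }

    /** True if the named resource can be resolved through the class loader (i.e. it is on the classpath). */
    public static boolean onClasspath(String resourceName) {
        return loader().getResource(resourceName) != null;
    }

    /** True if the named class can be loaded (without running its static initializers). */
    public static boolean classLoadable(String className) {
        try {
            Class.forName(className, false, loader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Both defaults are illustrative placeholders, not names defined by Spark.
        String file = args.length > 0 ? args[0] : "application.conf";
        String cls = args.length > 1 ? args[1] : "com.example.SomeClass";

        System.out.println(file + " in working dir: " + inWorkingDir(file));
        System.out.println(file + " on classpath:   " + onClasspath(file));
        System.out.println(cls + " loadable:        " + classLoadable(cls));
    }
}
```

If a file shows up in the working directory but not on the classpath, that matches the situation the spark.driver.extraClassPath="./" workaround in the description is meant to address. If it shows up in neither, the file was not delivered to the driver at all.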