Hi,
This is probably not a Spark issue but rather a configuration detail I am missing. Any help would be appreciated. I am running Spark from a Docker Compose template with the following configuration:

version: '2'

services:
  master:
    image: gettyimages/spark
    command: bin/spark-class org.apache.spark.deploy.master.Master -h master
    hostname: master
    environment:
      MASTER: spark://master:7077
      SPARK_CONF_DIR: /conf
      SPARK_PUBLIC_DNS: localhost
    expose:
      - 7001
      - 7002
      - 7003
      - 7004
      - 7005
      - 7006
      - 7077
      - 6066
    ports:
      - 4040:4040
      - 6066:6066
      - 7077:7077
      - 8080:8080
  worker:
    image: gettyimages/spark
    command: bin/spark-class org.apache.spark.deploy.worker.Worker spark://master:7077
    hostname: worker
    environment:
      SPARK_CONF_DIR: /conf
      SPARK_WORKER_CORES: 2
      SPARK_WORKER_MEMORY: 1g
      SPARK_WORKER_PORT: 8881
      SPARK_WORKER_WEBUI_PORT: 8081
      SPARK_PUBLIC_DNS: localhost
    links:
      - master
    expose:
      - 7012
      - 7013
      - 7014
      - 7015
      - 7016
      - 8881
    ports:
      - 8081:8081

And I have the following simple Java program:

SparkConf conf = new SparkConf()
        .setMaster("spark://localhost:7077")
        .setAppName("Word Count Sample App");
conf.set("spark.dynamicAllocation.enabled", "false");

String file = "test.txt";
JavaSparkContext sc = new JavaSparkContext(conf);
JavaRDD<String> textFile = sc.textFile("src/main/resources/" + file);

JavaPairRDD<String, Integer> counts = textFile
        .flatMap(s -> Arrays.asList(s.split("[ ,]")).iterator())
        .mapToPair(word -> new Tuple2<>(word, 1))
        .reduceByKey((a, b) -> a + b);

counts.foreach(p -> System.out.println(p));
System.out.println("Total words: " + counts.count());
counts.saveAsTextFile(file + "out.txt");

The problem I am having is that at runtime the worker launches the executor with the following command:

Spark Executor Command: "/usr/jdk1.8.0_131/bin/java" "-cp"
"/conf:/usr/spark-2.3.0/jars/*:/usr/hadoop-2.8.3/etc/hadoop/:/usr/hadoop-2.8.3/etc/hadoop/*:/usr/hadoop-2.8.3/share/hadoop/common/lib/*:/usr/hadoop-2.8.3/share/hadoop/common/*:/usr/hadoop-2.8.3/share/hadoop/hdfs/*:/usr/hadoop-2.8.3/share/hadoop/hdfs/lib/*:/usr/hadoop-2.8.3/share/hadoop/yarn/lib/*:/usr/hadoop-2.8.3/share/hadoop/yarn/*:/usr/hadoop-2.8.3/share/hadoop/mapreduce/lib/*:/usr/hadoop-2.8.3/share/hadoop/mapreduce/*:/usr/hadoop-2.8.3/share/hadoop/tools/lib/*"
"-Xmx1024M" "-Dspark.driver.port=59906"
"org.apache.spark.executor.CoarseGrainedExecutorBackend"
"--driver-url" "spark://CoarseGrainedScheduler@yeikel-pc:59906"
"--executor-id" "6" "--hostname" "172.19.0.3" "--cores" "2"
"--app-id" "app-20180401005243-0000"
"--worker-url" "spark://Worker@172.19.0.3:8881"

which fails with:

Caused by: java.io.IOException: Failed to connect to yeikel-pc:59906
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:245)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:198)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: yeikel-pc

As far as I can tell, the worker container cannot resolve my host machine's name ("yeikel-pc"), so the executor fails to connect back to the driver.

Can I override the "--driver-url" from Java? Or, alternatively, how can I disable CoarseGrainedScheduler? I tried setting spark.dynamicAllocation.enabled to false, but that did not help.
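In case it helps clarify what I mean by overriding the driver URL: below is a minimal sketch of what I was thinking of trying, on the assumption that spark.driver.host (the address the driver advertises to executors), spark.driver.bindAddress, and spark.driver.port are the relevant properties. The 192.168.1.10 address and the port are placeholders for an IP of my machine that the worker container can actually reach; I have not confirmed this is the right approach.

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class WordCountDriverHost {

    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setMaster("spark://localhost:7077")
                .setAppName("Word Count Sample App")
                // Address the driver advertises to executors instead of "yeikel-pc".
                // Placeholder: would need to be reachable from the worker container.
                .set("spark.driver.host", "192.168.1.10")
                // Interface the driver binds to locally.
                .set("spark.driver.bindAddress", "0.0.0.0")
                // Fixed port instead of a random one, so it could also be
                // published in the Docker Compose file if necessary.
                .set("spark.driver.port", "51000");

        // JavaSparkContext implements Closeable, so try-with-resources works.
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<String> textFile = sc.textFile("src/main/resources/test.txt");
            JavaPairRDD<String, Integer> counts = textFile
                    .flatMap(s -> Arrays.asList(s.split("[ ,]")).iterator())
                    .mapToPair(word -> new Tuple2<>(word, 1))
                    .reduceByKey((a, b) -> a + b);
            System.out.println("Total words: " + counts.count());
        }
    }
}

If this is not how the driver URL is meant to be controlled, any pointers would be appreciated.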