I am running the following command on a Hadoop cluster to launch the Spark
shell with dynamic resource allocation (DRA):
spark-shell \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=4 \
  --conf spark.dynamicAllocation.maxExecutors=12 \
  --conf spark.dynamicAllocation.sustainedSchedulerBacklogTimeout=120 \
  --conf spark.dynamicAllocation.schedulerBacklogTimeout=300 \
  --conf spark.dynamicAllocation.executorIdleTimeout=60 \
  --executor-memory 512m \
  --master yarn-client \
  --queue default
This is the code I'm running within the Spark shell. It is just the demo code
from the web site:
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors
// Load and parse the data
val data = sc.textFile("hdfs://ns/public/sample/kmeans_data.txt")
val parsedData = data.map(s => Vectors.dense(s.split(' ').map(_.toDouble))).cache()
// Cluster the data into two classes using KMeans
val numClusters = 2
val numIterations = 20
val clusters = KMeans.train(parsedData, numClusters, numIterations)
This works fine on Spark 1.4.1 but is failing on Spark 1.5.1. Did something
change that I need to do differently for DRA on 1.5.1?
This is the error I am getting:
15/10/29 21:44:19 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/10/29 21:44:34 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/10/29 21:44:49 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
That happens to be the same warning you get if you haven't followed the steps
to enable DRA. However, I have done those, and as I said, if I switch back to
Spark 1.4.1 on the same cluster, it works with my YARN config.
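
In case it helps with diagnosis, here is a minimal sketch for checking which
DRA-related settings the shell actually picked up. The helper name draSettings
is hypothetical (my own, not part of Spark); it assumes that sc is the
SparkContext that spark-shell predefines and relies only on the standard
sc.getConf.getAll accessor.

```scala
// Hypothetical helper: keep only the settings relevant to dynamic allocation,
// sorted by key. It is a pure function, so it can be tried without a cluster.
def draSettings(all: Seq[(String, String)]): Seq[(String, String)] =
  all.filter { case (key, _) =>
    key.startsWith("spark.dynamicAllocation.") ||
      key == "spark.shuffle.service.enabled"
  }.sortBy(_._1)

// Inside spark-shell, where `sc` is predefined:
// draSettings(sc.getConf.getAll).foreach { case (k, v) => println(s"$k=$v") }
```

Running the commented-out line in both the 1.4.1 and the 1.5.1 shell and
comparing the output should show whether the two versions see the same
configuration.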