Hi, I'm trying something simple: creating an RDD from a ~3 GB text file located on GlusterFS, which is mounted on all Spark cluster machines, and calling rdd.count(). But Spark never manages to complete the job, and keeps printing messages like the following:

WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
I run a standalone Spark cluster with 1 master node and 5 worker nodes; the workers are 12-core, 64 GB machines, and I allocated 6 cores and 32 GB to each Spark slave (1 per slave machine). I run spark-shell with the following command:

spark-shell --master --driver-cores 6 --executor-memory 16g

Here is my Spark shell session:

scala> val f = sc.textFile("/mnt/backups/stats.json")
14/08/07 17:15:05 INFO MemoryStore: ensureFreeSpace(138763) called with curMem=0, maxMem=309225062
14/08/07 17:15:05 INFO MemoryStore: Block broadcast_0 stored as values to memory (estimated size 135.5 KB, free 294.8 MB)
f: org.apache.spark.rdd.RDD[String] = MappedRDD[1] at textFile at <console>:12

scala> f.count()
14/08/07 17:15:18 INFO FileInputFormat: Total input paths to process : 1
14/08/07 17:15:18 INFO SparkContext: Starting job: count at <console>:15
14/08/07 17:15:18 INFO DAGScheduler: Got job 0 (count at <console>:15) with 38 output partitions (allowLocal=false)
14/08/07 17:15:18 INFO DAGScheduler: Final stage: Stage 0(count at <console>:15)
14/08/07 17:15:18 INFO DAGScheduler: Parents of final stage: List()
14/08/07 17:15:18 INFO DAGScheduler: Missing parents: List()
14/08/07 17:15:18 INFO DAGScheduler: Submitting Stage 0 (MappedRDD[1] at textFile at <console>:12), which has no missing parents
14/08/07 17:15:18 INFO DAGScheduler: Submitting 38 missing tasks from Stage 0 (MappedRDD[1] at textFile at <console>:12)
14/08/07 17:15:18 INFO TaskSchedulerImpl: Adding task set 0.0 with 38 tasks
14/08/07 17:15:33 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:15:48 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:16:03 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:16:18 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/4 is now EXITED (Command exited with code 1)
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/4 removed: Command exited with code 1
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/5 on worker-20140807155724-172.18.31.153-7778 (172.18.31.153:7778) with 6 cores
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/5 on hostPort 172.18.31.153:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/0 is now EXITED (Command exited with code 1)
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/0 removed: Command exited with code 1
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/6 on worker-20140807155724-172.22.56.186-7778 (172.22.56.186:7778) with 6 cores
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/6 on hostPort 172.22.56.186:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/1 is now EXITED (Command exited with code 1)
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/1 removed: Command exited with code 1
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/7 on worker-20140807155724-172.28.173.218-7778 (172.28.173.218:7778) with 6 cores
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/7 on hostPort 172.28.173.218:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/5 is now RUNNING
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/6 is now RUNNING
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/3 is now EXITED (Command exited with code 1)
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/3 removed: Command exited with code 1
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/8 on worker-20140807155724-172.23.64.98-7778 (172.23.64.98:7778) with 6 cores
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/8 on hostPort 172.23.64.98:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/7 is now RUNNING
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/8 is now RUNNING
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/2 is now EXITED (Command exited with code 1)
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/2 removed: Command exited with code 1
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/9 on worker-20140807155724-172.29.166.84-7778 (172.29.166.84:7778) with 6 cores
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/9 on hostPort 172.29.166.84:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/9 is now RUNNING
14/08/07 17:16:33 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:16:48 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:17:03 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:17:18 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:17:33 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:17:48 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:18:03 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/5 is now EXITED (Command exited with code 1)
14/08/07 17:18:07 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/5 removed: Command exited with code 1
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/10 on worker-20140807155724-172.18.31.153-7778 (172.18.31.153:7778) with 6 cores
14/08/07 17:18:07 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/10 on hostPort 172.18.31.153:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/7 is now EXITED (Command exited with code 1)
14/08/07 17:18:07 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/7 removed: Command exited with code 1
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/11 on worker-20140807155724-172.28.173.218-7778 (172.28.173.218:7778) with 6 cores
14/08/07 17:18:07 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/11 on hostPort 172.28.173.218:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/6 is now EXITED (Command exited with code 1)
14/08/07 17:18:07 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/6 removed: Command exited with code 1
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/12 on worker-20140807155724-172.22.56.186-7778 (172.22.56.186:7778) with 6 cores
14/08/07 17:18:07 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/12 on hostPort 172.22.56.186:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/10 is now RUNNING
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/11 is now RUNNING
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/12 is now RUNNING
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/8 is now EXITED (Command exited with code 1)
14/08/07 17:18:07 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/8 removed: Command exited with code 1
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/13 on worker-20140807155724-172.23.64.98-7778 (172.23.64.98:7778) with 6 cores
14/08/07 17:18:07 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/13 on hostPort 172.23.64.98:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/13 is now RUNNING
14/08/07 17:18:08 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/9 is now EXITED (Command exited with code 1)
14/08/07 17:18:08 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/9 removed: Command exited with code 1
14/08/07 17:18:08 ERROR SparkDeploySchedulerBackend: Application has been killed.
Reason: Master removed our application: FAILED
14/08/07 17:18:08 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
14/08/07 17:18:08 INFO TaskSchedulerImpl: Cancelling stage 0
14/08/07 17:18:08 INFO DAGScheduler: Failed to run count at <console>:15
14/08/07 17:18:08 INFO SparkUI: Stopped Spark web UI at http://redis-1-prod.adyoulike.net:4040
14/08/07 17:18:08 INFO DAGScheduler: Stopping DAGScheduler
14/08/07 17:18:08 INFO SparkDeploySchedulerBackend: Shutting down all executors
14/08/07 17:18:08 INFO SparkDeploySchedulerBackend: Asking each executor to shut down
org.apache.spark.SparkException: Job aborted due to stage failure: Master removed our application: FAILED
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1049)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1033)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1031)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1031)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:635)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:635)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:635)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1234)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

scala> 14/08/07 17:18:18 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

scala>
scala> 14/08/07 17:18:33 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:18:38 INFO AppClient: Stop request to Master timed out; it may already be shut down.
14/08/07 17:18:39 INFO MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
14/08/07 17:18:39 INFO ConnectionManager: Selector thread was interrupted!
14/08/07 17:18:39 INFO ConnectionManager: ConnectionManager stopped
14/08/07 17:18:39 INFO MemoryStore: MemoryStore cleared
14/08/07 17:18:39 INFO BlockManager: BlockManager stopped
14/08/07 17:18:39 INFO BlockManagerMasterActor: Stopping BlockManagerMaster
14/08/07 17:18:39 INFO BlockManagerMaster: BlockManagerMaster stopped
14/08/07 17:18:39 INFO SparkContext: Successfully stopped SparkContext
14/08/07 17:18:39 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
14/08/07 17:18:39 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
14/08/07 17:18:39 INFO Remoting: Remoting shut down
14/08/07 17:18:39 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
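One thing worth noting about the spark-shell command shown above: as pasted, `--master` has no URL argument, so the next token (`--driver-cores`) would be taken as the master address. Presumably the real invocation did point at the standalone master; a sketch of what that would look like (the `<master-host>` placeholder is mine, and 7077 is only the default standalone master port, which may differ on this cluster):

```shell
# Hypothetical corrected invocation: the standalone master URL must
# immediately follow --master. <master-host> is a placeholder, and 7077
# is the default port for a standalone Spark master.
spark-shell --master spark://<master-host>:7077 \
  --driver-cores 6 \
  --executor-memory 16g
```

If the URL is wrong or unreachable, the shell can still start while no executor ever registers, which matches the "has not accepted any resources" warnings.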
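In the log, every executor exits with code 1 before running a single task, and the scheduler warning is only a symptom of that. The actual failure reason should be in the executor stderr files on the worker machines; in standalone mode each executor writes its output under the worker's work directory. A diagnostic sketch, assuming the default $SPARK_HOME/work layout on the workers (the application ID is taken from the log above):

```shell
# On each worker node: standalone-mode executor logs live under the
# worker's work directory, one subdirectory per executor attempt
# (default layout: $SPARK_HOME/work/<app-id>/<executor-id>/stderr).
tail -n 50 "$SPARK_HOME"/work/app-20140807171444-0002/*/stderr
```

Common causes of an immediate exit-code-1 loop like this are an executor memory setting the worker cannot satisfy, a classpath or Java problem on the workers, or hostname/IP resolution differing between driver and workers.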
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Initial-job-has-not-accepted-any-resources-tp11668.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.