Hi, I'm trying something simple: creating an RDD from a ~3 GB text file located on GlusterFS, which is mounted on all Spark cluster machines, and calling rdd.count(). But Spark never manages to complete the job, and keeps printing messages like the following:

WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
I run a standalone Spark cluster with 1 master node and 5 worker nodes; the workers are 12-core, 64 GB machines, and I allocated 6 cores and 32 GB to each Spark slave (1 per slave machine). I run spark-shell with the following command:

spark-shell --master --driver-cores 6 --executor-memory 16g

Here is my Spark shell session:

scala> val f = sc.textFile("/mnt/backups/stats.json")
14/08/07 17:15:05 INFO MemoryStore: ensureFreeSpace(138763) called with curMem=0, maxMem=309225062
14/08/07 17:15:05 INFO MemoryStore: Block broadcast_0 stored as values to memory (estimated size 135.5 KB, free 294.8 MB)
f: org.apache.spark.rdd.RDD[String] = MappedRDD[1] at textFile at <console>:12

scala> f.count()
14/08/07 17:15:18 INFO FileInputFormat: Total input paths to process : 1
14/08/07 17:15:18 INFO SparkContext: Starting job: count at <console>:15
14/08/07 17:15:18 INFO DAGScheduler: Got job 0 (count at <console>:15) with 38 output partitions (allowLocal=false)
14/08/07 17:15:18 INFO DAGScheduler: Final stage: Stage 0(count at <console>:15)
14/08/07 17:15:18 INFO DAGScheduler: Parents of final stage: List()
14/08/07 17:15:18 INFO DAGScheduler: Missing parents: List()
14/08/07 17:15:18 INFO DAGScheduler: Submitting Stage 0 (MappedRDD[1] at textFile at <console>:12), which has no missing parents
14/08/07 17:15:18 INFO DAGScheduler: Submitting 38 missing tasks from Stage 0 (MappedRDD[1] at textFile at <console>:12)
14/08/07 17:15:18 INFO TaskSchedulerImpl: Adding task set 0.0 with 38 tasks
14/08/07 17:15:33 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:15:48 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:16:03 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:16:18 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/4 is now EXITED (Command exited with code 1)
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/4 removed: Command exited with code 1
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/5 on worker-20140807155724-172.18.31.153-7778 (172.18.31.153:7778) with 6 cores
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/5 on hostPort 172.18.31.153:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/0 is now EXITED (Command exited with code 1)
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/0 removed: Command exited with code 1
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/6 on worker-20140807155724-172.22.56.186-7778 (172.22.56.186:7778) with 6 cores
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/6 on hostPort 172.22.56.186:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/1 is now EXITED (Command exited with code 1)
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/1 removed: Command exited with code 1
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/7 on worker-20140807155724-172.28.173.218-7778 (172.28.173.218:7778) with 6 cores
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/7 on hostPort 172.28.173.218:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/5 is now RUNNING
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/6 is now RUNNING
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/3 is now EXITED (Command exited with code 1)
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/3 removed: Command exited with code 1
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/8 on worker-20140807155724-172.23.64.98-7778 (172.23.64.98:7778) with 6 cores
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/8 on hostPort 172.23.64.98:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/7 is now RUNNING
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/8 is now RUNNING
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/2 is now EXITED (Command exited with code 1)
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/2 removed: Command exited with code 1
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/9 on worker-20140807155724-172.29.166.84-7778 (172.29.166.84:7778) with 6 cores
14/08/07 17:16:26 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/9 on hostPort 172.29.166.84:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:16:26 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/9 is now RUNNING
14/08/07 17:16:33 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:16:48 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:17:03 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:17:18 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:17:33 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:17:48 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:18:03 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/5 is now EXITED (Command exited with code 1)
14/08/07 17:18:07 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/5 removed: Command exited with code 1
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/10 on worker-20140807155724-172.18.31.153-7778 (172.18.31.153:7778) with 6 cores
14/08/07 17:18:07 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/10 on hostPort 172.18.31.153:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/7 is now EXITED (Command exited with code 1)
14/08/07 17:18:07 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/7 removed: Command exited with code 1
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/11 on worker-20140807155724-172.28.173.218-7778 (172.28.173.218:7778) with 6 cores
14/08/07 17:18:07 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/11 on hostPort 172.28.173.218:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/6 is now EXITED (Command exited with code 1)
14/08/07 17:18:07 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/6 removed: Command exited with code 1
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/12 on worker-20140807155724-172.22.56.186-7778 (172.22.56.186:7778) with 6 cores
14/08/07 17:18:07 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/12 on hostPort 172.22.56.186:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/10 is now RUNNING
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/11 is now RUNNING
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/12 is now RUNNING
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/8 is now EXITED (Command exited with code 1)
14/08/07 17:18:07 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/8 removed: Command exited with code 1
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor added: app-20140807171444-0002/13 on worker-20140807155724-172.23.64.98-7778 (172.23.64.98:7778) with 6 cores
14/08/07 17:18:07 INFO SparkDeploySchedulerBackend: Granted executor ID app-20140807171444-0002/13 on hostPort 172.23.64.98:7778 with 6 cores, 16.0 GB RAM
14/08/07 17:18:07 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/13 is now RUNNING
14/08/07 17:18:08 INFO AppClient$ClientActor: Executor updated: app-20140807171444-0002/9 is now EXITED (Command exited with code 1)
14/08/07 17:18:08 INFO SparkDeploySchedulerBackend: Executor app-20140807171444-0002/9 removed: Command exited with code 1
14/08/07 17:18:08 ERROR SparkDeploySchedulerBackend: Application has been killed.
Reason: Master removed our application: FAILED
14/08/07 17:18:08 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
14/08/07 17:18:08 INFO TaskSchedulerImpl: Cancelling stage 0
14/08/07 17:18:08 INFO DAGScheduler: Failed to run count at <console>:15
14/08/07 17:18:08 INFO SparkUI: Stopped Spark web UI at http://redis-1-prod.adyoulike.net:4040
14/08/07 17:18:08 INFO DAGScheduler: Stopping DAGScheduler
14/08/07 17:18:08 INFO SparkDeploySchedulerBackend: Shutting down all executors
14/08/07 17:18:08 INFO SparkDeploySchedulerBackend: Asking each executor to shut down
org.apache.spark.SparkException: Job aborted due to stage failure: Master removed our application: FAILED
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1049)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1033)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1031)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1031)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:635)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:635)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:635)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1234)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

scala> 14/08/07 17:18:18 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

scala>
scala> 14/08/07 17:18:33 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/07 17:18:38 INFO AppClient: Stop request to Master timed out; it may already be shut down.
14/08/07 17:18:39 INFO MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
14/08/07 17:18:39 INFO ConnectionManager: Selector thread was interrupted!
14/08/07 17:18:39 INFO ConnectionManager: ConnectionManager stopped
14/08/07 17:18:39 INFO MemoryStore: MemoryStore cleared
14/08/07 17:18:39 INFO BlockManager: BlockManager stopped
14/08/07 17:18:39 INFO BlockManagerMasterActor: Stopping BlockManagerMaster
14/08/07 17:18:39 INFO BlockManagerMaster: BlockManagerMaster stopped
14/08/07 17:18:39 INFO SparkContext: Successfully stopped SparkContext
14/08/07 17:18:39 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
14/08/07 17:18:39 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
14/08/07 17:18:39 INFO Remoting: Remoting shut down
14/08/07 17:18:39 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
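One thing worth noting about the spark-shell command shown above: as pasted, `--master` has no URL argument, so the next token (`--driver-cores`) would be taken as the master address. Presumably the real invocation did point at the standalone master; a sketch of what that would look like (the `<master-host>` placeholder is mine, and 7077 is only the default standalone master port, which may differ on this cluster):

```shell
# Hypothetical corrected invocation: the standalone master URL must
# immediately follow --master. <master-host> is a placeholder, and 7077
# is the default port for a standalone Spark master.
spark-shell --master spark://<master-host>:7077 \
  --driver-cores 6 \
  --executor-memory 16g
```

If the URL is wrong or unreachable, the shell can still start while no executor ever registers, which matches the "has not accepted any resources" warnings.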
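In the log, every executor exits with code 1 before running a single task, and the scheduler warning is only a symptom of that. The actual failure reason should be in the executor stderr files on the worker machines; in standalone mode each executor writes its output under the worker's work directory. A diagnostic sketch, assuming the default $SPARK_HOME/work layout on the workers (the application ID is taken from the log above):

```shell
# On each worker node: standalone-mode executor logs live under the
# worker's work directory, one subdirectory per executor attempt
# (default layout: $SPARK_HOME/work/<app-id>/<executor-id>/stderr).
tail -n 50 "$SPARK_HOME"/work/app-20140807171444-0002/*/stderr
```

Common causes of an immediate exit-code-1 loop like this are an executor memory setting the worker cannot satisfy, a classpath or Java problem on the workers, or hostname/IP resolution differing between driver and workers.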
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Initial-job-has-not-accepted-any-resources-tp11668.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.