Hi All,

I'm running into a weird issue with my test Mesos cluster. I have a 3 master / 3 slave HA configuration. Marathon and Chronos are working as they should, and I can deploy dockerized applications to the slave nodes without issue using Marathon. I downloaded Spark 1.2 and built it from source. Standalone mode works correctly, but when I submit jobs to the Mesos cluster from Spark, it connects and shows up as a framework, yet I get "Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory". I have appended what I believe is the relevant info below and I appreciate any help with this. I've tried both coarse-grained and fine-grained mode and get the same result.
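For reference, the coarse vs. fine-grained toggle I mean is spark.mesos.coarse (fine-grained being the default in 1.2, as far as I know). A sketch of an equivalent spark-submit invocation, assuming that's the right switch:

# coarse-grained run; drop the --conf (or set it to false) for fine-grained mode
./bin/spark-submit \
  --master mesos://zk://192.0.3.11:2181,192.0.3.12:2181,192.0.3.13:2181/mesos \
  --conf spark.mesos.coarse=true \
  --class org.apache.spark.examples.SparkPi \
  examples/target/scala-2.10/spark-examples-1.2.0-hadoop1.0.4.jar 3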
-Brian

I'm running on Ubuntu Trusty 64-bit.

My spark-env.sh contains:

export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
export SPARK_EXECUTOR_URI=http://192.0.3.11:8081/spark-1.2.0.tgz
export MASTER=mesos://zk://192.0.3.11:2181,192.0.3.12:2181,192.0.3.13:2181/mesos
export SPARK_WORKER_MEMORY=512M
export SPARK_WORKER_CORES=1
export SPARK_LOCAL_IP=192.0.3.11

My Mesos master UI shows:

*Cluster*: Mesos_Cluster
*Server*: 192.0.3.12:5050
*Version*: 0.21.1
*Built*: a week ago by root
*Started*: 2 hours ago
*Elected*: 2 hours ago

*Resources*   *CPUs*   *Mem*
*Total*       3        2.9 GB
*Used*        0        0 B
*Offered*     0        0 B
*Idle*        3        2.9 GB
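One thing I'm not sure about: SPARK_WORKER_MEMORY and SPARK_WORKER_CORES are standalone-mode settings, so I assume Mesos ignores them and sizes executors from spark.executor.memory (which I believe defaults to 512m in 1.2) plus some JVM overhead, while each of my three slaves only offers roughly a third of the 2.9 GB above. A minimal sketch of pinning the executor size explicitly, with purely illustrative values:

# conf/spark-defaults.conf -- illustrative values only
spark.executor.memory   256m
# spark.cores.max is only honored in coarse-grained mode
spark.cores.max         3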
In the Spark log I see:

vagrant@master1:~/spark-1.2.0$ ./bin/run-example SparkPi 3
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/01/19 02:41:40 INFO SecurityManager: Changing view acls to: vagrant
15/01/19 02:41:40 INFO SecurityManager: Changing modify acls to: vagrant
15/01/19 02:41:40 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(vagrant); users with modify permissions: Set(vagrant)
15/01/19 02:41:41 INFO Slf4jLogger: Slf4jLogger started
15/01/19 02:41:41 INFO Remoting: Starting remoting
15/01/19 02:41:42 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@master1:56626]
15/01/19 02:41:42 INFO Utils: Successfully started service 'sparkDriver' on port 56626.
15/01/19 02:41:42 INFO SparkEnv: Registering MapOutputTracker
15/01/19 02:41:42 INFO SparkEnv: Registering BlockManagerMaster
15/01/19 02:41:42 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20150119024142-16af
15/01/19 02:41:42 INFO MemoryStore: MemoryStore started with capacity 267.3 MB
15/01/19 02:41:42 INFO HttpFileServer: HTTP File server directory is /tmp/spark-80342d7e-780f-4550-933d-adce88265322
15/01/19 02:41:42 INFO HttpServer: Starting HTTP Server
15/01/19 02:41:42 INFO Utils: Successfully started service 'HTTP file server' on port 36273.
15/01/19 02:41:43 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/01/19 02:41:43 INFO SparkUI: Started SparkUI at http://master1:4040
15/01/19 02:41:43 INFO SparkContext: Added JAR file:/home/vagrant/spark-1.2.0/examples/target/scala-2.10/spark-examples-1.2.0-hadoop1.0.4.jar at http://192.0.3.11:36273/jars/spark-examples-1.2.0-hadoop1.0.4.jar with timestamp 1421635303639
2015-01-19 02:41:44,069:19208(0x7f7da54b3700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
2015-01-19 02:41:44,070:19208(0x7f7da54b3700):ZOO_INFO@log_env@716: Client environment:host.name=master1
2015-01-19 02:41:44,070:19208(0x7f7da54b3700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
2015-01-19 02:41:44,071:19208(0x7f7da54b3700):ZOO_INFO@log_env@724: Client environment:os.arch=3.13.0-43-generic
2015-01-19 02:41:44,071:19208(0x7f7da54b3700):ZOO_INFO@log_env@725: Client environment:os.version=#72-Ubuntu SMP Mon Dec 8 19:35:06 UTC 2014
2015-01-19 02:41:44,072:19208(0x7f7da54b3700):ZOO_INFO@log_env@733: Client environment:user.name=vagrant
2015-01-19 02:41:44,072:19208(0x7f7da54b3700):ZOO_INFO@log_env@741: Client environment:user.home=/home/vagrant
2015-01-19 02:41:44,073:19208(0x7f7da54b3700):ZOO_INFO@log_env@753: Client environment:user.dir=/home/vagrant/spark-1.2.0
2015-01-19 02:41:44,073:19208(0x7f7da54b3700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=192.0.3.11:2181,192.0.3.12:2181,192.0.3.13:2181 sessionTimeout=10000 watcher=0x7f7daa4516a0 sessionId=0 sessionPasswd=<null> context=0xcf0a60 flags=0
2015-01-19 02:41:44,077:19208(0x7f7da3cb0700):ZOO_INFO@check_events@1703: initiated connection to server [192.0.3.13:2181]
2015-01-19 02:41:44,080:19208(0x7f7da3cb0700):ZOO_INFO@check_events@1750: session establishment complete on server [192.0.3.13:2181], sessionId=0x34aff9e627f000e, negotiated timeout=10000
I0119 02:41:44.082293 19313 sched.cpp:137] Version: 0.21.1
I0119 02:41:44.088546 19315 group.cpp:313] Group process (group(1)@192.0.3.11:50317) connected to ZooKeeper
I0119 02:41:44.088948 19315 group.cpp:790] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I0119 02:41:44.089274 19315 group.cpp:385] Trying to create path '/mesos' in ZooKeeper
I0119 02:41:44.112208 19320 detector.cpp:138] Detected a new leader: (id='2')
I0119 02:41:44.113049 19315 group.cpp:659] Trying to get '/mesos/info_0000000002' in ZooKeeper
I0119 02:41:44.115067 19316 detector.cpp:433] A new leading master (UPID=[email protected]:5050) is detected
I0119 02:41:44.118728 19317 sched.cpp:234] New master detected at [email protected]:5050
I0119 02:41:44.119282 19317 sched.cpp:242] No credentials provided. Attempting to register without authentication
I0119 02:41:44.123064 19317 sched.cpp:408] Framework registered with 20150119-003609-201523392-5050-7198-0002
15/01/19 02:41:44 INFO MesosSchedulerBackend: Registered as framework ID 20150119-003609-201523392-5050-7198-0002
15/01/19 02:41:44 INFO NettyBlockTransferService: Server created on 54462
15/01/19 02:41:44 INFO BlockManagerMaster: Trying to register BlockManager
15/01/19 02:41:44 INFO BlockManagerMasterActor: Registering block manager master1:54462 with 267.3 MB RAM, BlockManagerId(<driver>, master1, 54462)
15/01/19 02:41:44 INFO BlockManagerMaster: Registered BlockManager
15/01/19 02:41:44 INFO SparkContext: Starting job: reduce at SparkPi.scala:35
15/01/19 02:41:44 INFO DAGScheduler: Got job 0 (reduce at SparkPi.scala:35) with 3 output partitions (allowLocal=false)
15/01/19 02:41:44 INFO DAGScheduler: Final stage: Stage 0(reduce at SparkPi.scala:35)
15/01/19 02:41:44 INFO DAGScheduler: Parents of final stage: List()
15/01/19 02:41:44 INFO DAGScheduler: Missing parents: List()
15/01/19 02:41:44 INFO DAGScheduler: Submitting Stage 0 (MappedRDD[1] at map at SparkPi.scala:31), which has no missing parents
15/01/19 02:41:45 INFO MemoryStore: ensureFreeSpace(1728) called with curMem=0, maxMem=280248975
15/01/19 02:41:45 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1728.0 B, free 267.3 MB)
15/01/19 02:41:45 INFO MemoryStore: ensureFreeSpace(1235) called with curMem=1728, maxMem=280248975
15/01/19 02:41:45 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1235.0 B, free 267.3 MB)
15/01/19 02:41:45 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on master1:54462 (size: 1235.0 B, free: 267.3 MB)
15/01/19 02:41:45 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
15/01/19 02:41:45 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:838
15/01/19 02:41:45 INFO DAGScheduler: Submitting 3 missing tasks from Stage 0 (MappedRDD[1] at map at SparkPi.scala:31)
15/01/19 02:41:45 INFO TaskSchedulerImpl: Adding task set 0.0 with 3 tasks
15/01/19 02:42:00 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

That last WARN line just keeps repeating from there on.

I have verified that http://192.0.3.11:8081/spark-1.2.0.tgz is accessible from all the slave nodes.
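By "accessible" I mean a plain HTTP fetch from each slave succeeds, i.e. a check along these lines (sketch only):

# run on each slave; should print HTTP/1.1 200 OK if the executor URI is fetchable
curl -sI http://192.0.3.11:8081/spark-1.2.0.tgz | head -n 1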
*My Spark environment variables list*

*Environment*

*Runtime Information*
Java Home: /usr/lib/jvm/java-7-openjdk-amd64/jre
Java Version: 1.7.0_65 (Oracle Corporation)
Scala Version: version 2.10.4

*Spark Properties*
spark.app.id: 20150119-003609-201523392-5050-7198-0005
spark.app.name: Spark Pi
spark.driver.host: master1
spark.driver.port: 46107
spark.executor.id: driver
spark.fileserver.uri: http://192.0.3.11:55424
spark.jars: file:/home/vagrant/spark-1.2.0/examples/target/scala-2.10/spark-examples-1.2.0-hadoop1.0.4.jar
spark.master: mesos://zk://192.0.3.11:2181,192.0.3.12:2181,192.0.3.13:2181/mesos
spark.scheduler.mode: FIFO
spark.tachyonStore.folderName: spark-3dffd4bb-f23b-43f7-a498-54b401dc591b

*System Properties*
SPARK_SUBMIT: true
awt.toolkit: sun.awt.X11.XToolkit
file.encoding: UTF-8
file.encoding.pkg: sun.io
file.separator: /
java.awt.graphicsenv: sun.awt.X11GraphicsEnvironment
java.awt.printerjob: sun.print.PSPrinterJob
java.class.version: 51.0
java.endorsed.dirs: /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/endorsed
java.ext.dirs: /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext:/usr/java/packages/lib/ext
java.home: /usr/lib/jvm/java-7-openjdk-amd64/jre
java.io.tmpdir: /tmp
java.library.path: /usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
java.runtime.name: OpenJDK Runtime Environment
java.runtime.version: 1.7.0_65-b32
java.specification.name: Java Platform API Specification
java.specification.vendor: Oracle Corporation
java.specification.version: 1.7
java.vendor: Oracle Corporation
java.vendor.url: http://java.oracle.com/
java.vendor.url.bug: http://bugreport.sun.com/bugreport/
java.version: 1.7.0_65
java.vm.info: mixed mode
java.vm.name: OpenJDK 64-Bit Server VM
java.vm.specification.name: Java Virtual Machine Specification
java.vm.specification.vendor: Oracle Corporation
java.vm.specification.version: 1.7
java.vm.vendor: Oracle Corporation
java.vm.version: 24.65-b04
line.separator:
os.arch: amd64
os.name: Linux
os.version: 3.13.0-43-generic
path.separator: :
sun.arch.data.model: 64
sun.boot.class.path: /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/resources.jar:/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/rt.jar:/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/sunrsasign.jar:/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/jsse.jar:/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/jce.jar:/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/charsets.jar:/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/rhino.jar:/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/jfr.jar:/usr/lib/jvm/java-7-openjdk-amd64/jre/classes
sun.boot.library.path: /usr/lib/jvm/java-7-openjdk-amd64/jre/lib/amd64
sun.cpu.endian: little
sun.cpu.isalist:
sun.io.unicode.encoding: UnicodeLittle
sun.java.command: org.apache.spark.deploy.SparkSubmit --master mesos://zk://192.0.3.11:2181,192.0.3.12:2181,192.0.3.13:2181/mesos --class org.apache.spark.examples.SparkPi /home/vagrant/spark-1.2.0/examples/target/scala-2.10/spark-examples-1.2.0-hadoop1.0.4.jar
sun.java.launcher: SUN_STANDARD
sun.jnu.encoding: UTF-8
sun.management.compiler: HotSpot 64-Bit Tiered Compilers
sun.nio.ch.bugLevel:
sun.os.patch.level: unknown
user.country: US
user.dir: /home/vagrant/spark-1.2.0
user.home: /home/vagrant
user.language: en
user.name: vagrant
user.timezone: Etc/UTC

*Classpath Entries*
/home/vagrant/spark-1.2.0/assembly/target/scala-2.10/spark-assembly-1.2.0-hadoop1.0.4.jar (System Classpath)
/home/vagrant/spark-1.2.0/conf (System Classpath)
http://192.0.3.11:55424/jars/spark-examples-1.2.0-hadoop1.0.4.jar (Added By User)

