Heya,

Yep, this is a problem in the Mesos scheduler implementation (MesosSchedulerBackend) that was fixed after 0.9.0; see https://spark-project.atlassian.net/browse/SPARK-1052
So several options, like applying the patch, upgrading to 0.9.1 :-/

Cheers,
Andy

On Wed, Apr 2, 2014 at 5:30 PM, Leon Zhang <leonca...@gmail.com> wrote:
> Hi, Spark Devs:
>
> I encounter a problem which shows error message "akka.actor.ActorNotFound"
> on our mesos mini-cluster.
>
> mesos : 0.17.0
> spark : spark-0.9.0-incubating
>
> spark-env.sh:
> #!/usr/bin/env bash
>
> export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
> export SPARK_EXECUTOR_URI=hdfs://192.168.1.20/tmp/spark-0.9.0-incubating-hadoop_2.0.0-cdh4.6.0-bin.tar.gz
> export MASTER=zk://192.168.1.20:2181/mesos
> export SPARK_JAVA_OPTS="-Dspark.driver.port=17077"
>
> And the logs from each slave looks like:
>
> 14/04/02 15:14:37 INFO MesosExecutorBackend: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
> 14/04/02 15:14:37 INFO MesosExecutorBackend: Registered with Mesos as executor ID 201403301937-335653056-5050-991-1
> 14/04/02 15:14:38 INFO Slf4jLogger: Slf4jLogger started
> 14/04/02 15:14:38 INFO Remoting: Starting remoting
> 14/04/02 15:14:38 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark@zetyun-cloud3:42218]
> 14/04/02 15:14:38 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark@zetyun-cloud3:42218]
> 14/04/02 15:14:38 INFO SparkEnv: Connecting to BlockManagerMaster: akka.tcp://spark@localhost:17077/user/BlockManagerMaster
> akka.actor.ActorNotFound: Actor not found for: ActorSelection[Actor[akka.tcp://spark@localhost:17077/]/user/BlockManagerMaster]
>     at akka.actor.ActorSelection$anonfun$resolveOne$1.apply(ActorSelection.scala:66)
>     at akka.actor.ActorSelection$anonfun$resolveOne$1.apply(ActorSelection.scala:64)
>     at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
>     at akka.dispatch.BatchingExecutor$Batch$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67)
>     at akka.dispatch.BatchingExecutor$Batch$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82)
>     at akka.dispatch.BatchingExecutor$Batch$anonfun$run$1.apply(BatchingExecutor.scala:59)
>     at akka.dispatch.BatchingExecutor$Batch$anonfun$run$1.apply(BatchingExecutor.scala:59)
>     at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
>     at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58)
>     at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.unbatchedExecute(Future.scala:74)
>     at akka.dispatch.BatchingExecutor$class.execute(BatchingExecutor.scala:110)
>     at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.execute(Future.scala:73)
>     at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
>     at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
>     at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:269)
>     at akka.actor.EmptyLocalActorRef.specialHandle(ActorRef.scala:512)
>     at akka.actor.DeadLetterActorRef.specialHandle(ActorRef.scala:545)
>     at akka.actor.DeadLetterActorRef.$bang(ActorRef.scala:535)
>     at akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef.$bang(RemoteActorRefProvider.scala:91)
>     at akka.actor.ActorRef.tell(ActorRef.scala:125)
>     at akka.dispatch.Mailboxes$anon$1$anon$2.enqueue(Mailboxes.scala:44)
>     at akka.dispatch.QueueBasedMessageQueue$class.cleanUp(Mailbox.scala:438)
>     at akka.dispatch.UnboundedDequeBasedMailbox$MessageQueue.cleanUp(Mailbox.scala:650)
>     at akka.dispatch.Mailbox.cleanUp(Mailbox.scala:309)
>     at akka.dispatch.MessageDispatcher.unregister(AbstractDispatcher.scala:204)
>     at akka.dispatch.MessageDispatcher.detach(AbstractDispatcher.scala:140)
>     at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$finishTerminate(FaultHandling.scala:203)
>     at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:163)
>     at akka.actor.ActorCell.terminate(ActorCell.scala:338)
>     at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:431)
>     at akka.actor.ActorCell.systemInvoke(ActorCell.scala:447)
>     at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:262)
>     at akka.dispatch.Mailbox.run(Mailbox.scala:218)
>     at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Exception in thread "Thread-0"
>
> Any clue for this problem?
>
> Thanks in advance.
>
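
Side note for anyone reading this thread later: the telling detail in the quoted log is that the executor on zetyun-cloud3 tries to reach BlockManagerMaster at spark@localhost:17077, which can only work if the driver happens to run on that same machine. Below is a minimal Python sketch for spotting that symptom in an executor log. The helper name `find_loopback_driver` and the loopback host list are made up for illustration and are not part of Spark or Mesos; the two sample lines are copied from the log above.

```python
import re

# Matches the host and port in an Akka remote address such as
# akka.tcp://spark@localhost:17077/user/BlockManagerMaster
AKKA_ADDR = re.compile(r"akka\.tcp://\w+@([\w.\-]+):(\d+)")

# Hosts that only make sense when driver and executor share a machine
# (assumption: this short list is enough for a quick check).
LOOPBACK_HOSTS = {"localhost", "127.0.0.1"}


def find_loopback_driver(log_lines):
    """Return (host, port) of the first 'Connecting to BlockManagerMaster'
    target that points at a loopback host, or None if there is none."""
    for line in log_lines:
        if "Connecting to BlockManagerMaster" not in line:
            continue
        match = AKKA_ADDR.search(line)
        if match and match.group(1) in LOOPBACK_HOSTS:
            return match.group(1), int(match.group(2))
    return None


sample = [
    "14/04/02 15:14:38 INFO Remoting: Remoting now listens on addresses: "
    "[akka.tcp://spark@zetyun-cloud3:42218]",
    "14/04/02 15:14:38 INFO SparkEnv: Connecting to BlockManagerMaster: "
    "akka.tcp://spark@localhost:17077/user/BlockManagerMaster",
]
print(find_loopback_driver(sample))
```

If this returns a loopback host on a multi-node cluster, the executor was most likely handed the wrong driver address, which is consistent with the scheduler issue referenced in SPARK-1052.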