You are getting a NullPointerException, and that is why the job fails. The fact that it works in local mode suggests you may be overlooking that classes initialized in your driver (master) JVM are not necessarily initialized on the worker/executor nodes. As a quick check: does your code still work when you set the master to local[n] instead of local?
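For example, reusing the launch command from your message below, the only change needed for that check is the final argument (the master URL); `local[2]` here is just an arbitrary choice of two worker threads:

```shell
# Same invocation as before, but with the master set to local[2]
# instead of the standalone master URL spark://niko-VirtualBox:7077
java -cp \
  ./target/classes:/etc/hadoop/conf:$SPARK_HOME/conf:$SPARK_HOME/assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.0.0-mr1-cdh4.5.0.jar \
  SparkTest 'local[2]'
```

If this runs cleanly while the spark:// URL fails, the problem is on the worker side (classpath or initialization), not in your job logic.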
On Tue, Jan 14, 2014 at 7:39 PM, vuakko <[email protected]> wrote:
> Spark fails to run practically any standalone mode jobs sent to it. The local
> mode works and spark-shell works even in standalone, but sending any other
> jobs manually fails with worker posting the following error:
>
> 2014-01-14 15:47:05,073 [sparkWorker-akka.actor.default-dispatcher-5] INFO org.apache.spark.deploy.worker.Worker - Connecting to master spark://niko-VirtualBox:7077...
> 2014-01-14 15:47:05,715 [sparkWorker-akka.actor.default-dispatcher-2] INFO org.apache.spark.deploy.worker.Worker - Successfully registered with master spark://niko-VirtualBox:7077
> 2014-01-14 15:47:23,408 [sparkWorker-akka.actor.default-dispatcher-14] INFO org.apache.spark.deploy.worker.Worker - Asked to launch executor app-20140114154723-0000/0 for Spark test
> 2014-01-14 15:47:23,431 [sparkWorker-akka.actor.default-dispatcher-14] ERROR akka.actor.OneForOneStrategy -
> java.lang.NullPointerException
>         at java.io.File.<init>(File.java:251)
>         at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:213)
>         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>         at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>         at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>         at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>         at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>         at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>         at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>         at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 2014-01-14 15:47:23,514 [sparkWorker-akka.actor.default-dispatcher-14] INFO org.apache.spark.deploy.worker.Worker - Starting Spark worker niko-VirtualBox.local:33576 with 1 cores, 6.8 GB RAM
> 2014-01-14 15:47:23,514 [sparkWorker-akka.actor.default-dispatcher-14] INFO org.apache.spark.deploy.worker.Worker - Spark home: /home/niko/local/incubator-spark
> 2014-01-14 15:47:23,517 [sparkWorker-akka.actor.default-dispatcher-14] INFO org.apache.spark.deploy.worker.ui.WorkerWebUI - Started Worker web UI at http://niko-VirtualBox.local:8081
> 2014-01-14 15:47:23,517 [sparkWorker-akka.actor.default-dispatcher-14] INFO org.apache.spark.deploy.worker.Worker - Connecting to master spark://niko-VirtualBox:7077...
> 2014-01-14 15:47:23,528 [sparkWorker-akka.actor.default-dispatcher-3] INFO org.apache.spark.deploy.worker.Worker - Successfully registered with master spark://niko-VirtualBox:7077
>
> Master spits out the following logs at the same time:
>
> 2014-01-14 15:47:05,683 [sparkMaster-akka.actor.default-dispatcher-4] INFO org.apache.spark.deploy.master.Master - Registering worker niko-VirtualBox:33576 with 1 cores, 6.8 GB RAM
> 2014-01-14 15:47:23,090 [sparkMaster-akka.actor.default-dispatcher-15] INFO org.apache.spark.deploy.master.Master - Registering app Spark test
> 2014-01-14 15:47:23,102 [sparkMaster-akka.actor.default-dispatcher-15] INFO org.apache.spark.deploy.master.Master - Registered app Spark test with ID app-20140114154723-0000
> 2014-01-14 15:47:23,216 [sparkMaster-akka.actor.default-dispatcher-15] INFO org.apache.spark.deploy.master.Master - Launching executor app-20140114154723-0000/0 on worker worker-20140114154704-niko-VirtualBox.local-33576
> 2014-01-14 15:47:23,523 [sparkMaster-akka.actor.default-dispatcher-15] INFO org.apache.spark.deploy.master.Master - Registering worker niko-VirtualBox:33576 with 1 cores, 6.8 GB RAM
> 2014-01-14 15:47:23,525 [sparkMaster-akka.actor.default-dispatcher-15] INFO org.apache.spark.deploy.master.Master - Attempted to re-register worker at same address: akka.tcp://[email protected]:33576
> 2014-01-14 15:47:23,535 [sparkMaster-akka.actor.default-dispatcher-14] WARN org.apache.spark.deploy.master.Master - Got heartbeat from unregistered worker worker-20140114154723-niko-VirtualBox.local-33576
> ...
>
> Soon after this the master decides that the worker is dead, disassociates it
> and marks it DEAD in the web UI. The worker process however is still alive
> and still thinks that it's connected to master (as shown by the log).
>
> I'm launching the job with the following command (last argument is the
> master, replacing local there makes things run ok):
> java -cp ./target/classes:/etc/hadoop/conf:$SPARK_HOME/conf:$SPARK_HOME/assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop2.0.0-mr1-cdh4.5.0.jar SparkTest spark://niko-VirtualBox:7077
>
> Relevant versions are:
> Spark: current git HEAD fa75e5e1c50da7d1e6c6f41c2d6d591c1e8a025f
> Hadoop: 2.0.0-mr1-cdh4.5.0
> Scala: 2.10.3
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Akka-error-kills-workers-in-standalone-mode-tp537.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
