Thank you for your reply, Bao.

Actually, I am running the code on EC2 with SBT, with 1 master and 3 slaves.

To reproduce the issue, I just created a directory with one Scala
source file and a "lib" directory containing the assembly jar.

$ sbt package run

will run the project.
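
For reference, a minimal build.sbt along these lines is enough (the name,
version, and Scala version here are inferred from the jar path used in the
code below, target/scala-2.9.3/test_2.9.3-0.1.jar):

// build.sbt -- minimal sketch; name/version inferred from the jar path
name := "test"

version := "0.1"

scalaVersion := "2.9.3"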

The Scala file is shown below:


// job.scala
import org.apache.spark.SparkContext
import scala.io.Source

object anonymous extends App {
  val nameNodeURL = Source.fromFile("/root/spark-ec2/masters").mkString.trim
  val sparkPort = 7077
  val sc = new SparkContext("spark://" + nameNodeURL + ":" + sparkPort, "Test",
    System.getenv("SPARK_HOME"), Seq("target/scala-2.9.3/test_2.9.3-0.1.jar"))
  val rdd1 = sc.parallelize(List(1, 2, 3, 4))
  val a = 1

  def run() {
    val rdd2 = rdd1.map(_ + a)
    println(rdd2.count)
  }

  run()
}


This code works, which may explain why the same thing works in the shell.
But if I change it into an object with a main method, instead of using
"extends App", like this:


// job.scala
import org.apache.spark.SparkContext
import scala.io.Source

object anonymous {
  val nameNodeURL = Source.fromFile("/root/spark-ec2/masters").mkString.trim
  val sparkPort = 7077
  val sc = new SparkContext("spark://" + nameNodeURL + ":" + sparkPort, "Test",
    System.getenv("SPARK_HOME"), Seq("target/scala-2.9.3/test_2.9.3-0.1.jar"))
  val rdd1 = sc.parallelize(List(1, 2, 3, 4))
  val a = 1

  def run() {
    val rdd2 = rdd1.map(_ + a)
    println(rdd2.count)
  }

  def main(args: Array[String]) {
    run()
  }
}


then two exceptions are thrown:

13/12/30 10:08:17 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.ExceptionInInitializerError
java.lang.ExceptionInInitializerError
        at anonymous$$anonfun$1.apply$mcII$sp(job.scala:13)
        at anonymous$$anonfun$1.apply(job.scala:13)
        at anonymous$$anonfun$1.apply(job.scala:13)
        at scala.collection.Iterator$$anon$19.next(Iterator.scala:401)
        at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:681)
        at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:677)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:758)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:758)
        at org.apache.spark.scheduler.ResultTask.run(ResultTask.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:158)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
13/12/30 10:08:17 INFO cluster.ClusterTaskSetManager: Starting task 0.0:0 as TID 2 on executor 0: ip-10-202-35-76.ec2.internal (PROCESS_LOCAL)
13/12/30 10:08:17 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:0 as 1351 bytes in 1 ms
13/12/30 10:08:17 INFO cluster.ClusterTaskSetManager: Lost TID 2 (task 0.0:0)
13/12/30 10:08:17 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.NoClassDefFoundError
java.lang.NoClassDefFoundError: Could not initialize class anonymous$
        at anonymous$$anonfun$1.apply$mcII$sp(job.scala:13)
        at anonymous$$anonfun$1.apply(job.scala:13)
        at anonymous$$anonfun$1.apply(job.scala:13)
        at scala.collection.Iterator$$anon$19.next(Iterator.scala:401)
        at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:681)
        at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:677)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:758)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:758)
        at org.apache.spark.scheduler.ResultTask.run(ResultTask.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:158)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)

The two snippets should be doing the same thing, right? Both stack traces
point into the rdd1.map(_ + a) closure (anonymous$$anonfun$1), and the second
one says the workers could not initialize the anonymous$ object. So what's
the difference? Any thoughts?
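
One variant I have not tried yet, in case it helps narrow things down:
copying the field into a local val before building the closure, so that the
serialized task would not need to reference the anonymous$ object at all:

  // Untested sketch: capture a local copy instead of the object field
  def run() {
    val localA = a                    // local val; the closure captures this,
    val rdd2 = rdd1.map(_ + localA)   // not the enclosing anonymous$ object
    println(rdd2.count)
  }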

Thank you.

Hao.


