Hi everyone,

I recently ran into an issue running a very simple Spark job on a
distributed cluster. I can run other, similar jobs on this cluster, but
for some reason the job in question simply will not execute.

The error I get when executing the job is:

14/01/25 22:08:54 WARN cluster.ClusterTaskSetManager: Loss was due to java.lang.ClassNotFoundException
java.lang.ClassNotFoundException: job$$anonfun$1
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:36)
        at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1593)
...

The error continues for some time. I can post the full context if that's
helpful. The code in the job that seems to cause this issue is: 

    val l = file.map(li => li.split(","))

When I remove this line, the job runs fine. This would be less of a concern
if the full context of that line weren't just:

    val file = spark.textFile("hdfs://<BZ2_FILE>", 10)
    val l = file.map(li => li.split(","))
    l.saveAsTextFile("hdfs://<OUTPUT_FILE>")

The file I am reading is a simple CSV file. As mentioned above, a more
complex job that uses some Spray JSON objects works fine, and in fact runs
successfully immediately before and after the simple job fails, on the same
cluster. The simple job, however, refuses to run whenever that anonymous
function is executed.
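
As a sanity check, the closure can be exercised on its own, outside Spark.
The following is only a minimal sketch: the object name `SplitCheck` and the
sample line are hypothetical and not part of the actual job. It applies the
same function body that is passed to `file.map` to a single in-memory string.

```scala
// Standalone check of the closure used in the job; no Spark dependency.
// `SplitCheck` and the sample input are hypothetical, for illustration only.
object SplitCheck {
  // Same body as the anonymous function passed to file.map in the job.
  val splitLine: String => Array[String] = li => li.split(",")

  def main(args: Array[String]): Unit = {
    val sample = "a,b,c" // hypothetical CSV line
    val fields = splitLine(sample)
    // Join the fields with a visible separator to inspect the result.
    println(fields.mkString("|")) // prints a|b|c
  }
}
```

If this behaves as expected locally, the function body itself is unlikely to
be the problem, which would point at how the compiled class `job$$anonfun$1`
reaches the workers rather than at the logic of the split.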

If it helps, both jobs are being compiled as 'fat JARs' using sbt-assembly.

Thanks for your assistance.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/ClassNotFoundException-with-simple-Spark-job-on-cluster-tp932.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
