The problem is that at compile time, sc.objectFile() has no idea what type
of objects it's loading.  Note the type of loadedFile:

loadedFile: org.apache.spark.rdd.RDD[Nothing]

that "Nothing" basically means the scala compiler has no idea what the type
of objects in the RDD are.
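
This isn't specific to Spark -- the compiler falls back to Nothing whenever
nothing in the arguments pins down a type parameter.  For example, in the
plain Scala repl (load here is just a toy stand-in):

scala> val xs = List()
xs: List[Nothing] = List()

scala> def load[T](path: String): Seq[T] = Seq.empty
load: [T](path: String)Seq[T]

scala> val ys = load("seqFile")
ys: Seq[Nothing] = List()

sc.objectFile is in the same situation: the path argument says nothing
about the element type.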

So when you call first, at runtime the JVM sees that the RDD actually
contains java.lang.Integer, and it throws an exception because you can't
cast java.lang.Integer to "Nothing".

The solution is to pass a type parameter to sc.objectFile:

val loadedFile = sc.objectFile[Int](...)
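
e.g., with the code from your message, the round trip becomes:

val test = sc.parallelize(List(1, 2, 3))
test.saveAsObjectFile("seqFile")

val loadedFile = sc.objectFile[Int]("seqFile")
// loadedFile: org.apache.spark.rdd.RDD[Int]

loadedFile.first
// res0: Int = 1  (which element is "first" can depend on partitioning)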


I wonder if Spark should make the compiler reject Nothing here, using
something like this:
http://blog.evilmonkeylabs.com/2012/05/31/Forcing_Compiler_Nothing_checks/

Unfortunately, those compiler error messages might be just as confusing as
the ClassCastException, so I don't know whether it would really help ...
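
For reference, the trick in that post boils down to something like this (a
rough sketch, not actual Spark code -- NotNothing and this objectFile are
made up for illustration):

import scala.annotation.implicitNotFound

@implicitNotFound("No type parameter given -- call e.g. objectFile[Int](path)")
sealed trait NotNothing[T]

object NotNothing {
  // available for every T ...
  implicit def anyType[T]: NotNothing[T] = new NotNothing[T] {}
  // ... but two competing instances for Nothing make implicit resolution
  // ambiguous exactly when T is inferred as Nothing
  implicit val nothing1: NotNothing[Nothing] = new NotNothing[Nothing] {}
  implicit val nothing2: NotNothing[Nothing] = new NotNothing[Nothing] {}
}

// stand-in for sc.objectFile with the extra evidence parameter
def objectFile[T: NotNothing](path: String): Seq[T] = Seq.empty[T]

objectFile[Int]("seqFile")  // compiles
// objectFile("seqFile")    // compile error: ambiguous implicits

Note the failure shows up as an "ambiguous implicit values" error rather
than the @implicitNotFound message, which is part of why I suspect it
wouldn't read much better than the ClassCastException.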



On Sat, Feb 1, 2014 at 3:46 PM, zhen <[email protected]> wrote:

> I am having trouble loading object files correctly.
> For example:
>
> val test = sc.parallelize(List(1, 2, 3))
> test.saveAsObjectFile("seqFile")
> val loadedFile = sc.objectFile("seqFile")
> loadedFile: org.apache.spark.rdd.RDD[Nothing] = FlatMappedRDD[4] at
> objectFile at <console>:12
>
> Then if I do the following:
>
> loadedFile.first
>
> The output is:
> java.lang.ClassCastException: java.lang.Integer cannot be cast to
> scala.runtime.Nothing$
>         at <init>(<console>:15)
>         at <init>(<console>:20)
>         at <init>(<console>:22)
>         at <init>(<console>:24)
>         at <init>(<console>:26)
>         at .<init>(<console>:30)
>         at .<clinit>(<console>)
>         at .<init>(<console>:11)
>         at .<clinit>(<console>)
>         at $export(<console>)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:629)
>         at org.apache.spark.repl.SparkIMain$Request$$anonfun$10.apply(SparkIMain.scala:897)
>         at scala.tools.nsc.interpreter.Line$$anonfun$1.apply$mcV$sp(Line.scala:43)
>         at scala.tools.nsc.io.package$$anon$2.run(package.scala:25)
>         at java.lang.Thread.run(Thread.java:662)
>
> It seems Spark is not recognising the loaded objects as integers.
> I am running Spark 0.8.1 on the latest Cloudera QuickStart VM.
