The problem is that at compile time, sc.objectFile() has no idea what type of objects it's loading. Note the type of loadedFile:

    loadedFile: org.apache.spark.rdd.RDD[Nothing]

That "Nothing" means the Scala compiler has no idea what the type of the objects in the RDD is. So when you call first, at runtime the JVM sees that the RDD actually contains java.lang.Integer, and throws an exception because java.lang.Integer can't be cast to "Nothing".

The solution is to pass a type parameter to sc.objectFile:

    val loadedFile = sc.objectFile[Int](...)
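To spell that out, here is the full round trip as I'd expect it to look in spark-shell (the variable name "typed" is mine, and "seqFile" is just the path from your example):

    val test = sc.parallelize(List(1, 2, 3))
    test.saveAsObjectFile("seqFile")

    // The explicit [Int] is the fix: without it the compiler infers
    // RDD[Nothing], and first blows up at runtime with the ClassCastException.
    val typed = sc.objectFile[Int]("seqFile")
    typed.first // should now return one of the saved Ints instead of throwing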
I wonder if Spark should make the compiler reject Nothing here, using something like this: http://blog.evilmonkeylabs.com/2012/05/31/Forcing_Compiler_Nothing_checks/

Unfortunately, those compiler error messages might be just as confusing as the ClassCastException, so I don't know whether it would actually prevent any issues.

On Sat, Feb 1, 2014 at 3:46 PM, zhen <[email protected]> wrote:

> I am having trouble loading object files correctly.
> For example:
>
> val test = sc.parallelize(List(1, 2, 3))
> test.saveAsObjectFile("seqFile")
> val loadedFile = sc.objectFile("seqFile")
> loadedFile: org.apache.spark.rdd.RDD[Nothing] = FlatMappedRDD[4] at objectFile at <console>:12
>
> Then if I do the following:
>
> loadedFile.first
>
> The output is:
>
> java.lang.ClassCastException: java.lang.Integer cannot be cast to scala.runtime.Nothing$
>         at <init>(<console>:15)
>         at <init>(<console>:20)
>         at <init>(<console>:22)
>         at <init>(<console>:24)
>         at <init>(<console>:26)
>         at .<init>(<console>:30)
>         at .<clinit>(<console>)
>         at .<init>(<console>:11)
>         at .<clinit>(<console>)
>         at $export(<console>)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:629)
>         at org.apache.spark.repl.SparkIMain$Request$$anonfun$10.apply(SparkIMain.scala:897)
>         at scala.tools.nsc.interpreter.Line$$anonfun$1.apply$mcV$sp(Line.scala:43)
>         at scala.tools.nsc.io.package$$anon$2.run(package.scala:25)
>         at java.lang.Thread.run(Thread.java:662)
>
> It seems Spark is not recognising the loaded objects as an array of integers.
> I am running Spark 0.8.1 on the latest Cloudera QuickStart VM.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/object-file-not-loading-correctly-tp1107.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
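P.S. For the curious, here is roughly what that compile-time Nothing check could look like. This is just my sketch of the general technique from the linked post (deliberately ambiguous implicits), not anything that exists in Spark; objectFileChecked and the other names are made up:

    // Evidence that T was explicitly supplied, i.e. was not inferred as Nothing.
    sealed trait NotNothing[T]

    object NotNothing {
      // For any concrete T, only this implicit applies, so resolution succeeds.
      implicit def good[T]: NotNothing[T] = new NotNothing[T] {}
      // For T = Nothing these two also apply and tie with each other, so the
      // compiler reports "ambiguous implicit values" and refuses to compile.
      implicit def amb1: NotNothing[Nothing] = new NotNothing[Nothing] {}
      implicit def amb2: NotNothing[Nothing] = new NotNothing[Nothing] {}
    }

    // Hypothetical checked wrapper; the real sc.objectFile has no such bound.
    def objectFileChecked[T: ClassManifest: NotNothing](path: String) =
      sc.objectFile[T](path)

    objectFileChecked[Int]("seqFile") // compiles
    // objectFileChecked("seqFile")   // compile error instead of RDD[Nothing]

The resulting "ambiguous implicit values" error is itself pretty cryptic, which is exactly the concern I mentioned above.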
