Ok, so I've been able to narrow down the problem to this specific case:

def toCsv(userTuple: String) = { "a,b,c" }
val dataTemp = Array("line1", "line2")
val dataTempDist = sc.parallelize(dataTemp)
val usersFormatted = dataTempDist.map(toCsv)
usersFormatted.saveAsTextFile("hdfs://" + masterDomain + ":9000/user/root/" + "test_dir")
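For reference, here is the same pipeline packaged as a self-contained application object — a sketch only, assuming the job is submitted as a standalone app (the object name GraphTest is taken from the original message; the run signature is hypothetical). Defining toCsv as a function value rather than a method is one commonly suggested workaround, since a method defined on an object can pull a reference to the enclosing object into the serialized closure; this kind of ClassCastException can also indicate that the application jar on the workers does not match the one on the driver.

```scala
import org.apache.spark.SparkContext

object GraphTest {
  // A function value instead of a def: serializing this closure does not
  // require a reference to the enclosing GraphTest object.
  val toCsv: String => String = userTuple => "a,b,c"

  // Hypothetical entry point; assumes sc and masterDomain are provided
  // by the surrounding application setup.
  def run(sc: SparkContext, masterDomain: String): Unit = {
    val dataTemp = Array("line1", "line2")
    val dataTempDist = sc.parallelize(dataTemp)
    val usersFormatted = dataTempDist.map(toCsv)
    usersFormatted.saveAsTextFile(
      "hdfs://" + masterDomain + ":9000/user/root/" + "test_dir")
  }
}
```

Whether this helps will depend on how the app is actually packaged and deployed, so treat it as a starting point rather than a confirmed fix.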
Even this simple mapping gives me a java.lang.ClassCastException. Sorry, my knowledge of Scala is very rudimentary.

Thanks,
Niko

On Tue, Mar 25, 2014 at 5:55 PM, Niko Stahl <r.niko.st...@gmail.com> wrote:
> Hi,
>
> I'm trying to save an RDD to HDFS with the saveAsTextFile method on my ec2
> cluster and am encountering the following exception (the app is called
> GraphTest):
>
> Exception failure: java.lang.ClassCastException: cannot assign instance of
> GraphTest$$anonfun$3 to field org.apache.spark.rdd.MappedRDD.f of type
> scala.Function1 in instance of org.apache.spark.rdd.MappedRDD
>
> The RDD is simply a list of strings. Strangely enough, the same sequence of
> commands, when executed in the Spark shell, does not cause the above error.
> Any thoughts on what might be going on here?
>
> Thanks,
> Niko