Hello All,

I am learning that the Spark REPL performs certain imports automatically when it starts a spark-shell session, and that I have to add those imports explicitly when I need the same functionality in a Spark jar run from the command line.
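For context, here is a stripped-down sketch of the kind of standalone job I am compiling into a jar (the class, object, and field names are simplified placeholders for the real code):

import org.apache.spark.{SparkConf, SparkContext}
// spark-shell does this import automatically; a standalone jar must do it itself
import org.apache.spark.SparkContext._
import org.apache.spark.rdd.RDD

// Case class defined at the top level, not inside a method or the REPL
case class MetaData(source: String, value: String)

object MapJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("MapJob"))

    val rdd: RDD[(String, Seq[Map[String, MetaData]])] =
      sc.parallelize(Seq(
        ("docA", Seq(Map("k1" -> MetaData("docA", "v1")))),
        ("docB", Seq(Map("k2" -> MetaData("docB", "v2"))))
      ))

    println(rdd.count())          // triggers the serialization error from the jar
    rdd.saveAsTextFile(args(0))   // also triggers it

    sc.stop()
  }
}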
I keep running into a serialization error on an RDD that contains a Map. The exact type is org.apache.spark.rdd.RDD[(String, Seq[Map[String, MetaData]])], where MetaData is a case class. The serialization error is thrown when I try to write the RDD to disk with saveAsTextFile, or when I count its elements with count(). Strangely, when I run the same commands in spark-shell, I do not encounter any error.

I appreciate your help in advance.

Thanks,
Shivani

--
Software Engineer
Analytics Engineering Team @ Box
Mountain View, CA