Hi,

Probably I am missing a very simple principle, but something is wrong with my filter: I get an "org.apache.spark.SparkException: Task not serializable" exception.
Here is my filter function:

object OBJ {
  def f1(): Boolean = {
    var i = 1
    for (j <- 1 to 10) i = i + 1
    true
  }
}

rdd.filter(row => OBJ.f1())

And when I run it, I get the following exception:

org.apache.spark.SparkException: Task not serializable
    at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:166)
    at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:158)
    at org.apache.spark.SparkContext.clean(SparkContext.scala:1242)
    at org.apache.spark.rdd.RDD.filter(RDD.scala:282)
    .......
Caused by: java.io.NotSerializableException: org.apache.spark.SparkConf
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1183)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
...........

best,
/Shahab
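PS: Since the "Caused by" line names org.apache.spark.SparkConf rather than OBJ, I suspect the closure is dragging in an enclosing class that holds the SparkConf (e.g. if OBJ or the filter call sits inside a driver class). A plain-JVM sketch of that failure mode, with hypothetical names (FakeConf, Driver, SerDemo) standing in for SparkConf and my driver code:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object SerDemo {
  // Stand-in for SparkConf, which does not implement java.io.Serializable
  // (hypothetical class, for illustration only).
  class FakeConf

  // Hypothetical driver-side class holding the conf. A closure that refers
  // to a member of this class captures `this`, so Java serialization tries
  // to write the whole Driver instance, FakeConf field included.
  class Driver extends Serializable {
    val conf = new FakeConf
    def predicate: Int => Boolean = x => conf != null && x > 0
  }

  // Mimics what ClosureCleaner.ensureSerializable does: attempt Java
  // serialization and report whether it succeeds.
  def canSerialize(obj: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream).writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    // Closure capturing the Driver (and its FakeConf) fails to serialize,
    // mirroring the NotSerializableException in the stack trace above.
    println(canSerialize(new Driver().predicate))

    // A closure that captures no outer state serializes fine.
    println(canSerialize((x: Int) => x > 0))
  }
}
```

If that is the cause, moving the filter logic into a true top-level object (so the closure captures nothing from the driver class) should avoid serializing the SparkConf.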