Thanks guys, that's interesting. Though it looks like singleton object is defined at driver, spark actually will serialize closure and send to workers. The interesting thing is that ScriptEngine is NOT serializable, but till it hasn't been initialized spark can serialize the closure well. But if I force it initialize first then spark throws NotSerializeableException.
Anyway, following Christopher's suggestion to avoid reference to outside closure is better. TD, do you mean that Executors share the same SerializerInstance and there is a case that more than 1 thread call the same closure instance? -Bao. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Stateful-RDD-tp71p97.html Sent from the Apache Spark User List mailing list archive at Nabble.com.
