Bao, as described, your use case doesn't require anything as involved as
custom RDDs or DStreams.
In a call like
val resultRdd = scripts.map(s => ScriptEngine.eval(s))
Spark will attempt to serialize ScriptEngine on the driver and deserialize
it on each of the workers, which works only if ScriptEngine is Serializable.
Now, if it makes no difference to you, consider instantiating ScriptEngine
within the closure itself, which obviates the need to serialize/deserialize
anything defined outside the closure.
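
As a rough sketch of what that could look like, assuming the engine comes
from the JDK's javax.script API (the actual ScriptEngine in your code may
be something else entirely) and that `scripts` is an RDD[String] as in your
example:

```scala
import javax.script.ScriptEngineManager

// One engine per partition rather than one per record; because the engine
// is constructed inside the closure, nothing non-serializable needs to be
// shipped from the driver to the workers.
val resultRdd = scripts.mapPartitions { iter =>
  // Hypothetical engine name; substitute whatever engine you actually use.
  val engine = new ScriptEngineManager().getEngineByName("JavaScript")
  iter.map(s => engine.eval(s))
}
```

Using mapPartitions instead of map also amortizes the (often nontrivial)
cost of constructing the engine across all records in a partition.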
--
Christopher T. Nguyen
Co-founder & CEO, Adatao <http://adatao.com>
linkedin.com/in/ctnguyen
On Fri, Dec 27, 2013 at 7:56 PM, Bao <[email protected]> wrote:
> It looks like I need to use DStream instead...
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Stateful-RDD-tp71p85.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>