Just to add to Christopher's suggestion, do make sure that ScriptEngine.eval
is thread-safe. If it is not, you can use ThreadLocal
<http://docs.oracle.com/javase/7/docs/api/java/lang/ThreadLocal.html>
to make sure there is one instance per execution thread.
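
For example, a rough sketch of the ThreadLocal approach (this assumes the
engine comes from javax.script; the engine name and the Engines object are
just for illustration):

    import javax.script.{ScriptEngine, ScriptEngineManager}

    // One engine per execution thread, created lazily the first time that
    // thread calls Engines.get. A top-level object like this is not captured
    // by the closure, so nothing extra needs to be serialized.
    object Engines {
      private val local = new ThreadLocal[ScriptEngine] {
        override protected def initialValue(): ScriptEngine =
          new ScriptEngineManager().getEngineByName("javascript")
      }
      def get: ScriptEngine = local.get()
    }

    val resultRdd = scripts.map(s => Engines.get.eval(s))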

TD


On Fri, Dec 27, 2013 at 8:12 PM, Christopher Nguyen <[email protected]> wrote:

> Bao, as described, your use case doesn't need to invoke anything like
> custom RDDs or DStreams.
>
> In a call like
>
>     val resultRdd = scripts.map(s => ScriptEngine.eval(s))
>
> Spark will do its best to serialize/deserialize ScriptEngine to each of
> the workers---if ScriptEngine is Serializable.
>
> Now, if it makes no difference to you, consider instantiating ScriptEngine
> within the closure itself, thus obviating the need for serdes of things
> outside the closure.
>
> --
> Christopher T. Nguyen
> Co-founder & CEO, Adatao <http://adatao.com>
> linkedin.com/in/ctnguyen
>
>
>
> On Fri, Dec 27, 2013 at 7:56 PM, Bao <[email protected]> wrote:
>
>> It looks like I need to use DStream instead...
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Stateful-RDD-tp71p85.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>
>
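
For reference, Christopher's suggestion of creating the engine inside the
closure could look roughly like this (again a sketch assuming a javax.script
engine; mapPartitions is used here so each partition builds one engine rather
than one per script):

    import javax.script.ScriptEngineManager

    val resultRdd = scripts.mapPartitions { iter =>
      // The engine is created inside the closure, on the worker, so nothing
      // outside the closure has to be serialized.
      val engine = new ScriptEngineManager().getEngineByName("javascript")
      iter.map(s => engine.eval(s))
    }

Since each partition is processed by a single task thread, this variant also
sidesteps the thread-safety question.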
