I'd hazard that this is a generic issue.  The "store" is created in the
context of the driver code, not the worker code, and the closure you pass
to foreach captures a reference to it, so Spark has to serialize the store
whenever it serializes that closure (for checkpointing, or to ship it off
to a worker for execution).  It's not serializable (and shouldn't be...),
so that fails.

Try making a Scala object that lives on the worker side (a top-level Scala
object is initialized statically, once per JVM) and access the store
through it; that way the closure only references the object by name and
never captures the store instance itself.
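
For example, here's a minimal sketch along those lines (StoreHolder and
createStore are hypothetical names, and the store construction is a
placeholder; substitute however you actually build your MySqlLongStore):

    import com.twitter.storehaus.mysql.MySqlLongStore

    // A top-level object is initialized lazily, once per JVM, so the
    // store gets constructed wherever the code runs and is never
    // serialized.
    object StoreHolder {
      // Placeholder factory: build your actual MySqlLongStore here.
      private def createStore(): MySqlLongStore = ???

      lazy val store: MySqlLongStore = createStore()
    }

    dstream.foreach { rdd =>
      if (rdd.count != 0) {
        // The closure refers to StoreHolder by name only, so no store
        // instance is captured and serialization succeeds.
        StoreHolder.store.multiMerge(rdd.collect.toMap)
      }
    }

The point is that the foreach closure no longer holds a reference to the
store, so Spark has nothing non-serializable to write out.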



—
p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/


On Thu, Mar 6, 2014 at 1:44 PM, Jim Donahue <jdona...@adobe.com> wrote:

> After watching Ryan Weald's talk about integrating Spark Streaming with
> Algebird and Storehaus, I decided that I had to give this a try! :-)  But
> I'm having some rather basic problems.
>
> My code looks a lot like the example Ryan gives.  Here's the basic
> structure:
>
>         dstream.foreach(rdd =>
>           if (rdd.count != 0) {
>             val data = rdd.collect.toMap
>             store.multiMerge(data)
>           })
>
> where dstream is my Spark DStream and store is a Storehaus MySqlLongStore.
>
> The problem is that the Spark Streaming CheckpointWriter wants to
> serialize the store object and it can't (I end up getting a
> NotSerializableException).
>
> Anybody have an example of working code?  Is this problem specific to
> MySqlLongStores, or is this generic?
>
>
> Thanks,
>
> Jim Donahue
> Adobe
>
