I'd hazard that this is a generic issue. The "store" is constructed in the context of the driver code, not the worker code, and that's why Spark tries to serialize it and ship it to a worker for execution. It isn't serializable (and shouldn't be...), so that fails.
Try making a Scala object that lives on the worker side (e.g., by static initialization) and then access the store that way.

—
p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/

On Thu, Mar 6, 2014 at 1:44 PM, Jim Donahue <jdona...@adobe.com> wrote:

> After watching Ryan Weald's talk about integrating Spark Streaming with
> Algebird and Storehaus, I decided that I had to give this a try! :-) But
> I'm having some rather basic problems.
>
> My code looks a lot like the example Ryan gives. Here's the basic
> structure:
>
>   dstream.foreach(rdd =>
>     if (rdd.count != 0) {
>       val data = rdd.collect.toMap
>       store.multiMerge(data)
>     })
>
> where dstream is my Spark DStream and store is a Storehaus MySqlLongStore.
>
> The problem is that the Spark Streaming CheckpointWriter wants to
> serialize the store object and it can't (I end up getting a
> NotSerializableException).
>
> Anybody have an example of working code? Is this problem specific to
> MySqlLongStores, or is this generic?
>
> Thanks,
>
> Jim Donahue
> Adobe
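
To make the suggestion above concrete, here is a minimal sketch of the worker-side-object pattern. It assumes Storehaus's MySqlLongStore type from the question; `StoreHolder` and `makeStore()` are hypothetical names, and the store construction is left as a placeholder since it depends on your MySQL configuration:

    import com.twitter.storehaus.mysql.MySqlLongStore

    object StoreHolder {
      // A `lazy val` inside a Scala object is initialized on first access,
      // once per JVM. Each worker builds its own store locally, so the store
      // is never captured in a closure that Spark must serialize.
      lazy val store: MySqlLongStore = makeStore()

      // Hypothetical factory: fill in with the same MySqlLongStore
      // construction you currently run on the driver.
      private def makeStore(): MySqlLongStore = ???
    }

    dstream.foreach { rdd =>
      if (rdd.count != 0) {
        val data = rdd.collect.toMap
        // Resolved through the object, not a captured reference, so the
        // CheckpointWriter has nothing non-serializable to write out.
        StoreHolder.store.multiMerge(data)
      }
    }

The key point is that only the data in the closure crosses the wire; the store reference is looked up through the statically initialized object at execution time.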