Is MyType serializable? Everything inside the foreachRDD closure has to be
serializable.
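
If you are not sure, a quick check outside Spark can help. The sketch below only uses plain Java serialization (which is what raises NotSerializableException); PlainType, SerializableType and isSerializable are made-up names for illustration, since MyType's definition isn't shown in this thread.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-ins for MyType (its real definition is not in the thread).
class PlainType(val value: Int)          // does NOT extend Serializable
case class SerializableType(value: Int)  // case classes are Serializable by default

object SerializationCheck {
  // Returns true if Java serialization accepts the object, false if it
  // throws NotSerializableException.
  def isSerializable(obj: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try {
      out.writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    } finally {
      out.close()
    }
  }

  def main(args: Array[String]): Unit = {
    println(isSerializable(new PlainType(1)))    // prints "false"
    println(isSerializable(SerializableType(1))) // prints "true"
  }
}
```

If MyType behaves like PlainType here, making it a case class or having it extend Serializable is probably the first thing to try.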


2014-07-09 14:24 GMT+01:00 RodrigoB <rodrigo.boav...@aspect.com>:

> Hi all,
>
> I am currently trying to save to Cassandra after some Spark Streaming
> computation.
>
> I call myDStream.foreachRDD so that I can collect each RDD in the driver
> application, and inside it I do something like this:
> myDStream.foreachRDD(rdd => {
>
>   var someCol = Seq[MyType]()
>
>   rdd.collect().foreach(kv => {
>     someCol = someCol :+ kv._2 // I only want the RDD value and not the key
>   })
>   // THIS IS WHY IT FAILS TRYING TO RUN THE WORKER
>   val collectionRDD = sc.parallelize(someCol)
>   collectionRDD.saveToCassandra(...)
> })
>
> I get a NotSerializableException while trying to run the node (I also tried
> making someCol a shared variable).
> I believe this happens because myDStream doesn't exist yet when the code
> is pushed to the node, so the parallelize doesn't have any structure to
> relate to. Inside this foreachRDD I should only make RDD calls that are
> related to other RDDs. I guess this was just a desperate attempt....
>
> So I have a question:
> Using the Cassandra Spark driver, can we only write to Cassandra from an
> RDD? In my case I only want to write once all the computation is finished,
> in a single batch on the driver app.
>
> Thanks in advance.
>
> Rod
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Cassandra-driver-Spark-question-tp9177.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
