Can you find out which class is causing the NotSerializableException? In
fact, you can enable extended serialization debugging
<http://stackoverflow.com/questions/1660441/java-flag-to-enable-extended-serialization-debugging-info>
to trace the object structure through the foreachRDD's closure that is
causing it.
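The flag behind that link is `-Dsun.io.serialization.extendedDebugInfo=true`. As a minimal, Spark-free sketch of how the exception already names the offending class (the class names here are hypothetical stand-ins for whatever your closure captures):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Optional;

public class SerializationDebug {
    // Hypothetical stand-ins: a non-serializable resource and a task that
    // accidentally captures it -- the same shape a foreachRDD closure takes.
    static class Connection { String url = "cassandra://localhost"; }

    static class Task implements Serializable {
        final Connection conn; // non-transient field drags Connection in
        Task(Connection c) { this.conn = c; }
    }

    // Attempt to serialize; the exception message names the offending class.
    static Optional<String> offendingClass(Object obj) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return Optional.empty();
        } catch (NotSerializableException e) {
            return Optional.of(e.getMessage()); // the non-serializable class
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // The message names Connection, the class that broke serialization.
        System.out.println(offendingClass(new Task(new Connection())));
    }
}
```

With the extended-debug flag set on the JVM, the same exception additionally prints the chain of fields that led to the non-serializable object, which is what makes it useful inside a large closure.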

On a not-so-related note, why are you collecting and then parallelizing?
That is a really inefficient thing to do, as all the data is brought back
to the driver. If you just want the values to be saved to Cassandra, why
can't you map the pair RDD to keep only the values and save those to Cassandra?
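A minimal sketch of that suggestion, using plain Java collections as a stand-in for the pair RDD (the keyspace/table names and the `saveToCassandra` call in the comment are assumptions based on the spark-cassandra-connector's Scala API):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ValuesOnly {
    // Stand-in for the transformation: on a real pair RDD in Scala the
    // connector equivalent is roughly
    //   pairRdd.values.saveToCassandra("ks", "events")
    // ("ks"/"events" are hypothetical keyspace/table names.)
    static List<Long> valuesOnly(List<Map.Entry<String, Long>> pairs) {
        return pairs.stream()
                    .map(Map.Entry::getValue) // drop the keys
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> counts = List.of(
            new SimpleEntry<>("user1", 3L),
            new SimpleEntry<>("user2", 5L));
        // No collect()-then-parallelize() round trip through the driver.
        System.out.println(valuesOnly(counts)); // [3, 5]
    }
}
```

The point is that the mapping and the save both run on the executors, so nothing is funneled through the driver.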

TD


On Wed, Jul 9, 2014 at 7:52 AM, Luis Ángel Vicente Sánchez <
langel.gro...@gmail.com> wrote:

> Yes, I'm using it to count concurrent users from a Kafka stream of events
> without problems. I'm currently testing it in local mode, but any
> serialization problem would have already appeared, so I don't expect any
> serialization issues when I deploy to my cluster.
>
>
> 2014-07-09 15:39 GMT+01:00 RodrigoB <rodrigo.boav...@aspect.com>:
>
> Hi Luis,
>>
>> Yes, it's actually an output of the previous RDD.
>> Have you ever used the Cassandra Spark driver in the driver app? I believe
>> these limitations stem from that - it's designed to save RDDs from the
>> nodes.
>>
>> tnks,
>> Rod
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Cassandra-driver-Spark-question-tp9177p9187.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>
>
