Re: Spark Streaming architecture question - shared memory model

dmihovilovic Sun, 20 Oct 2013 14:38:01 -0700

Any idea why the RDD is maintained so secretively "behind the scenes"? It looks 
like the only way to get the state is after updating it. There is no exposed 
method to just get the state, so we are trying to get it by applying a function 
that does nothing. We are doing some acrobatics to get the state, but this model 
appears very odd.
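
To make the model under discussion concrete, here is a minimal plain-Python 
sketch of per-key state updates, including the "function that does nothing" 
trick for reading the state back. It is not Spark code: update_state_by_key, 
count_and_total, and keep_state are hypothetical names, and the only Spark 
behavior assumed is that the update function runs for every key that already 
has state and that returning None drops the key.

```python
# Plain-Python model of per-key state updates; NOT Spark code.
# All names below are hypothetical and exist only for illustration.

def update_state_by_key(state, batch, update_fn):
    """Return a new state dict after applying update_fn to every key.

    state     : dict mapping key -> previous state
    batch     : list of (key, value) pairs that arrived in this interval
    update_fn : (new_values, previous_state_or_None) -> new state,
                or None to drop the key
    """
    # Group the incoming values of this batch by key.
    grouped = {}
    for key, value in batch:
        grouped.setdefault(key, []).append(value)

    new_state = {}
    # Keys with existing state are visited even if no new values arrived.
    for key in set(state) | set(grouped):
        updated = update_fn(grouped.get(key, []), state.get(key))
        if updated is not None:
            new_state[key] = updated
    return new_state


def count_and_total(new_values, previous):
    """State is a (count, total) pair, slightly richer than a plain count."""
    count, total = previous if previous is not None else (0, 0.0)
    return (count + len(new_values), total + sum(new_values))


def keep_state(new_values, previous):
    """The 'function that does nothing': returns the state unchanged."""
    return previous


state = {}
state = update_state_by_key(state, [("a", 1.0), ("b", 4.0), ("a", 3.0)], count_and_total)
state = update_state_by_key(state, [("a", 2.0)], count_and_total)

# Reading the state without changing it, via the no-op update.
snapshot = update_state_by_key(state, [], keep_state)
```

In Spark itself, as far as I understand it, the DStream returned by 
updateStateByKey carries the current state in every batch, so reading that 
stream may avoid the no-op update entirely.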

The only example I have found so far is a simple updating of counts. Is anyone 
aware of any more complex examples with state updates and retrievals?

dma

On Sep 30, 2013, at 3:58 PM, Michael Malak wrote:

> Domingo Mihovilovic <[email protected]> writes:
> 
>>  Imagine that you are processing stream data at high speed and need to 
>> build, update, and access some in-memory data structure where the "model" 
>> is stored.  
> 
> Normally this is done with updateStateByKey, which maintains an RDD behind 
> the scenes.
> 
> Michael Malak
> http://www.linkedin.com/in/michaelmalak
