mmmmh ok.

Providing pseudo-code would require me to be a bit more awake -- it's 2 AM
in Belgium... I have to go to sleep, otherwise the pseudo would be more
pseudo than code...

However, regarding your (LaTeX-style) formula, IMHO in a streaming use
case k will probably vary with the velocity... which is not the case in
batch (because you can prepare the data beforehand).
So I guess you could find an approximate result by "simply" reducing over
a window. Note that I'm assuming that the x's are the data flowing in the
stream (they are singletons, not components of rows arriving atomically
as instances of T in an RDD[T]).

Maybe you could have a look at reduceByWindow (first the Scaladoc, then
Google, because there are plenty of examples in both Spark and Storm).
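As a toy of what each window could feed -- a cheap refit of the coefficients over only the most recent values -- here is a k = 1 least-squares sketch in plain Python (the function name is illustrative; general k would need a linear solver for the normal equations):

```python
def fit_ar1(window):
    """Least-squares estimate of alpha in x_n ~ alpha * x_{n-1}.

    Toy sketch of re-estimating the model over one window only;
    for general k you would solve the normal equations instead.
    """
    num = sum(x_prev * x_next for x_prev, x_next in zip(window, window[1:]))
    den = sum(x_prev * x_prev for x_prev in window[:-1])
    return num / den

print(fit_ar1([1.0, 2.0, 4.0, 8.0]))  # doubling series, so alpha is 2.0
```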

HTH

Andy Petrella
Belgium (Liège)


IT Consultant for NextLab <http://nextlab.be/> sprl (co-founder)
Engaged Citizen Coder for WAJUG <http://wajug.be/> (co-founder)
Author of Learning Play! Framework 2 <http://www.packtpub.com/learning-play-framework-2/book>

Mobile: +32 495 99 11 04
Mails:

   - [email protected]
   - [email protected]

Socials:

   - Twitter: https://twitter.com/#!/noootsab
   - LinkedIn: http://be.linkedin.com/in/andypetrella
   - Blogger: http://ska-la.blogspot.com/
   - GitHub:  https://github.com/andypetrella
   - Masterbranch: https://masterbranch.com/andy.petrella



On Wed, Nov 20, 2013 at 1:37 AM, Michael Kun Yang <[email protected]> wrote:

> the use case is to fit a moving average model for stock prices with the
> form:
> x_n = \sum_{i = 1}^k \alpha_i * x_{n - i}
>
> Can you please provide me the pseudo-code?
>
>
> On Tue, Nov 19, 2013 at 4:30 PM, andy petrella <[email protected]> wrote:
>
>> 1/ you mean like reshape in R?
>> 2/ Or you mean by windowing the stream on a period basis?
>>
>> 1/ if you have RDD[Seq[Any]], you can have an RDD[Seq[Seq[Any]]] using
>> `transform` then `sliding(6,6)` on the passed Seq
>> 2/ > If the period is time you may check the method ending with `window`
>> in DStream
>>     > otherwise... I don't see the use case, so I'd have some
>> difficulties helping you ^ ^
>>
>> Also, if you're building (row, col) pairs along with each cell's data as
>> the data comes in, the PairedDStreamFunctions could help you if you put
>> this pair as a key. But I'm just guessing...
>>
>> HTH (a bit :D)
>>
>> andy
>>
>>
>> On Wed, Nov 20, 2013 at 1:01 AM, Michael Kun Yang <[email protected]> wrote:
>>
>>> Hi spark-enthusiasts,
>>>
>>> I am new to spark streaming. I need to convert streaming data into
>>> table.
>>>
>>> How to convert a data stream
>>> {x_1, x_2, x_3, ..., x_n, ...}
>>> into a table with the format:
>>> x_1, x_2, x_3, x_4, x_5, x_6
>>> x_2, x_3, x_4, x_5, x_6, x_7
>>> ...
>>> x_{n + 1}, x_{n + 2}, ..., x_{n + 6}
>>> ...
>>>
>>> Thank you very much!
>>>  -Kun
>>>
>>
>>
>
