Re: Time series aggregation with storm Trident

Ajay Chander Mon, 14 Sep 2015 11:32:20 -0700

Hi Indranil,

Thank you for the info. I was looking for options where I can use
Trident to perform time-window aggregations. The example you
provided appears to use the core Storm implementation. Thanks for your time.
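
For reference, the sliced-window idea Andrew describes in the quoted
message below (a 1-hour window made of 10-minute slices that are cleared
and rotated) might look roughly like the following. This is only a minimal
in-memory sketch in plain Java: the class SlicedWindowCounter and all of
its methods are hypothetical names, not part of Storm or Trident, and a
real topology would keep the slices in an external store rather than a
local map.

```java
import java.util.TreeMap;

// Hypothetical helper for illustration only; not a Storm or Trident class.
class SlicedWindowCounter {
    private final long sliceMillis;   // e.g. 10 minutes
    private final long windowMillis;  // e.g. 1 hour
    // slice start time (ms since epoch) -> running count for that slice
    private final TreeMap<Long, Long> slices = new TreeMap<>();

    SlicedWindowCounter(long sliceMillis, long windowMillis) {
        if (sliceMillis <= 0 || windowMillis < sliceMillis) {
            throw new IllegalArgumentException("need 0 < slice <= window");
        }
        this.sliceMillis = sliceMillis;
        this.windowMillis = windowMillis;
    }

    // Round an event timestamp down to the start of its slice.
    long sliceStart(long timestampMillis) {
        return Math.floorDiv(timestampMillis, sliceMillis) * sliceMillis;
    }

    // Add a partial count. Counts that arrive in different batches but
    // fall into the same slice are merged into one running total.
    void add(long eventTimeMillis, long count) {
        slices.merge(sliceStart(eventTimeMillis), count, Long::sum);
    }

    // Drop every slice that is older than the window ending at nowMillis.
    void expire(long nowMillis) {
        long oldestKept = sliceStart(nowMillis) - windowMillis + sliceMillis;
        slices.headMap(oldestKept, false).clear();
    }

    // Total over the slices still inside the window. Assumes no event is
    // stamped later than nowMillis.
    long windowTotal(long nowMillis) {
        expire(nowMillis);
        long total = 0;
        for (long c : slices.values()) {
            total += c;
        }
        return total;
    }
}
```

Keying each partial aggregate by its slice start is what lets results
from several small batches accumulate into one window; rotation is then
just deleting the keys that have aged out.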



On Monday, September 14, 2015, Indranil Roy <[email protected]>
wrote:

> This might help : https://tomdzk.wordpress.com/2011/09/28/storm-esper/
>
> On Mon, Sep 14, 2015 at 10:19 PM Andrew Xor <[email protected]> wrote:
>
>> I have. What you could do is use an external persistent store (something
>> fast, for example Memcached or Haystack) that holds your aggregation
>> batches for a specific time slice. For example, have a 1-hour window with
>> 10-minute slices that are cleared and rotated as needed. Another problem
>> you have to deal with is that, should a spout source fail,
>> everything is delayed unless you have an opaque spout, which of course has
>> some downsides, as indicated here
>> <https://storm.apache.org/documentation/Trident-state>.
>>
>> Hope this helps.
>>
>> Kindly yours,
>>
>> Andrew Grammenos
>>
>> -- PGP PKey --
>>  <https://www.dropbox.com/s/2kcxe59zsi9nrdt/pgpsig.txt>
>> https://www.dropbox.com/s/yxvycjvlsc111bh/pgpsig.txt
>>
>> On Mon, Sep 14, 2015 at 7:40 PM, Ajay Chander <[email protected]> wrote:
>>
>>> Hi Guys,
>>>
>>> Right now I am trying to implement the same thing Elango describes in
>>> the email below. I want to perform aggregations based on a time window
>>> using Trident. Has anyone done this before using Trident? Any help is
>>> highly appreciated.
>>>
>>> Thank you,
>>> Ajay
>>>
>>>
>>> On Thursday, August 27, 2015, Rajasekar Elango <[email protected]> wrote:
>>>
>>>> We have time series data in Kafka and we want to aggregate it in Storm
>>>> using Trident. I was able to get data aggregated using persistentAggregate
>>>> based on the FAQ <https://storm.apache.org/documentation/FAQ.html>. But
>>>> aggregation is always done within small batches, and I could not figure out
>>>> a way to detect when all events for a one-minute time window have been
>>>> processed. Calling each(...) after persistentAggregate(...).newValuesStream()
>>>> returns results as soon as a batch is processed, but I want to aggregate
>>>> values across multiple batches for a time window. I could not find a good
>>>> answer or example online. I also see mixed opinions: some people say it's
>>>> not possible to do time-window aggregation in Trident, while others say it
>>>> is possible (the FAQ <https://storm.apache.org/documentation/FAQ.html>
>>>> looks especially promising). The alternative seems to be using tick tuples
>>>> with core Storm, but I would prefer to do it in Trident as it has better
>>>> guaranteed processing semantics and abstraction for persistence.
>>>>
>>>> Can someone provide more details or examples on how to do this?
>>>>
>>>> --
>>>> Thanks,
>>>> Raja.
>>>>
>>>
>> --
> Indranil RoyChowdhury
> +91-9830027560
>
