Re: Time series aggregation with storm Trident

Indranil Roy Mon, 14 Sep 2015 09:52:33 -0700

This might help: https://tomdzk.wordpress.com/2011/09/28/storm-esper/
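
To make the time-slice idea from Andrew's reply (quoted below) a bit more concrete, here is a minimal sketch of a 1-hour window kept as 10-minute slices that are rotated as newer data arrives. It is plain Java, not Trident code: the class name SlicedWindowCounter and its methods are hypothetical, and the in-memory TreeMap only stands in for whatever external store (Memcached or similar) you would actually use.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: a rolling window made of fixed-size time slices.
// The TreeMap is an in-memory stand-in for an external persistent store.
class SlicedWindowCounter {
    private final long sliceMillis;   // width of one slice, e.g. 10 minutes
    private final int maxSlices;      // slices per window, e.g. 6 for 1 hour
    // slice start time (ms) -> (key -> running count for that slice)
    private final TreeMap<Long, Map<String, Long>> slices = new TreeMap<>();

    SlicedWindowCounter(long sliceMillis, int maxSlices) {
        this.sliceMillis = sliceMillis;
        this.maxSlices = maxSlices;
    }

    // Add a partial count (e.g. one batch's result) to the slice that
    // contains the event timestamp. Assumes non-negative timestamps.
    void add(String key, long eventTimeMillis, long count) {
        long sliceStart = eventTimeMillis - (eventTimeMillis % sliceMillis);
        slices.computeIfAbsent(sliceStart, s -> new HashMap<>())
              .merge(key, count, Long::sum);
        // Rotate: drop every slice that falls out of the window,
        // measured back from the newest slice seen so far.
        long oldestAllowed = slices.lastKey() - (maxSlices - 1) * sliceMillis;
        slices.headMap(oldestAllowed).clear();
    }

    // Sum the key's counts over all slices still inside the window.
    long windowTotal(String key) {
        long total = 0;
        for (Map<String, Long> slice : slices.values()) {
            total += slice.getOrDefault(key, 0L);
        }
        return total;
    }
}
```

Because each batch only adds to the slice its event time falls into, counts accumulate across batches, and a slice can be treated as complete once a newer slice starts receiving data. With a real store you would also need to make the updates idempotent across batch replays, which this sketch does not attempt.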


On Mon, Sep 14, 2015 at 10:19 PM Andrew Xor <[email protected]>
wrote:

> I have. What you could do is use an external persistent store (something
> fast, for example Memcached or Haystack) to hold your aggregation
> batches for a specific time slice. For example, keep a 1-hour window of
> 10-minute slices that are cleared and rotated as needed. Another problem
> you have to deal with is that if a spout source fails,
> everything is delayed unless you have an opaque spout, which of course has
> some downsides, as indicated here
> <https://storm.apache.org/documentation/Trident-state>.
>
> Hope this helps.
>
> Kindly yours,
>
> Andrew Grammenos
>
> -- PGP PKey --
>  <https://www.dropbox.com/s/2kcxe59zsi9nrdt/pgpsig.txt>
> https://www.dropbox.com/s/yxvycjvlsc111bh/pgpsig.txt
>
> On Mon, Sep 14, 2015 at 7:40 PM, Ajay Chander <[email protected]>
> wrote:
>
>> Hi Guys,
>>
>> Right now I am trying to implement the same thing that Elango describes in
>> the email below. I want to perform aggregations based on a time window using
>> Trident. Has anyone done this before using Trident? Any help is highly
>> appreciated.
>>
>> Thank you,
>> Ajay
>>
>>
>> On Thursday, August 27, 2015, Rajasekar Elango <[email protected]>
>> wrote:
>>
>>> We have time series data in Kafka and we want to aggregate it in Storm
>>> using Trident. I was able to get data aggregated using persistentAggregate
>>> based on the FAQ <https://storm.apache.org/documentation/FAQ.html>. But
>>> aggregation is always done within small batches, and I could not figure out
>>> a way to detect when all events for a one-minute time window have been
>>> processed. Calling each after persistentAggregate(...).newValuesStream()
>>> returns results as soon as a batch is processed, but I want to aggregate
>>> values across multiple batches for a time window. I could not find a good
>>> answer or example online. I also see mixed opinions: some people say it's
>>> not possible to do time window aggregation in Trident, others say it is
>>> (the FAQ <https://storm.apache.org/documentation/FAQ.html> in particular
>>> looks promising). The alternative seems to be using tick tuples with basic
>>> Storm, but I would prefer to do it in Trident as it has better guaranteed
>>> processing semantics and abstraction for persistence.
>>>
>>> Can someone provide more details or examples on how to do this?
>>>
>>> --
>>> Thanks,
>>> Raja.
>>>
>>
> --
Indranil RoyChowdhury
+91-9830027560
