I have. What you could do is keep your aggregation batches for a specific time slice in an external persistent store (something fast, for example Memcached or Haystack). For example, use a 1-hour window with 10-minute slices that are cleared and rotated as needed. Another problem you have to deal with is that if a spout source fails, everything is delayed unless you use an opaque spout, which of course has some downsides, as indicated here: <https://storm.apache.org/documentation/Trident-state>.
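To make the slicing idea concrete, here is a rough sketch of the bookkeeping only, written in plain Python rather than against the Trident API. The class name `SlicedWindowCounter`, the `"clicks"` key, and the in-memory dict standing in for the external store are all assumptions for illustration. It also ignores batch replays, so it does not by itself give you the exactly-once behaviour that Trident's transactional state provides.

```python
SLICE_SECONDS = 10 * 60       # 10-minute slices
WINDOW_SECONDS = 60 * 60      # 1-hour window
SLICES_PER_WINDOW = WINDOW_SECONDS // SLICE_SECONDS


class SlicedWindowCounter:
    """Per-key counts bucketed into fixed time slices (hypothetical helper)."""

    def __init__(self, store=None):
        # `store` stands in for the external store (e.g. Memcached);
        # using a plain dict here is an assumption to keep the sketch runnable.
        self.store = store if store is not None else {}

    @staticmethod
    def slice_id(event_ts):
        # Events in the same 10-minute span share one slice id.
        return int(event_ts) // SLICE_SECONDS

    def add(self, key, event_ts, amount=1):
        # Accumulate into the slice the event's timestamp falls in.
        bucket = self.store.setdefault(self.slice_id(event_ts), {})
        bucket[key] = bucket.get(key, 0) + amount

    def rotate(self, now_ts):
        # Clear slices that have fallen out of the window ending at now_ts.
        oldest_kept = self.slice_id(now_ts) - SLICES_PER_WINDOW + 1
        for sid in [s for s in self.store if s < oldest_kept]:
            del self.store[sid]

    def window_total(self, key, now_ts):
        # Sum the key's counts over the slices covering the current window.
        newest = self.slice_id(now_ts)
        oldest = newest - SLICES_PER_WINDOW + 1
        return sum(
            self.store.get(sid, {}).get(key, 0)
            for sid in range(oldest, newest + 1)
        )


if __name__ == "__main__":
    counter = SlicedWindowCounter()
    counter.add("clicks", event_ts=0)       # lands in slice 0
    counter.add("clicks", event_ts=650)     # lands in slice 1
    counter.add("clicks", event_ts=3500)    # lands in slice 5
    print(counter.window_total("clicks", now_ts=3599))  # 3
    print(counter.window_total("clicks", now_ts=3600))  # 2, slice 0 has aged out
    counter.rotate(now_ts=3600)
    print(sorted(counter.store))            # [1, 5]
```

In a topology, the `add` step would roughly correspond to what happens per batch, with the event timestamp (not the processing time) choosing the slice, so late batches still land in the right bucket.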
Hope this helps.

Kindly yours,

Andrew Grammenos

-- PGP PKey --
<https://www.dropbox.com/s/2kcxe59zsi9nrdt/pgpsig.txt>
https://www.dropbox.com/s/yxvycjvlsc111bh/pgpsig.txt

On Mon, Sep 14, 2015 at 7:40 PM, Ajay Chander <[email protected]> wrote:

> Hi Guys,
>
> Right now I am trying to implement the same as mentioned by Elango in the
> below email. I want to perform aggregations based on a time window using
> trident. Anyone have done this before using trident? Any help is highly
> appreciated.
>
> Thank you,
> Ajay
>
>
> On Thursday, August 27, 2015, Rajasekar Elango <[email protected]>
> wrote:
>
>> We have time series data in kafka and we want to aggregate it in storm
>> using trident. I was able to get data aggregated using persistentAggregate
>> based on FAQ <https://storm.apache.org/documentation/FAQ.html>. But
>> aggregation is always done within small batches, I could not figure out a
>> way to detect when all events for a one minute time window is processed.
>> Calling each after persistentAggregate(...).newValuesStream() returns
>> results as soon as a batch is processed, but I want to aggregate values
>> across multiple batches for a time window. I could not find good answer or
>> example online. I also see mixed opinion, some people say it's not possible
>> to do time window aggregation in trident, some people say it's possible
>> (especially FAQ <https://storm.apache.org/documentation/FAQ.html> looks
>> promising). The alternate option seem to be using tick tuples with storm
>> basic, but would prefer to do it in trident as it has better guaranteed
>> processing semantics and abstraction for persistence.
>>
>> Can some one provide more details or examples on how to do this?
>>
>> --
>> Thanks,
>> Raja.
>>
>
