We have time series data in Kafka and we want to aggregate it in Storm
using Trident. I was able to get data aggregated using persistentAggregate,
based on the FAQ <https://storm.apache.org/documentation/FAQ.html>, but the
aggregation is always done within small batches, and I could not figure out
a way to detect when all events for a one-minute time window have been
processed. Calling each after persistentAggregate(...).newValuesStream()
returns results as soon as a batch is processed, but I want to aggregate
values across multiple batches for a time window.

I could not find a good answer or example online, and I see mixed opinions:
some people say time-window aggregation is not possible in Trident, others
say it is possible (the FAQ
<https://storm.apache.org/documentation/FAQ.html> in particular looks
promising). The alternative seems to be using tick tuples with basic Storm
bolts, but I would prefer to stay in Trident because of its stronger
processing guarantees and its persistence abstraction.
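
For reference, here is a minimal sketch of the kind of topology I mean,
following the FAQ's suggestion of grouping on a derived time bucket. This is
simplified and not our actual code: TimeBucket, the "timestamp"/"value"
field names, the Kafka connection details, and the in-memory state are all
placeholders (Storm 0.9.x package names assumed).

import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;
import storm.kafka.ZkHosts;
import storm.kafka.trident.OpaqueTridentKafkaSpout;
import storm.kafka.trident.TridentKafkaConfig;
import storm.trident.TridentState;
import storm.trident.TridentTopology;
import storm.trident.operation.BaseFunction;
import storm.trident.operation.TridentCollector;
import storm.trident.operation.builtin.Debug;
import storm.trident.operation.builtin.Sum;
import storm.trident.testing.MemoryMapState;
import storm.trident.tuple.TridentTuple;

public class MinuteAggregationSketch {

    // Placeholder function: maps an event timestamp (ms) to a one-minute bucket key.
    public static class TimeBucket extends BaseFunction {
        @Override
        public void execute(TridentTuple tuple, TridentCollector collector) {
            long timestampMs = tuple.getLongByField("timestamp");
            collector.emit(new Values(timestampMs / 60000L));
        }
    }

    public static TridentTopology build() {
        // Placeholder Kafka config; assumes a scheme that deserializes each message
        // into "timestamp" and "value" fields (scheme not shown here).
        TridentKafkaConfig spoutConf =
                new TridentKafkaConfig(new ZkHosts("zookeeper:2181"), "timeseries-topic");
        OpaqueTridentKafkaSpout spout = new OpaqueTridentKafkaSpout(spoutConf);

        TridentTopology topology = new TridentTopology();
        TridentState minuteSums = topology
                .newStream("timeseries", spout)
                .each(new Fields("timestamp"), new TimeBucket(), new Fields("minuteBucket"))
                .groupBy(new Fields("minuteBucket"))
                // The state keeps a running sum per minute bucket across batches.
                .persistentAggregate(new MemoryMapState.Factory(),
                        new Fields("value"), new Sum(), new Fields("sum"));

        // newValuesStream() emits the updated (minuteBucket, sum) after every batch,
        // which is the per-batch behaviour described above.
        minuteSums.newValuesStream()
                .each(new Fields("minuteBucket", "sum"), new Debug());

        return topology;
    }
}

The running sum per bucket is maintained correctly across batches, but
newValuesStream() fires after every batch, so I still don't see how to tell
when a minute bucket is complete and can be emitted downstream.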

Can someone provide more details or an example of how to do this?


-- 
Thanks,
Raja.
