I have. What you could do is keep an external persistent store (something
fast, for example Memcached or Haystack) that holds your aggregation
batches for a specific time slice. For example, have a 1-hour window with
10-minute slices that are cleared and rotated as needed. Another problem
you have to deal with is that if a spout source fails, everything is
delayed unless you have an opaque spout, which of course has some
downsides, as indicated here
<https://storm.apache.org/documentation/Trident-state>.
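
To make the slice rotation concrete, here is a minimal sketch in plain Java.
It is not Trident code: the class SlicedWindow and its methods are
hypothetical names, and the HashMap only stands in for the external store
(Memcached or similar). It assumes each event carries a timestamp in
milliseconds and that sums are the aggregate you want.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: keeps per-slice partial sums and rotates old slices out. */
class SlicedWindow {
    private final long sliceMillis;   // length of one slice, e.g. 10 minutes
    private final int sliceCount;     // slices per window, e.g. 6 for 1 hour
    // slice start time -> (key -> partial sum); stands in for the external store
    private final Map<Long, Map<String, Long>> slices = new HashMap<>();

    SlicedWindow(long sliceMillis, int sliceCount) {
        this.sliceMillis = sliceMillis;
        this.sliceCount = sliceCount;
    }

    /** Start time of the slice that contains the given timestamp. */
    private long sliceStart(long timestampMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, sliceMillis);
    }

    /** Add a value to the partial sum of the slice the event falls into. */
    void add(String key, long eventTimeMillis, long value) {
        slices.computeIfAbsent(sliceStart(eventTimeMillis), s -> new HashMap<>())
              .merge(key, value, Long::sum);
    }

    /** Clear slices that have fallen out of the window ending at nowMillis. */
    void rotate(long nowMillis) {
        long oldestKept = sliceStart(nowMillis) - (long) (sliceCount - 1) * sliceMillis;
        slices.keySet().removeIf(start -> start < oldestKept);
    }

    /** Sum for a key across all slices still inside the window. */
    long total(String key, long nowMillis) {
        rotate(nowMillis);
        long sum = 0;
        for (Map<String, Long> slice : slices.values()) {
            sum += slice.getOrDefault(key, 0L);
        }
        return sum;
    }
}
```

With new SlicedWindow(600_000L, 6) you get the 1-hour window with 10-minute
slices described above. In a real topology the add step would happen in your
state update for each batch, and the per-slice maps would live in the
external store rather than in memory.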

Hope this helps.

Kindly yours,

Andrew Grammenos

-- PGP PKey --
<https://www.dropbox.com/s/2kcxe59zsi9nrdt/pgpsig.txt>
https://www.dropbox.com/s/yxvycjvlsc111bh/pgpsig.txt

On Mon, Sep 14, 2015 at 7:40 PM, Ajay Chander <[email protected]> wrote:

> Hi Guys,
>
> Right now I am trying to implement the same thing Elango describes in the
> email below. I want to perform aggregations based on a time window using
> Trident. Has anyone done this before using Trident? Any help is highly
> appreciated.
>
> Thank you,
> Ajay
>
>
> On Thursday, August 27, 2015, Rajasekar Elango <[email protected]>
> wrote:
>
>> We have time series data in Kafka and we want to aggregate it in Storm
>> using Trident. I was able to get data aggregated using persistentAggregate
>> based on the FAQ <https://storm.apache.org/documentation/FAQ.html>. But
>> aggregation is always done within small batches, and I could not figure
>> out a way to detect when all events for a one-minute time window have been
>> processed. Calling each after persistentAggregate(...).newValuesStream()
>> returns results as soon as a batch is processed, but I want to aggregate
>> values across multiple batches for a time window. I could not find a good
>> answer or example online. I also see mixed opinions: some people say it's
>> not possible to do time window aggregation in Trident, some people say it
>> is possible (especially the FAQ
>> <https://storm.apache.org/documentation/FAQ.html> looks promising). The
>> alternative seems to be using tick tuples with basic Storm, but I would
>> prefer to do it in Trident, as it has better guaranteed processing
>> semantics and abstractions for persistence.
>>
>> Can someone provide more details or examples on how to do this?
>>
>> --
>> Thanks,
>> Raja.
>>
>
