Is there a way in Storm Trident to aggregate data over a certain time
period and have it flush the data out to an external data store after that
time period is up ?

Trident does not have the functionality of Tick Tuples yet, so I cannot use
that. Everything I've been researching leads to believe that this is not
possible in Storm/Trident, however this seems to me to be a fairly standard
use case of any streaming map reduce library.

For example,
If I am receiving a stream of integers
I want to aggregate all those integers over a period of 1 second, then
persist it into an external datastore.

This is not in order to count how much it will add up to over X amount of
time, rather I would like to minimize the read/write/updates I do to said
datastore.

There are many ways in order to reduce these variables, however all of them
force me to modify my schema in ways that are unpleasant. Also, I would
rather not have my final external datastore be my scratch space, where my
program is reading/updating/writing and checking to make sure that the
transaction id's line up.
Instead I want that scratch work to be done separately, then the final
result stored into a final database that no longer needs to do constant
updating.

Thanks
-- 
Raphael Hsieh

Reply via email to