You could send a signal tuple from the spout once it knows it has sent the
last tuple for a time period, or include a field in each tuple that marks
whether it is the last member of its period.
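As a rough sketch of the signal-tuple idea, here is the buffering logic a bolt could hold, written in plain Java with no Storm classes so it runs on its own. The names (TimeBucketBuffer, add, markEnd, expectedMarkers) are hypothetical, and it assumes each spout sends exactly one end-of-period marker per time id, after its readings for that id.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: buffers readings per time id until every source has signalled the end. */
class TimeBucketBuffer {
    private final int expectedMarkers;                    // assumed: one marker per spout
    private final Map<Long, List<Double>> readings = new HashMap<>();
    private final Map<Long, Integer> markersSeen = new HashMap<>();

    TimeBucketBuffer(int expectedMarkers) {
        if (expectedMarkers < 1) {
            throw new IllegalArgumentException("expectedMarkers must be >= 1");
        }
        this.expectedMarkers = expectedMarkers;
    }

    /** Stores one ordinary reading under its time id. */
    void add(long timeId, double value) {
        readings.computeIfAbsent(timeId, k -> new ArrayList<>()).add(value);
    }

    /**
     * Records one end-of-period marker for timeId. Returns the full batch once
     * all expected markers have arrived, otherwise returns null.
     */
    List<Double> markEnd(long timeId) {
        int seen = markersSeen.merge(timeId, 1, Integer::sum);
        if (seen < expectedMarkers) {
            return null;                                  // still waiting on other sources
        }
        markersSeen.remove(timeId);
        List<Double> batch = readings.remove(timeId);
        return batch != null ? batch : new ArrayList<>(); // period closed with no readings
    }
}
```

In a real bolt, execute() would call add() for data tuples and markEnd() for signal tuples, then process the returned batch when it is non-null. Counting markers matters once more than one spout feeds the same bolt, since a single marker only says that one source is done.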
I'm curious about why you want to do this, since the purpose of Storm is
to facilitate stream processing rather than the type of batch processing
you're describing.
-- Kyle
On 06/06/2014 05:14 PM, Jonathan Poon wrote:
Hi Nathan,
The sensor data I have is naturally time sorted, since it's just
collecting data and emitting it to a spout. Is it possible for a bolt
to know when all of the tuples with the same time tag have been
collected and to start processing it together? Or is it only possible
for a bolt to process each tuple one at a time?
Thanks!
On Fri, Jun 6, 2014 at 3:07 PM, Nathan Leung <ncle...@gmail.com> wrote:
You can have your bolt subscribe to the spout using fields
grouping and use the time tag as your key.
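To illustrate why this works: fields grouping routes every tuple with the same key value to the same bolt task. In a topology you would declare it with fieldsGrouping on the bolt's input declarer. The plain-Java sketch below only mimics the routing idea with a hash modulo the task count. FieldsGroupingSketch and taskFor are hypothetical names, and the exact hash Storm uses internally is not assumed here.

```java
import java.util.Objects;

/** Hypothetical illustration of key-based routing, not Storm's actual implementation. */
class FieldsGroupingSketch {
    /** Picks a task index in [0, numTasks) for a key; equal keys always map to the same index. */
    static int taskFor(Object key, int numTasks) {
        if (numTasks < 1) {
            throw new IllegalArgumentException("numTasks must be >= 1");
        }
        // floorMod keeps the result non-negative even when the hash code is negative.
        return Math.floorMod(Objects.hashCode(key), numTasks);
    }
}
```

With the time tag as the key, all readings for one time interval end up on one task, so that task can collect them together. Different time tags may still share a task, so the bolt has to keep its buffers separated by time tag.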
On Jun 6, 2014 6:01 PM, "Jonathan Poon" <jkp...@ucdavis.edu> wrote:
Hi Everyone,
I'm currently investigating different data processing tools
for an application I'm interested in. I have many sensors
that I collect data from. However, I would like to group the
data from every sensor at predefined time intervals and
process it together.
Using Storm terminology, I would have each sensor send data to
a spout. The spouts would then send tuples to a specific bolt
that will process all of the data within a specific time
partition. Each spout will tag each event with a time id and
each bolt will process data after collecting all of the data
with the same time id tags.
Is this possible with Storm?
I appreciate your help!
Jonathan