I've not used it, but you might look at:
https://github.com/buildlackey/cep/tree/master/esper%2Bstorm%2Bkafka
-Dan

From: l.p.pe...@newcastle.ac.uk
To: user@storm.incubator.apache.org
Subject: RE: Time Partitioning of Tuples
Date: Sun, 8 Jun 2014 21:40:37 +0000

Hi,
Is there anybody who has already embedded Esper into Storm?

-Lesego



From: Dan [dcies...@hotmail.com]
Sent: 07 June 2014 02:04
To: user@storm.incubator.apache.org
Subject: RE: Time Partitioning of Tuples

You might look at Esper. I believe someone has even embedded Esper into Storm.

-Dan





Date: Fri, 6 Jun 2014 15:40:08 -0700
Subject: Re: Time Partitioning of Tuples
From: jkp...@ucdavis.edu
To: user@storm.incubator.apache.org

Hi Kyle,

I'm looking for a real-time batch processing tool. In my case, I want to
compute correlations between all of the sensors at each time interval.

I could use Hadoop (MapReduce), but it requires me to collect all of the data
before I can batch process each time partition of data from each sensor.

Another tool I'm looking at is Spark Streaming, which lets me collect data at
different time intervals and process each batch of data using MapReduce.

However, MapReduce seems inefficient because my sensor data is already
naturally sorted by time. In addition, I would like real-time data on the fly.

Storm seems like it might be a candidate for this application. Please let me
know what you think! Thanks for your help!

Jonathan











On Fri, Jun 6, 2014 at 3:32 PM, Kyle Nusbaum <knusb...@yahoo-inc.com> wrote:

You could send a signal tuple from the spout when it knows it has sent the
last tuple for a time period, or include a field in the tuple indicating that
it is the last member.
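
A minimal sketch of the signal-tuple idea above, written in plain Python
rather than against Storm's API. The class name, the END_OF_PERIOD marker, and
the averaging step are all hypothetical and only illustrate the buffering
logic a bolt could use: hold readings per time id, then process the batch when
the signal for that time id arrives.

```python
from collections import defaultdict

# Hypothetical marker value for the signal tuple; not a Storm constant.
END_OF_PERIOD = "__end_of_period__"


class TimePartitionBuffer:
    """Buffers readings per time id and releases a batch when its signal arrives."""

    def __init__(self, process_batch):
        self._pending = defaultdict(list)
        self._process_batch = process_batch

    def on_tuple(self, time_id, payload):
        if payload == END_OF_PERIOD:
            # The sender says this time period is complete: hand over the batch.
            batch = self._pending.pop(time_id, [])
            self._process_batch(time_id, batch)
        else:
            self._pending[time_id].append(payload)


results = {}


def average(time_id, batch):
    # Stand-in for the real per-partition computation.
    results[time_id] = sum(batch) / len(batch) if batch else None


buf = TimePartitionBuffer(average)
stream = [(1, 10.0), (1, 20.0), (2, 5.0), (1, END_OF_PERIOD), (2, END_OF_PERIOD)]
for time_id, payload in stream:
    buf.on_tuple(time_id, payload)

print(results)
```

This sketch assumes a single sender and in-order delivery. With several
spouts, the bolt would presumably need to see one signal from each of them
before treating a time period as complete.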




I'm curious about why you want to do this, since the purpose of Storm is to
facilitate stream processing rather than the type of batch processing you're
describing.

-- Kyle

On 06/06/2014 05:14 PM, Jonathan Poon wrote:








Hi Nathan,

The sensor data I have is naturally time sorted, since each sensor is just
collecting data and emitting it to a spout. Is it possible for a bolt to know
when all of the tuples with the same time tag have been collected, and then
start processing them together? Or is it only possible for a bolt to process
tuples one at a time?

Thanks!







On Fri, Jun 6, 2014 at 3:07 PM, Nathan Leung <ncle...@gmail.com> wrote:

You can have your bolt subscribe to the spout using fields grouping and use
the time tag as your key.
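
To illustrate why grouping on the time tag helps, here is a rough sketch in
plain Python of key-based routing: the task is chosen from a hash of the key,
so every tuple with the same time tag lands on the same task. The function
name, the CRC32 hash, and the task count are assumptions made for this
example; Storm's own hashing may differ.

```python
import zlib

NUM_TASKS = 4  # hypothetical bolt parallelism


def task_for(time_tag, num_tasks):
    """Pick a task index from the grouping key; equal keys give the same index."""
    return zlib.crc32(str(time_tag).encode("utf-8")) % num_tasks


readings = [
    ("sensor-a", "2014-06-06T15:00", 1.0),
    ("sensor-b", "2014-06-06T15:00", 2.0),
    ("sensor-a", "2014-06-06T15:01", 1.5),
    ("sensor-b", "2014-06-06T15:01", 2.5),
]

# Record which task(s) each time tag is routed to.
tasks_per_tag = {}
for sensor, time_tag, value in readings:
    tasks_per_tag.setdefault(time_tag, set()).add(task_for(time_tag, NUM_TASKS))

for time_tag, tasks in tasks_per_tag.items():
    print(time_tag, "->", sorted(tasks))
```

Note that one task can still receive several different time tags, so the bolt
would need to keep a separate buffer per time tag.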


On Jun 6, 2014 6:01 PM, "Jonathan Poon" <jkp...@ucdavis.edu> wrote:

Hi Everyone,

I'm currently investigating different data processing tools for an application
I'm interested in. I have many sensors that I collect data from, and I would
like to group the data from every sensor at predefined time intervals and
process it together.

Using Storm terminology, I would have each sensor send data to a spout. The
spouts would then send tuples to a specific bolt that will process all of the
data within a specific time partition. Each spout will tag each event with a
time id, and each bolt will process the data after collecting all of the
tuples with the same time id.
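
One possible reading of "collecting all of the data with the same time id",
sketched in plain Python rather than Storm code: if the set of sensors is
known in advance (an assumption this thread does not confirm), a bolt could
treat a time partition as complete once every expected sensor has reported for
that time id. The class and sensor names are hypothetical.

```python
class CompletenessBuffer:
    """Holds readings per time id until every expected sensor has reported."""

    def __init__(self, expected_sensors):
        self._expected = frozenset(expected_sensors)
        self._pending = {}

    def add(self, time_id, sensor_id, value):
        """Store one reading; return the full partition once complete, else None."""
        partition = self._pending.setdefault(time_id, {})
        partition[sensor_id] = value
        if self._expected.issubset(partition.keys()):
            # All sensors are present: release the partition and free its buffer.
            return self._pending.pop(time_id)
        return None


buf = CompletenessBuffer({"a", "b"})
out = [
    buf.add(1, "a", 0.5),
    buf.add(2, "a", 0.7),
    buf.add(1, "b", 0.9),  # completes time id 1
    buf.add(2, "b", 1.1),  # completes time id 2
]
print(out)
```

A partition whose sensor never reports would stay buffered forever in this
sketch, so a real version would likely also need a timeout.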




Is this possible with Storm?

I appreciate your help!

Jonathan































                                          
