Johannes,
Esper has several concepts like rolling and sliding windows. Whatever you are trying to achieve through storm by using transactions etc, can be achieved easily in Esper. In our bolt, we initialize the queries on receiving one type of message, then the real time data is used in sliding and different type of windows to give outout to a listener. Esper can help calculate certain statistical functions as well. It can query on previous window. I will suggest you look through Esper use cases to check if can meet your need. -Manoj On Tue, Nov 11, 2014 at 6:20 AM, Klausen Schaefersinho < [email protected]> wrote: > Hi, > > I guess in an production scenario your system should be able to monitor a > lot of sensors, right? So your event should also have an sensor id and > partition the stream after the spoud by the sensor id. In case you don#t > need I would not split the window saving and the distance in different > bolts unless really needed. If you just do some in memory updates your > system should be pretty fast. If you think its better to have very focused > bolts, I could imagine the following topology will do a good job. > > > [Spout] -> {fieldGrouping("sensor")} -> [WindowBolt] -> > {fieldGrouping("sensor")} -> [DistanceBolt] > > The window bolt would emit the stream window , with the sensor id > included. The distance bolt would have to hold for each sensor_id the last > window and compare it to the current window. > > Cheers > > > On Tue, Nov 11, 2014 at 3:06 PM, Kitschke, Johannes Hugo < > [email protected]> wrote: > >> Hi, >> >> yes this is a research type question. I already implemented a prototype >> framework in python with a similar flowing topology as in Storm and now I >> want to do this in parallel on a greater scale. >> >> I have another question concerning stream grouping: Since in my case a >> distance-computing-bolt has to keep the tuples for the last n seconds, I >> cannot use different tasks for this bolt (!?). What I would need is, that >> the tuples are replicated for each bolt, but the actual work should be >> distributed across the tasks. >> This does not mean that I need this (I don't think I will), but I just >> want to understand what's possible. >> >> Best, >> Johannes >> >> ------------------------------ >> *From:* Klausen Schaefersinho [[email protected]] >> *Sent:* Tuesday, November 11, 2014 10:32 AM >> *To:* [email protected] >> *Subject:* Re: Sliding window on numerical data? >> >> Hi, >> >> IBM InfoSphere Streams has some kind of visual query editor. However I >> doubt that you can model a heart beat in it. If you need to start quick >> just write your own bolt and use some kind of ring-buffer to store the last >> n events. Then writer a function that converts it to a vector. Please >> remember that the event frequency is not nessecay the sampling frequency of >> your sensor. So your events should have a time stamp from the sensor and >> you should build the vector using these time stamps, not the position in >> the ring-buffer. >> >> Do you know already that Dynamic Time Warping will work? Your tasks >> sounds pretty much like research, so until you know what model and >> representation you want to use it might also be the best to first conduct >> some experiments in R or SciPi and build the windows from a sensor log file. >> >> Cheers, >> >> Klaus >> >> >> >> On Tue, Nov 11, 2014 at 10:07 AM, Johannes Hugo Kitschke < >> [email protected]> wrote: >> >>> Hi, >>> >>> thanks for your repsonse. I had a look at both Esper and Siddhi. Please >>> correct me if I'm wrong, but I don't think these fit my purpose: Both use a >>> SQL-Like query language, right? >>> What I am aiming at, is to visually define a query (consider for example >>> the heartbeat [1]) and then compute the distance between the query and some >>> datastream (for each new datapoint). But I cannot define such a complex >>> query using this SQL-like query language!? Or did I miss something? >>> >>> Johannes >>> >>> [1] http://en.wikipedia.org/wiki/Electrocardiography#Waves_and_intervals >>> >>> On 11/11/2014 08:16 AM, Sajith wrote: >>> >>> Hi Johannes, >>> >>> You can also use Sidddhi complex event processing engine to achieve >>> your tasks. [1] is a bolt I wrote using Siddhi. >>> >>> [1] >>> https://github.com/sajithshn/siddhi-storm/blob/master/src/main/java/org/wso2/siddhi/storm/component/SiddhiBolt.java >>> >>> On Tue, Nov 11, 2014 at 1:29 AM, Manoj Jaiswal < >>> [email protected]> wrote: >>> >>>> Hi Johannes, >>>> >>>> We are doing something similar using Esper in storm. >>>> The queries are set in EsperBolt and realtime data is processed >>>> through that. >>>> >>>> -Manoj >>>> >>>> On Mon, Nov 10, 2014 at 3:54 AM, Klausen Schaefersinho < >>>> [email protected]> wrote: >>>> >>>>> Hi, >>>>> >>>>> > The main reason I write this mail is, because I have to access >>>>> the last N elements of each stream for each new arriving element >>>>> >>>>> You can not go back in a stream, so you have to write your own bolt >>>>> that stores the last windows. That should be pretty straight forward. >>>>> >>>>> >>>>> >>>>> >>>>> On Mon, Nov 10, 2014 at 12:26 PM, Johannes Hugo Kitschke < >>>>> [email protected]> wrote: >>>>> >>>>>> Hi there, >>>>>> >>>>>> I want to solve a task and wonder if Storm is suitable for this. My >>>>>> task: >>>>>> >>>>>> - input (many) numerical datastreams (e.g. stock market data, >>>>>> seismic data, ...) >>>>>> - define (many) queries of length N (may be different for each >>>>>> query) >>>>>> - compute distance (e.g. using dynamic time warping) between each >>>>>> query and the last N datapoints of each datastream >>>>>> - report matches if distance < eps >>>>>> >>>>>> This is a rough outline. I am completely new to Storm and I wonder, >>>>>> if Storm is a good candidate to solve this problem (if not, >>>>>> alternatives?). >>>>>> The main reason I write this mail is, because I have to access the last N >>>>>> elements of each stream for each new arriving element. But every example >>>>>> I >>>>>> found does only do some computation on one element at a time. >>>>>> >>>>>> While writing this, I came across the 'RollingCountBolt'. Does this >>>>>> implement the functionality I want? Does this mean each Bolt doing the >>>>>> distance computation for the last N points has to store these points? >>>>>> >>>>>> I would be interested in your thoughts, hints and ideas. >>>>>> >>>>>> Thanks! >>>>>> Johannes >>>>>> >>>>> >>>>> >>>> >>> >> >
