The terms are:
* ESP : http://en.wikipedia.org/wiki/Event_stream_processing
* CEP : http://en.wikipedia.org/wiki/Complex_event_processing

By the way, "processing streams in real time" tends toward being a pleonasm.

MapReduce follows a batch architecture: you keep data until a given time, then you process everything, and at the end you provide all the results. Stream processing has, by definition, a smoother throughput: events are processed one at a time, and each one can potentially lead to a result.

I don't know of any complete overview of such tools. Esper is well known in that space. FlumeBase was an attempt to do something similar (as far as I can tell); it shows how an ESP engine fits with log collection using a tool such as Flume. Then you also have other solutions that will allow you to scale, such as Storm. A few people have already considered using Storm for scalability and Esper to do the real computation.

Regards

Bertrand

On Sun, Aug 19, 2012 at 9:44 PM, Niels Basjes <[email protected]> wrote:

> Is there a "complete" overview of the tools that allow processing streams
> of data in realtime?
>
> Or even better; what are the terms to google for?
>
> --
> Kind regards,
> Niels Basjes
> (Sent from mobile)
> On 19 Aug 2012 18:22, "Bertrand Dechoux" <[email protected]> wrote:
>
>> That's a good question. More and more people are talking about Hadoop
>> Real Time.
>> One key aspect of this question is whether we are talking about MapReduce
>> or not.
>>
>> MapReduce greatly improves the response time of any data-intensive job,
>> but it is still a batch framework with a noticeable latency.
>>
>> There are multiple ways to improve the latency:
>> * ESP/CEP solutions (like Esper, FlumeBase, ...)
>> * Big Table clones (like HBase ...)
>> * YARN with a non-MapReduce application
>> * ...
>>
>> But it will really depend on the context and the definition of 'real
>> time'.
>>
>> Regards
>>
>> Bertrand
>>
>> On Sun, Aug 19, 2012 at 5:44 PM, mahout user <[email protected]> wrote:
>>
>>> Hello folks,
>>>
>>> I am new to Hadoop. I just want to know how the Hadoop framework is
>>> useful for real-time services. Can anyone explain it to me?
>>>
>>> Thanks.
>>
>> --
>> Bertrand Dechoux

--
Bertrand Dechoux
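
P.S. To make the batch versus stream distinction from the reply concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the function names `batch_count` and `stream_count` and the sample event names are hypothetical, and nothing here uses the Esper, FlumeBase, Flume or Storm APIs.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def batch_count(events: Iterable[str]) -> Dict[str, int]:
    """Batch style (hypothetical helper): keep the data, then process it all."""
    stored = list(events)         # keep data until the batch is complete
    return dict(Counter(stored))  # process everything, return all results at the end


def stream_count(events: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Stream style (hypothetical helper): handle one event at a time."""
    counts: Counter = Counter()
    for event in events:
        counts[event] += 1
        # Each processed event can immediately lead to a result.
        yield event, counts[event]


if __name__ == "__main__":
    log = ["login", "error", "login", "error", "error"]  # made-up sample events

    # Batch: nothing is visible until the whole input has been consumed.
    print(batch_count(log))

    # Stream: an updated count is available after every single event.
    for update in stream_count(log):
        print(update)
```

The batch function only answers once it has seen the whole input, whereas the generator hands back a running count after every event, which is the smoother throughput described above. A real ESP engine would add windows, joins and pattern matching on top of this basic loop.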
