Hello everyone, I came across Storm recently and I'm trying to understand it better.
Storm, unlike Flume, doesn't really have any code for a sink. I read somewhere that Storm is a real-time stream processing engine where you don't necessarily expect data to land anywhere. What kind of situation would that be? One example I can envision is where you only want to maintain counters, without the actual data itself. Is this right?

If yes, I'm assuming those counters have to be updated in a database. How does this affect performance? Could I route Flume streams through a Storm cluster to compute the counters and store them in HBase (instead of going Flume ---> Hive ---> top-10 query), effectively decreasing the number of MapReduce jobs on the Hadoop cluster?
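To make the counter example concrete, this is roughly the per-event logic I imagine a Storm bolt running (just a plain Python sketch of the idea, not actual Storm API code; the event keys here are made up):

```python
from collections import Counter

def update_counts(counts, key):
    """What a counting bolt would do per tuple: bump a counter for the
    event's key (e.g. a URL or user id) instead of storing the raw event."""
    counts[key] += 1

counts = Counter()
for event_key in ["url_a", "url_b", "url_a"]:
    update_counts(counts, event_key)

# The "top N" question is then answered from the counters directly,
# with no batch query over raw data:
top_two = counts.most_common(2)
```

In a real topology, `counts` would presumably be flushed or incremented into HBase periodically rather than held only in memory, which is where my performance question comes in.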
