Hello Everyone,

I came across storm recently and I'm trying to understand it better.

Storm, unlike Flume, doesn't really have any notion of a sink. I read
somewhere that Storm is a real-time stream processing engine where you
don't expect the data to land anywhere. What kind of situation would that be?

One example I can envision is a situation where you only want to maintain
counters, without keeping the actual data itself. Is this right? If so, I'm
assuming these counters have to be updated in a database. How does that
affect performance?

Can I route Flume streams through a Storm cluster to compute the
counters and store them in HBase (instead of going Flume ---> Hive
---> top-10 query), effectively decreasing the number of MapReduce jobs
on the Hadoop cluster?
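To make the question concrete, here is roughly the counting logic I imagine a Storm bolt performing: keep per-key counts in memory and flush them to the store in batches rather than issuing a write per tuple. This is just a plain-Java sketch with made-up names, no Storm or HBase dependencies; the real bolt would do the batched puts in flush().

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the per-key counting I imagine the bolt doing.
// Names (CounterBolt, flushEvery) are my own invention.
public class CounterBolt {
    private final Map<String, Long> counts = new HashMap<>();
    private final int flushEvery;   // flush to the store every N tuples
    private int seen = 0;

    public CounterBolt(int flushEvery) {
        this.flushEvery = flushEvery;
    }

    // Called once per incoming tuple (e.g. a key field from a log line).
    public void execute(String key) {
        counts.merge(key, 1L, Long::sum);
        if (++seen % flushEvery == 0) {
            flush();
        }
    }

    // Batch the database writes instead of one write per tuple,
    // so the external store isn't hit on every event.
    private void flush() {
        // in a real bolt: one batched HBase put per flush
        System.out.println("flushing " + counts.size() + " counters");
    }

    public long count(String key) {
        return counts.getOrDefault(key, 0L);
    }
}
```

The batching is what I'd expect to keep the database from becoming the bottleneck, but I'd like to hear how people handle this in practice.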
