> The data producer accumulates the data in files for a few minutes and after
> it send the bulks to Kafka then the storm topology processes it.
> Do you see any problem with this approach?
Assuming the data is processed quickly (i.e. in seconds rather than minutes), you're wasting resources. Your servers sit idle almost all of the time and then do a lot of work in a short burst. That's a pretty inefficient way to use server resources.
-TPP
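
To put rough numbers on the idle time, here is a minimal sketch. The figures (a 5-minute accumulation window and 10 seconds of processing per batch) are made up for illustration, not measurements from your setup, and `busy_fraction` is just a hypothetical helper name. It assumes the producer keeps accumulating the next batch while the topology works on the current one.

```python
def busy_fraction(batch_interval_s: float, process_s: float) -> float:
    """Fraction of time the processing servers are busy.

    Assumes one batch arrives every batch_interval_s seconds and takes
    process_s seconds to process. Capped at 1.0: if processing takes
    longer than the interval, the servers never go idle (and fall behind).
    """
    return min(1.0, process_s / batch_interval_s)


# Hypothetical numbers: one batch every 5 minutes, 10 s to process it.
print(f"{busy_fraction(300, 10):.1%}")  # prints 3.3%
```

With those assumed numbers the servers are idle roughly 97% of the time, yet they still have to be sized for the burst.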