Bill, can't you just add more nodes in order to speed up the processing?
Tobias

On Thu, Jul 3, 2014 at 7:09 AM, Bill Jay <bill.jaypeter...@gmail.com> wrote:
> Hi all,
>
> I have a problem using Spark Streaming to accept input data and update
> a result.
>
> The input data come from Kafka, and the output is a map that is updated
> with historical data and reported every minute. My current method is to
> set the batch size to 1 minute and use foreachRDD to update this map,
> outputting the map at the end of the foreachRDD function. However, the
> current issue is that the processing cannot be finished within one minute.
>
> I am thinking of updating the map whenever new data arrive instead of
> doing the update when the whole RDD arrives. Is there any idea on how to
> achieve this with a better running time? Thanks!
>
> Bill
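For reference, a minimal sketch of the incremental approach Bill describes, using updateStateByKey so each 1-minute batch only processes the new records while Spark maintains the running map across batches. This is not Bill's actual code; the socket source stands in for his Kafka DStream (which would come from KafkaUtils.createStream), and the word-count keys, checkpoint path, and class name are placeholders:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object IncrementalMap {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("IncrementalMap")
        // Keep the 1-minute reporting interval, but maintain the map
        // incrementally instead of rebuilding it inside foreachRDD.
        val ssc = new StreamingContext(conf, Seconds(60))
        ssc.checkpoint("/tmp/checkpoint") // required by updateStateByKey

        // Placeholder source; the real input would be a Kafka DStream.
        val lines = ssc.socketTextStream("localhost", 9999)

        // Merge only the new batch's values into the existing state per key.
        val updateFunc = (newValues: Seq[Long], state: Option[Long]) =>
          Some(newValues.sum + state.getOrElse(0L))

        val runningMap = lines.map(record => (record, 1L))
          .updateStateByKey(updateFunc)

        // Report the current map once per batch, i.e. once per minute.
        runningMap.foreachRDD { rdd =>
          rdd.collect().foreach(println)
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }

With this structure, each batch's work is proportional to the new data rather than to the accumulated history, which may help the batch finish within the 1-minute interval; adding nodes, as suggested above, is the complementary fix if the per-batch work itself is too large.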