Can you give us an idea of what the streaming program does? In particular, what are the rest of the transformations you are applying to the input streams?

On Tue, Jul 22, 2014 at 11:05 AM, Bill Jay <bill.jaypeter...@gmail.com> wrote:
> Hi all,
>
> I am currently running a Spark Streaming program, which consumes data from
> Kafka and does the group by operation on the data. I am trying to optimize the
> running time of the program because it looks slow to me. It seems the stage
> named:
>
> * combineByKey at ShuffledDStream.scala:42 *
>
> always takes the longest running time. And if I open this stage, I only
> see two executors on this stage. Does anyone have an idea what this stage
> does and how to increase the speed for this stage? Thanks!
>
> Bill