Re: Explanation for info shown in UI

2016-02-01 Thread Yogesh Mahajan
The number of jobs depends on the number of output operations (print, foreachRDD, saveAs*Files) and the number of RDD actions inside those output operations. For example:

    dstream1.foreachRDD { rdd => rdd.count }                  // ONE Spark job per batch
    dstream1.foreachRDD { rdd => { rdd.count ; rdd.count } }  // TWO Spark jobs per batch
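To see this end to end, here is a minimal self-contained sketch; the object name, the local master, and the socket source on localhost:9999 are all illustrative assumptions, not part of the original thread:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object JobCountDemo {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("JobCountDemo").setMaster("local[2]")
        val ssc = new StreamingContext(conf, Seconds(10))

        // Hypothetical input source, chosen only so the example is runnable
        val dstream1 = ssc.socketTextStream("localhost", 9999)

        // One output operation containing one RDD action => ONE job per batch
        dstream1.foreachRDD { rdd => rdd.count() }

        // One output operation containing two RDD actions => TWO jobs per batch
        dstream1.foreachRDD { rdd => rdd.count(); rdd.count() }

        ssc.start()
        ssc.awaitTermination()
      }
    }

With both output operations registered, the Jobs tab of the Spark UI should show three jobs per batch interval for this sketch: one from the first foreachRDD and two from the second.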

Explanation for info shown in UI

2016-01-28 Thread Sachin Aggarwal
Hi, I am executing a streaming word count with Kafka, using one test topic with 2 partitions; my cluster has three Spark executors. Each batch is 10 sec. For every batch (e.g. batch time 02:51:00 below) I see 3 entries in the Spark UI, as shown below. My questions: 1) As the label says, the jobId for the first
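For reference, a streaming word count against Kafka of the kind described above might look like the following sketch. It assumes the Spark 1.x direct-stream API, a broker at localhost:9092, and a topic named "test"; the broker address, topic name, and object name are illustrative assumptions, not details from the question:

    import kafka.serializer.StringDecoder
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    object StreamingWordCount {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("StreamingWordCount")
        val ssc = new StreamingContext(conf, Seconds(10))  // 10-second batches, as in the question

        // Direct stream over the topic; with 2 Kafka partitions each batch RDD has 2 partitions
        val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
        val lines = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
          ssc, kafkaParams, Set("test")).map(_._2)

        // Classic word count over each batch
        val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)

        // print is an output operation, so it triggers at least one job per batch
        counts.print()

        ssc.start()
        ssc.awaitTermination()
      }
    }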