Are you using DStream.print(), or something else that boils down to RDD.take()? take() launches jobs incrementally: it scans one partition first and, if that does not yield enough elements, retries with progressively more partitions, so it can produce an unpredictable number of jobs per batch. There are other cases as well, but this one is common.
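For illustration, here is a sketch of what print() expands to under the hood. This assumes a Spark Streaming application; the stream source below is a placeholder standing in for the poster's Kafka receivers, and the snippet needs a Spark runtime to actually execute:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object PrintJobCountSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("print-job-count-sketch")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Placeholder source; in the poster's app this would be the union
    // of multiple Kafka receiver streams.
    val stream = ssc.socketTextStream("localhost", 9999)

    // stream.print() is roughly equivalent to:
    stream.foreachRDD { rdd =>
      // take(11) is where the extra jobs come from: it runs a job on one
      // partition first and, if that returns fewer than 11 elements,
      // launches further jobs over more partitions. So a single batch can
      // trigger 1..N jobs depending on how the data is distributed.
      val firstEleven = rdd.take(11)
      firstEleven.take(10).foreach(println)
      if (firstEleven.length > 10) println("...")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

That incremental scaling in take() is why the job count per batch varies: receivers that happen to deliver enough records into the first scanned partition produce one job, while sparser batches produce several.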
On Thu, Sep 24, 2015 at 12:04 PM, Shenghua(Daniel) Wan <[email protected]> wrote:
> Hi,
> I noticed that in my streaming application reading from Kafka using
> multiple receivers, there are 3 jobs in one batch (via the web UI).
> According to the DAG there are two stages; job 0 executes both stages,
> but jobs 1 and 2 only execute stage 2. There is a disconnect between my
> understanding and reality. I have gone over the book and done some
> googling, but I still could not find the relationship between batches
> and jobs. Could anyone share insights?
> Thanks a lot!
>
> --
> Regards,
> Shenghua (Daniel) Wan
