Have you taken a look at SPARK-12739?

FYI

On Sun, Mar 6, 2016 at 4:06 AM, Jatin Kumar <
jku...@rocketfuelinc.com.invalid> wrote:

> Hello all,
>
> Consider the following two code blocks:
>
> val ssc = new StreamingContext(sparkConfig, Seconds(2))
> val stream = KafkaUtils.createDirectStream(...)
>
> a) stream.filter(filterFunc).count()
>      .foreachRDD(rdd => println(rdd.collect().mkString(",")))
> b) stream.filter(filterFunc).window(Seconds(60), Seconds(60)).count()
>      .foreachRDD(rdd => println(rdd.collect().mkString(",")))
>
> I have observed that:
> a) The UI behaves correctly and the numbers reported for each batch are
> correct.
> b) The UI reports numbers every 60 seconds, but the batch-id, input-size,
> etc. are those of the single 2-second batch at each window boundary, i.e.
> the 30th batch, 60th batch, and so on. These numbers become totally
> useless, in fact confusing, in this case, though the delay/processing-time
> numbers are still helpful.
>
> Is someone working on a fix to show aggregated numbers, which would be
> more useful?
>
> --
> Thanks
> Jatin
>
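
As an interim workaround (a sketch only, not from the thread): Spark's
DStream.foreachRDD has an (RDD, Time) overload, so case (b) can tag each
window's output with the batch time on which the window closed, making the
printed numbers interpretable even though the UI attributes them to a
single 2-second batch. sparkConfig, stream, and filterFunc are assumed to
be defined as in the mail above; the KafkaUtils arguments stay elided.

    import org.apache.spark.streaming.{Seconds, StreamingContext, Time}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val ssc = new StreamingContext(sparkConfig, Seconds(2))
    // Stream creation as in the original mail; arguments elided there.
    val stream = KafkaUtils.createDirectStream(...)

    stream.filter(filterFunc)
      .window(Seconds(60), Seconds(60))
      .count()
      .foreachRDD { (rdd, time: Time) =>
        // `time` is the batch time the window closed on; printing it ties
        // each count back to a specific 60-second window regardless of how
        // the UI labels the batch.
        println(s"window ending at $time: count = ${rdd.collect().mkString(",")}")
      }

    ssc.start()
    ssc.awaitTermination()

This does not fix the UI attribution itself; it only makes the driver-side
output self-describing until the reporting issue is addressed.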
