Hi,

In my Streaming job, most of the time seems to be taken by a single executor:
Shuffle Read Records is 713758 on that one executor but 0 on all the others.
The stage contains a groupBy followed by updateStateByKey, flatMap, map,
reduceByKey and updateStateByKey operations. I suspect that somewhere all the
keys are being collected on the same executor. Yet the number of partitions
appears to be the same for all executors, since they are all in the same
stage, and Input Size / Records is roughly the same across executors as well.
What exactly does Shuffle Read Size / Records indicate? Is it the shuffle
data that executor sends out, or the shuffle data that executor reads in?
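
Roughly, the stage looks like this (a simplified sketch; the stream name,
types and update functions are placeholders, not my actual code):

    // Minimal sketch of the operations in that stage, assuming a
    // DStream[(String, Long)] named "keyed"; all names below are hypothetical.
    def updateTotals(batches: Seq[Iterable[Long]], state: Option[Long]): Option[Long] =
      Some(state.getOrElse(0L) + batches.flatten.sum)

    val result = keyed
      .groupByKey()                          // shuffle: gathers each key's values
      .updateStateByKey(updateTotals _)      // stateful step, partitioned by key
      .flatMap { case (k, total) => Seq((k, total)) }
      .map(identity)                         // placeholder for the real map
      .reduceByKey(_ + _)                    // shuffle: combines values per key
      .updateStateByKey((vs: Seq[Long], s: Option[Long]) =>
        Some(s.getOrElse(0L) + vs.sum))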


Task Time   Total Tasks   Failed Tasks   Succeeded Tasks   Input Size / Records   Shuffle Read Size / Records
2 s         12            0              12                392.3 KB / 867         18.0 KB / 0
2 s         12            0              12                411.5 KB / 878         17.9 KB / 0
2 s         12            0              12                397.7 KB / 889         18.0 KB / 0
2 s         12            0              12                387.4 KB / 834         18.0 KB / 0
1 s         8             0              8                 263.6 KB / 597         11.9 KB / 0
2 s         12            0              12                397.9 KB / 902         18.0 KB / 0
2 s         12            0              12                411.1 KB / 901         18.0 KB / 0
2 s         12            0              12                370.4 KB / 837         18.0 KB / 0
34 s        12            0              12                400.8 KB / 854         349.5 KB / 713758
2 s         12            0              12                393.3 KB / 885         17.9 KB / 0
2 s         12            0              12                390.3 KB / 862         17.9 KB /
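
One way I could verify the skew suspicion is to log how many records land in
each partition (a sketch, using the hypothetical "result" stream from above):

    // Count records per partition to check whether nearly all keys
    // end up in a single partition after the shuffles.
    result.foreachRDD { rdd =>
      val counts = rdd
        .mapPartitionsWithIndex((i, it) => Iterator((i, it.size)))
        .collect()
      counts.sortBy(_._1).foreach { case (i, n) =>
        println(s"partition $i: $n records")
      }
    }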



