Hi,

In my streaming job, most of the time is taken by a single executor. The shuffle read records count is 713,758 on that one executor but 0 on all the others. The stage contains a groupBy followed by updateStateByKey, flatMap, map, reduceByKey, and updateStateByKey operations. I suspect that all the keys are being collected on the same executor. The number of partitions appears to be the same for all executors, since they are all in the same stage, and Input Size / Records is roughly the same across them as well. What exactly does Shuffle Read Size / Records indicate? Is it the shuffle data going out of that executor, or the shuffle data read by that executor?
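(For context, a minimal sketch of why a hot key would concentrate on one executor. Spark's default HashPartitioner assigns each key to partition hash(key) % numPartitions, so every record of a dominant key lands in one partition, and whichever executor holds that partition does nearly all the shuffle reading. The data below is made up for illustration, not taken from my job.)

```python
# Pure-Python simulation of hash partitioning with a skewed key distribution.
# This mimics the behavior of Spark's HashPartitioner; it does not use Spark.
from collections import Counter

def partition_for(key, num_partitions):
    # HashPartitioner-style routing: non-negative modulo of the key's hash.
    return hash(key) % num_partitions

num_partitions = 12  # matches the 12 tasks per executor shown below

# Hypothetical skewed workload: one "hot" key carries almost all the records.
records = [("hot-key", i) for i in range(10000)] + \
          [("key-%d" % (i % 50), i) for i in range(500)]

# Count how many records each partition would receive.
load = Counter(partition_for(k, num_partitions) for k, _ in records)

# One partition ends up with >= 10000 records; the rest share the remainder,
# which is the same shape as one executor reading 713,758 shuffle records
# while the others read 0.
print(sorted(load.items()))
```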
Task Time  Total Tasks  Failed Tasks  Succeeded Tasks  Input Size / Records  Shuffle Read Size / Records
2 s        12           0             12               392.3 KB / 867        18.0 KB / 0
2 s        12           0             12               411.5 KB / 878        17.9 KB / 0
2 s        12           0             12               397.7 KB / 889        18.0 KB / 0
2 s        12           0             12               387.4 KB / 834        18.0 KB / 0
1 s        8            0             8                263.6 KB / 597        11.9 KB / 0
2 s        12           0             12               397.9 KB / 902        18.0 KB / 0
2 s        12           0             12               411.1 KB / 901        18.0 KB / 0
2 s        12           0             12               370.4 KB / 837        18.0 KB / 0
34 s       12           0             12               400.8 KB / 854        349.5 KB / 713758
2 s        12           0             12               393.3 KB / 885        17.9 KB / 0
2 s        12           0             12               390.3 KB / 862        17.9 KB / --

View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Task-Time-is-too-high-in-a-single-executor-in-Streaming-tp25614.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.