Github user gaborgsomogyi commented on the issue:

https://github.com/apache/spark/pull/21685

What I can't really understand is why the `Scheduler Delay` is so different.

`Scheduler delay includes time to ship the task from the scheduler to the executor, and time to send the task result from the executor to the scheduler. If scheduler delay is large, consider decreasing the size of tasks or decreasing the size of task results.`

According to this, in the `before` case (5 min 20 sec) either the source or the result set is quite big, which is not the case in `after` (avg. 160 ms). Is the same application tested on the same dataset?