Github user gaborgsomogyi commented on the issue:
https://github.com/apache/spark/pull/21685
What I can't really understand is why the `Scheduler Delay` is so different.
`
Scheduler delay includes time to ship the task from the scheduler to the
executor, and time to send the task result from the executor to the scheduler.
If scheduler delay is large, consider decreasing the size of tasks or
decreasing the size of task results.
`
According to this, in the `before` case (5 min 20 sec) either the source or
the result set must be quite big, which is not the case in `after` (avg. 160 ms).
Was the same application tested on the same dataset?
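
To make the question concrete, here is a rough sketch of how I understand the scheduler delay figure to be derived: the task's wall-clock duration minus the time accounted for on the executor, clamped at zero. The function name, parameter names, and exact set of subtracted terms are my assumptions for illustration, not copied from the Spark source, and the inputs are made-up numbers chosen only to reproduce the magnitudes quoted above.

```python
def scheduler_delay_ms(duration_ms, executor_run_ms, deserialize_ms,
                       result_serialize_ms, getting_result_ms):
    """Assumed formula: wall-clock task time not explained by executor-side work."""
    accounted = (executor_run_ms + deserialize_ms
                 + result_serialize_ms + getting_result_ms)
    # Clamp at zero so clock skew between metrics cannot give a negative delay.
    return max(0, duration_ms - accounted)


# Hypothetical inputs, not measurements from the runs discussed in this PR.
before = scheduler_delay_ms(330_000, 9_000, 500, 300, 200)
after = scheduler_delay_ms(10_160, 9_000, 500, 300, 200)
print(before, after)
```

If the executor-side terms are similar in both runs, a gap this large has to come from shipping the task or its result, which is why I am asking about the dataset.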