[
https://issues.apache.org/jira/browse/SPARK-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kay Ousterhout updated SPARK-3983:
--
Description:
The reported scheduler delay includes the time to get a new thread (from a
threadpool) in order to start the task, the time to deserialize the task, and
the time to serialize the result. None of these is a delay caused by the
scheduler; including them as such is misleading.
This is especially problematic when debugging the performance of short tasks
(those that run in tens of milliseconds), where the scheduler delay can be very
large relative to the task duration.
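To illustrate the arithmetic, here is a minimal sketch. It is not Spark's actual code. It assumes the reported delay is simply total task time minus run time, and all names (TaskTimes, naive_scheduler_delay, adjusted_scheduler_delay) and the sample timings are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class TaskTimes:
    """Hypothetical per-task timings in milliseconds (illustrative names only)."""
    total_ms: int             # wall-clock time from launch until the result arrives
    run_ms: int               # time spent executing the task body
    thread_wait_ms: int       # time waiting for a thread from the threadpool
    deserialize_ms: int       # time to deserialize the task
    serialize_result_ms: int  # time to serialize the result


def naive_scheduler_delay(t: TaskTimes) -> int:
    # Assumption: everything that is not run time is attributed to the scheduler.
    return t.total_ms - t.run_ms


def adjusted_scheduler_delay(t: TaskTimes) -> int:
    # Subtract the three components the description says are not scheduler delay.
    overhead = t.thread_wait_ms + t.deserialize_ms + t.serialize_result_ms
    return max(0, t.total_ms - t.run_ms - overhead)


# A made-up short task: 20 ms of real work inside a 60 ms total.
short_task = TaskTimes(total_ms=60, run_ms=20, thread_wait_ms=10,
                       deserialize_ms=15, serialize_result_ms=10)
naive = naive_scheduler_delay(short_task)
adjusted = adjusted_scheduler_delay(short_task)
print(naive, adjusted)
```

With these made-up numbers the naive figure is twice the task's run time, while the adjusted figure is a quarter of it, which is the kind of distortion described above for short tasks.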
cc [~sparks] [~shivaram]
was:
The reported scheduler delay includes time to get a new thread (from a
threadpool) in order to start the task, time to deserialize the task, and time
to serialize the result. None of these things are delay caused by the
scheduler; including them as such is misleading.
cc [~sparks] [~shivaram]
Scheduler delay (shown in the UI) is incorrect
--
Key: SPARK-3983
URL: https://issues.apache.org/jira/browse/SPARK-3983
Project: Spark
Issue Type: Bug
Reporter: Kay Ousterhout
Assignee: Kay Ousterhout
Fix For: 1.2.0