Github user tgravescs commented on the pull request:
https://github.com/apache/spark/pull/9051#issuecomment-147509387
So the stages page already has the # of tasks per stage and the Duration, so personally if I saw a large duration with many tasks and a small max task duration I wouldn't look at it for a certain set of issues. It might make me look at it for changing the parallelism or the # of executors. The "active time" would be more useful here too, where "active time" is the time from the first task start to the last task end.
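As a rough illustration of that definition (not the actual UI code), here is a minimal sketch, assuming we just have each task's launch and finish timestamps in milliseconds; `TaskTiming` is a made-up stand-in type:

```scala
// Hypothetical stand-in for whatever per-task start/end timestamps
// the stage page already has available.
case class TaskTiming(launchTimeMs: Long, finishTimeMs: Long)

// "Active time" as described above: first task start to last task end,
// as opposed to summing the individual task durations.
def activeTime(tasks: Seq[TaskTiming]): Long =
  if (tasks.isEmpty) 0L
  else tasks.map(_.finishTimeMs).max - tasks.map(_.launchTimeMs).min
```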
The summary of all task time could be useful too, but it still doesn't tell me whether some set of tasks took a long time and others took very little. You could easily have 90% of the tasks finish in a few seconds and the rest take a long time. Or it could mean that all tasks took a long time, which might be interesting too but answers a different question than the one I wanted answered at the time.
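Just to illustrate that point with made-up numbers (a minimal sketch, not anything from the actual UI): two stages can have the same total task time while one is badly skewed, and a quantile-style summary distinguishes them where the sum does not.

```scala
val even   = Seq.fill(10)(10L)       // 10 tasks taking 10s each
val skewed = Seq.fill(9)(1L) :+ 91L  // 9 quick tasks plus 1 straggler

def total(durations: Seq[Long]): Long = durations.sum
def p95(durations: Seq[Long]): Long = {
  val sorted = durations.sorted
  sorted((0.95 * (sorted.length - 1)).round.toInt)
}

// total(even) == total(skewed) == 100,
// but p95(even) == 10 while p95(skewed) == 91.
```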
A couple of the reasons I want to see long task times are to tell if a particular host is bad or if perhaps it's getting skewed data.