[
https://issues.apache.org/jira/browse/SPARK-36070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kent Yao updated SPARK-36070:
-----------------------------
Description: We have a job with a stage that contains about 8k tasks.
Most tasks take about 1~10 min to finish, but 3 of them run extremely slowly.
They take about 1 hour each to finish, and so do their speculative attempts. The
root cause is most likely delay in the storage system. On the Spark side, we can
record the time cost in logs for better bug hunting or performance tuning.
(was: We have a job with a stage that contains about 8k tasks. Most tasks
take about 1~10 min to finish, but 3 of them run extremely slowly. The root
cause is most likely delay in the storage system. On the Spark side, we can
record the time cost in logs for better bug hunting or performance tuning.)
> Add time cost info for writing rows out and committing the task.
> ----------------------------------------------------------------
>
> Key: SPARK-36070
> URL: https://issues.apache.org/jira/browse/SPARK-36070
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.2.0
> Reporter: Kent Yao
> Priority: Minor
>
> We have a job with a stage that contains about 8k tasks. Most tasks take
> about 1~10 min to finish, but 3 of them run extremely slowly. They take
> about 1 hour each to finish, and so do their speculative attempts. The root
> cause is most likely delay in the storage system. On the Spark side, we can
> record the time cost in logs for better bug hunting or performance tuning.
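
To illustrate the kind of logging proposed here, the following is a minimal
sketch of timing the two phases (writing rows out and committing the task)
separately. It is not Spark's implementation: `timed`, `run_write_task`,
`write_row`, and `commit` are hypothetical names chosen for this example, and
the logger name is an assumption.

```python
import logging
import time

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("task_write_timing")


def timed(label, fn, *args):
    """Run fn(*args), log how long it took, and return (result, elapsed_ms)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    logger.info("%s took %.1f ms", label, elapsed_ms)
    return result, elapsed_ms


def run_write_task(rows, write_row, commit):
    """Hypothetical task body: write every row, then commit, timing each phase."""

    def write_all():
        count = 0
        for row in rows:
            write_row(row)
            count += 1
        return count

    written, write_ms = timed("writing rows", write_all)
    _, commit_ms = timed("committing task", commit)
    return {"rows": written, "write_ms": write_ms, "commit_ms": commit_ms}
```

With per-phase numbers in the task log, a slow task can be attributed to either
the row-writing phase or the commit phase, which helps separate storage-side
delays from work done in the task itself.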