Kent Yao created SPARK-37481:
--------------------------------
Summary: Disappearance of skipped stages mislead the bug hunting
Key: SPARK-37481
URL: https://issues.apache.org/jira/browse/SPARK-37481
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 3.2.0, 3.1.2, 3.3.0
Reporter: Kent Yao
## With FetchFailedException and Map Stage Retries
Rerunning the original SQL from the following gist in the spark-sql shell shows the problem:
https://gist.github.com/yaooqinn/6acb7b74b343a6a6dffe8401f6b7b45c#gistcomment-3977315

1. Stage 3 threw a FetchFailedException, which caused both itself and its parent
stage (stage 2) to retry.
2. Stage 2 had been skipped before, but its attemptId was still 0, so when its
retry happened it was removed from `Skipped Stages`.

As a result, the DAG of Job 2 no longer shows that stage 2 was skipped.
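
The sequence above can be sketched with a toy model. This is not Spark's actual UI listener code: the `JobView` class, its method names, and the assumption that stages are tracked by a `(stage_id, attempt_id)` key are all hypothetical, chosen only to show how a retry that reuses attemptId 0 could erase the skipped entry.

```python
# Toy model only: NOT Spark's real status-tracking code.
# JobView and its methods are hypothetical names used for illustration.
from dataclasses import dataclass, field


@dataclass
class JobView:
    """Per-job view that keys stages by (stage_id, attempt_id) (an assumption)."""
    skipped: set = field(default_factory=set)
    active: set = field(default_factory=set)

    def mark_skipped(self, stage_id, attempt_id=0):
        # A skipped stage never ran in this job, so it is recorded as attempt 0.
        self.skipped.add((stage_id, attempt_id))

    def on_stage_submitted(self, stage_id, attempt_id):
        key = (stage_id, attempt_id)
        # A submitted stage is no longer considered skipped under the same key.
        self.skipped.discard(key)
        self.active.add(key)


job2 = JobView()
job2.mark_skipped(2)            # stage 2 is skipped in Job 2
job2.on_stage_submitted(3, 0)   # stage 3 runs and hits the fetch failure
job2.on_stage_submitted(2, 0)   # stage 2 retry arrives, still with attemptId 0
job2.on_stage_submitted(3, 1)   # stage 3 retry

print(sorted(job2.skipped))     # [] : stage 2 no longer appears as skipped
```

In this sketch, if the retry of stage 2 carried a distinct attempt id, the `(2, 0)` entry would stay in `skipped` and the retry would be shown separately.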

In addition, a retried stage usually runs only a subset of the tasks from the original stage.
If it is presented as the original attempt, its metrics can be misleading.
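
A small illustration of why the label matters, using made-up numbers (the task counts and the `summarize` helper are hypothetical and do not come from the report):

```python
# Hypothetical numbers for illustration; not taken from the report.
def summarize(attempt_id, tasks_run, partitions_total):
    """Describe a stage attempt the way a reader of the UI might interpret it."""
    kind = "original" if attempt_id == 0 else "retry"
    return {
        "kind": kind,
        "tasks": tasks_run,
        # Fraction of the stage's partitions this attempt actually computed.
        "coverage": tasks_run / partitions_total,
    }


# Suppose the stage has 200 partitions and the retry recomputes only 8 of them.
mislabeled = summarize(attempt_id=0, tasks_run=8, partitions_total=200)
labeled = summarize(attempt_id=1, tasks_run=8, partitions_total=200)

print(mislabeled["kind"], mislabeled["coverage"])  # original 0.04
print(labeled["kind"], labeled["coverage"])        # retry 0.04
```

Both attempts ran the same 8 tasks, but the one labeled "original" invites the reader to treat its metrics as covering the whole stage.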