GitHub user zsxwing opened a pull request:
https://github.com/apache/spark/pull/22771
[SPARK-25773][Core] Cancel zombie tasks in a result stage when the job finishes
## What changes were proposed in this pull request?
When a job finishes, some zombie tasks may still be running because of a stage
retry. Since a result stage is never reused by other jobs, letting these tasks
run just wastes cluster resources. This PR asks the TaskScheduler to cancel the
running tasks of a result stage once the job has finished. Credits go to
@srinathshankar, who suggested this idea to me.
This PR also fixes two minor issues while I'm touching the DAGScheduler (see the sketch after this list):
- An invalid `spark.job.interruptOnCancel` value should not crash the DAGScheduler.
- Non-fatal errors should not crash the DAGScheduler.
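A minimal sketch of the hardening idea, assuming a hypothetical helper (only the property name `spark.job.interruptOnCancel` comes from the PR; everything else is illustrative): a malformed boolean, or any other non-fatal error, is swallowed with a default instead of escaping into the scheduler's event loop.

```scala
import scala.util.control.NonFatal

object InterruptOnCancel {
  // Hypothetical helper: a malformed value such as "yes" makes String.toBoolean
  // throw IllegalArgumentException; fall back to false rather than crashing.
  def shouldInterruptTaskThread(props: java.util.Properties): Boolean =
    try {
      props.getProperty("spark.job.interruptOnCancel", "false").toBoolean
    } catch {
      case NonFatal(e) =>
        println(s"Ignoring invalid spark.job.interruptOnCancel value: ${e.getMessage}")
        false
    }
}
```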
## How was this patch tested?
New unit tests were added.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/zsxwing/spark SPARK-25773
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22771.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22771
----
commit 581ea53b57cc9fc0e89f2d635422653cfdfcb27f
Author: Shixiong Zhu <shixiong@...>
Date: 2018-10-16T22:07:04Z
Cancel zombie tasks in a result stage when the job finishes
----