Github user squito commented on the pull request:
https://github.com/apache/spark/pull/6750#issuecomment-121805382
@kayousterhout I don't think that will work, but maybe I'm not seeing it.
I think the problem is that you still need some way to get a handle on the zombie
TaskSetManager to be able to call allTasksInTaskSetFinished(). Right now,
taskSetFinished is ultimately getting a handle on that TaskSetManager by
looking it up in activeTaskSets [in
`statusUpdate()`](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala#L328).
So if those zombie task sets aren't in activeTaskSets, it seems like you'd
still need to keep track of them *somewhere* in TaskSchedulerImpl.
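Roughly, the lookup path I mean looks like the sketch below. This is heavily simplified: the stand-in `TaskSetManager` class, the `SketchScheduler` object, and the `finished` flag are only there to make the snippet self-contained, the real logic is in the linked `statusUpdate()`.

```scala
import scala.collection.mutable.HashMap

// Hypothetical stand-in so the sketch compiles; the real class lives in
// org.apache.spark.scheduler.
class TaskSetManager(val taskSetId: String)

object SketchScheduler {
  // Mirrors the two maps statusUpdate() consults in TaskSchedulerImpl.
  val taskIdToTaskSetId = new HashMap[Long, String]
  val activeTaskSets    = new HashMap[String, TaskSetManager]

  def statusUpdate(tid: Long, finished: Boolean): Unit = synchronized {
    taskIdToTaskSetId.get(tid) match {
      case Some(taskSetId) =>
        // The (possibly zombie) TaskSetManager is only reachable through
        // activeTaskSets here; if zombies were dropped from this map, there
        // would be no handle left on which to eventually call
        // taskSetFinished() / allTasksInTaskSetFinished().
        activeTaskSets.get(taskSetId).foreach { manager =>
          if (finished) {
            taskIdToTaskSetId.remove(tid)
            // ... result handling; once the manager's last running task
            // completes, the scheduler can finally retire the task set.
          }
        }
      case None =>
        println(s"Ignoring update for unknown task $tid")
    }
  }
}
```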
I feel like part of the problem is that "active" task sets is somewhat
vague. You might not expect it to contain task sets that have already failed
(from a fetch failure), but still happen to have tasks running. I guess
"zombie" is vague too, but in a better way, since at least you aren't tricked
into thinking you know what it means.