Adrian Wang created SPARK-29177:
-----------------------------------
Summary: Zombie tasks prevent executors from being released when a task
exceeds maxResultSize
Key: SPARK-29177
URL: https://issues.apache.org/jira/browse/SPARK-29177
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 2.4.4, 2.3.4
Reporter: Adrian Wang
When Spark fetches results from executors and finds that the total size
exceeds the configured maxResultSize, it simply aborts the stage and all
dependent jobs. However, the task that pushed the total over the limit
actually succeeded; its `CompletionEvent` is never posted, so the task is
never removed from `CoarseGrainedSchedulerBackend`. If dynamic allocation
is enabled, zombie executor(s) remain registered with the resource manager
and never die until the application ends.
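For context, the failure mode involves dynamic allocation combined with a result-size limit. A minimal configuration sketch (the property names are real Spark settings; the values and the commented repro job are illustrative assumptions, not taken from the report):

```properties
# spark-defaults.conf (illustrative values)
spark.driver.maxResultSize        1m
spark.dynamicAllocation.enabled   true
spark.shuffle.service.enabled     true

# Any action whose collected results exceed 1m, e.g. (hypothetical repro):
#   sc.parallelize(1 to 100, 100).map(_ => new Array[Byte](1 << 20)).collect()
# aborts the stage once the fetched results pass the limit, but the task
# that succeeded never posts a CompletionEvent, so its executor stays
# registered in CoarseGrainedSchedulerBackend and is never released.
```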
--
This message was sent by Atlassian Jira
(v8.3.4#803005)