[ https://issues.apache.org/jira/browse/SPARK-24387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509214#comment-16509214 ]
Rui Li commented on SPARK-24387:
--------------------------------
Yes, blacklisting can be used to avoid the issue. But blacklisting can be
turned off, or configured to be more tolerant, so it would be better to have a
more reliable solution.
> Heartbeat-timeout executor is added back and used again
> -------------------------------------------------------
>
> Key: SPARK-24387
> URL: https://issues.apache.org/jira/browse/SPARK-24387
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.1.0
> Reporter: Rui Li
> Priority: Major
>
> In our job, when there's only one task and one executor running, the
> executor's heartbeat is lost and the driver decides to remove it. However,
> the executor is added back, and the task's retry attempt is scheduled on
> that same executor almost immediately after it is marked as lost.
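
To make the sequence above concrete, here is a rough toy model of it in Python. This is only a sketch, not Spark's scheduler code: ToyScheduler, retry_target and the max_attempts_per_executor knob are hypothetical names made up for illustration, and the model assumes that a heartbeat timeout counts as a failed attempt against the executor. It is meant to show why blacklisting hides the problem, and why the retry lands on the same executor again once blacklisting is disabled or made more tolerant.

```python
from dataclasses import dataclass, field


@dataclass
class ToyScheduler:
    """Toy model for illustration only; not Spark's actual scheduler."""

    blacklist_enabled: bool = True
    max_attempts_per_executor: int = 1  # hypothetical tolerance knob
    executors: set = field(default_factory=set)
    failures: dict = field(default_factory=dict)  # executor id -> failed attempts

    def register(self, executor_id):
        # An executor that is still alive can register again after removal.
        self.executors.add(executor_id)

    def heartbeat_timeout(self, executor_id):
        # The driver removes the executor; the model assumes its running
        # task is counted as a failed attempt on that executor.
        self.executors.discard(executor_id)
        self.failures[executor_id] = self.failures.get(executor_id, 0) + 1

    def is_blacklisted(self, executor_id):
        if not self.blacklist_enabled:
            return False
        return self.failures.get(executor_id, 0) >= self.max_attempts_per_executor

    def pick_executor_for_retry(self):
        # Return the first usable executor, or None if the retry must wait.
        for executor_id in sorted(self.executors):
            if not self.is_blacklisted(executor_id):
                return executor_id
        return None


def retry_target(blacklist_enabled, max_attempts_per_executor=1):
    # Replay the reported sequence: register, lose heartbeat, register again.
    scheduler = ToyScheduler(blacklist_enabled, max_attempts_per_executor)
    scheduler.register("exec-1")
    scheduler.heartbeat_timeout("exec-1")
    scheduler.register("exec-1")
    return scheduler.pick_executor_for_retry()
```

In this model, retry_target(True) leaves the retry unscheduled, while retry_target(False) and retry_target(True, 2) both send it straight back to "exec-1". That is the gap described in the comment: the protection depends entirely on how blacklisting is configured.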