[ 
https://issues.apache.org/jira/browse/SPARK-37151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xingbo Jiang updated SPARK-37151:
---------------------------------
    Description: A worker retries sending the ExecutorStateChanged message 
when the previous attempt fails. This is not an issue when the attempt fails 
with a TimeoutException. But if the connection between the worker and the 
master is broken, each attempt fails immediately, so the retry attempts also 
fail and the maximum number of attempts is reached quickly.  (was: An 
executor retries sending the ExecutorStateChanged message when the previous 
attempt fails. This is not an issue when the attempt fails with a 
TimeoutException. But if the connection between the executor and the Master is 
broken, each attempt fails immediately, so the retry attempts also fail and 
the maximum number of attempts is reached quickly.)

> Avoid executor state sync attempt fail continuously in a short timeframe
> ------------------------------------------------------------------------
>
>                 Key: SPARK-37151
>                 URL: https://issues.apache.org/jira/browse/SPARK-37151
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.2.0
>            Reporter: Xingbo Jiang
>            Assignee: Xingbo Jiang
>            Priority: Major
>
> A worker retries sending the ExecutorStateChanged message when the 
> previous attempt fails. This is not an issue when the attempt fails with a 
> TimeoutException. But if the connection between the worker and the master 
> is broken, each attempt fails immediately, so the retry attempts also fail 
> and the maximum number of attempts is reached quickly.
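
The failure mode described above can be illustrated with a small retry loop. This is only a sketch in Python, not the change made for SPARK-37151 and not Spark code: `send_with_retry`, `ConnectionBroken`, `max_attempts`, and `min_interval` are hypothetical names chosen for the example. It assumes one possible mitigation, which is to make every attempt occupy a minimum amount of wall-clock time, so that attempts failing immediately cannot use up the whole retry budget in a burst.

```python
import time


class ConnectionBroken(Exception):
    """Hypothetical stand-in for a send failure that returns immediately."""


def send_with_retry(send, max_attempts=5, min_interval=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call send() until it succeeds or max_attempts is reached.

    Every failed attempt is padded to last at least min_interval seconds.
    Returns the number of attempts used; re-raises the last error when the
    attempt limit is reached.
    """
    for attempt in range(1, max_attempts + 1):
        started = clock()
        try:
            send()
            return attempt
        except (TimeoutError, ConnectionBroken):
            if attempt == max_attempts:
                raise
            # A timeout has already consumed wall-clock time. An immediate
            # failure has not, so wait out the rest of the interval before
            # the next attempt.
            remaining = min_interval - (clock() - started)
            if remaining > 0:
                sleep(remaining)
```

With `min_interval` set to 0 the loop behaves like the situation in the description: a broken connection makes all attempts fail back to back. The `sleep` and `clock` parameters exist only so the timing can be replaced in tests.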



