[ 
https://issues.apache.org/jira/browse/SPARK-29287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xingbo Jiang resolved SPARK-29287.
----------------------------------
    Fix Version/s: 3.0.0
         Assignee: Kent Yao
       Resolution: Done

Fixed by https://github.com/apache/spark/pull/25964

> Executors should not receive any offers before they are actually constructed
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-29287
>                 URL: https://issues.apache.org/jira/browse/SPARK-29287
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Kent Yao
>            Assignee: Kent Yao
>            Priority: Major
>             Fix For: 3.0.0
>
>
> Executors send RegisterExecutor messages to the driver in onStart.
> The driver puts the executor data in “the ready to serve map” if it can,
> then sends RegisteredExecutor back to the executor. From this point on, the
> driver can make an offer to this executor.
> But the executor is not fully constructed yet. When it receives
> RegisteredExecutor, it starts to construct itself: it initializes the block
> manager, possibly registers with the local shuffle server (with retries),
> then starts heartbeating to the driver ...
> A task allocated in this window may fail if the executor fails to start or
> cannot heartbeat to the driver in time.
> Worse, when dynamic allocation and blacklisting are both enabled and the
> number of running executors has dropped to the minimum executor setting, any
> error on executors that receive tasks before they are fully constructed may
> block the application or tear it down.
>  
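
The ordering problem described above can be illustrated with a small toy model. This is only a sketch in Python, not Spark code: `ToyDriver`, `ExecutorData`, `executor_ready`, and `make_offers` are hypothetical names chosen for illustration, and the second "ready" message is an assumption about one possible way to withhold offers, not a description of what the linked pull request does.

```python
from dataclasses import dataclass, field


@dataclass
class ExecutorData:
    total_cores: int
    free_cores: int = 0  # nothing is offerable until the executor reports ready


@dataclass
class ToyDriver:
    # Hypothetical stand-in for the driver's "ready to serve" map.
    executors: dict = field(default_factory=dict)

    def register_executor(self, executor_id: str, cores: int) -> str:
        # Record the executor but expose zero free cores, so no offer
        # can include it while it is still constructing itself.
        self.executors[executor_id] = ExecutorData(total_cores=cores)
        return "RegisteredExecutor"

    def executor_ready(self, executor_id: str) -> None:
        # Hypothetical second message, sent once construction has finished.
        data = self.executors[executor_id]
        data.free_cores = data.total_cores

    def make_offers(self) -> dict:
        # Only executors with free cores take part in scheduling.
        return {
            eid: data.free_cores
            for eid, data in self.executors.items()
            if data.free_cores > 0
        }


driver = ToyDriver()
reply = driver.register_executor("exec-1", cores=4)
offers_before = driver.make_offers()  # executor is still constructing
driver.executor_ready("exec-1")
offers_after = driver.make_offers()   # executor is fully constructed
```

In this model the executor is known to the driver as soon as it registers, but it only appears in an offer after it has reported that construction is complete, so no task can be assigned inside the window the description warns about.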



--
This message was sent by Atlassian Jira
(v8.3.4#803005)