GitHub user GraceH opened a pull request:

    https://github.com/apache/spark/pull/7528

    Avoid assigning tasks to "lost" executor(s)

    Currently, when dynamic allocation kills some executors, new tasks are 
sometimes mis-assigned to those lost executors. Such mis-assignment causes 
task failure(s), or even job failure if the same error repeats 4 times. 
    
    The root cause is that ***killExecutors*** doesn't remove the executors 
being killed right away. It relies on the ***OnDisassociated*** event to 
refresh the active working list later. The delay depends on the cluster 
status (from several milliseconds to just under a minute). When new tasks 
are scheduled during that window, they are assigned to executors that are 
still "active" but already being killed, and they then fail with "executor 
lost". A better approach is to exclude the executors being killed in 
makeOffers(), so that no tasks are allocated to executors that are about 
to be lost. 
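
As a rough illustration of the idea (not the actual patch), the filtering could be sketched as below. `WorkerOffer`, `ExecutorRegistry`, `pendingKill` and `onDisassociated` here are hypothetical, simplified stand-ins rather than Spark's real classes or fields; only the names killExecutors and makeOffers() come from the description above.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for the scheduler backend's bookkeeping.
// These are NOT Spark's actual classes.
final case class WorkerOffer(executorId: String, host: String, cores: Int)

final class ExecutorRegistry {
  // Executors the driver still considers registered: id -> (host, free cores).
  private val executors = mutable.LinkedHashMap.empty[String, (String, Int)]
  // Executors with a kill requested whose disconnect event has not arrived yet.
  private val pendingKill = mutable.HashSet.empty[String]

  def register(executorId: String, host: String, cores: Int): Unit =
    executors(executorId) = (host, cores)

  // Record the kill request immediately instead of waiting for the disconnect event.
  def killExecutors(ids: Seq[String]): Unit =
    pendingKill ++= ids.filter(executors.contains)

  // Invoked when the disconnect event finally arrives; drops all state for the executor.
  def onDisassociated(executorId: String): Unit = {
    executors -= executorId
    pendingKill -= executorId
  }

  // Offer resources only on executors that are not being killed.
  def makeOffers(): Seq[WorkerOffer] =
    executors.iterator
      .filter { case (id, _) => !pendingKill.contains(id) }
      .map { case (id, (host, cores)) => WorkerOffer(id, host, cores) }
      .toList
}
```

With this bookkeeping, an executor stops receiving offers as soon as the kill is requested, regardless of how long the disconnect event takes to arrive.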

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/GraceH/spark AssignToLostExecutor

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/7528.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #7528
    
----
commit 30a9ad07a495c937822a1445f0b3488d4e8e6f63
Author: Grace <[email protected]>
Date:   2015-07-20T06:25:47Z

    Avoid assigning tasks to lost executors

commit b5546ce45f998ded44513cb066384535e10b47a0
Author: Grace <[email protected]>
Date:   2015-07-20T06:48:19Z

    Add comments about the fix

----

